

# Generated Specification Document

## INTRODUCTION

### -- Overview --

spi\_top is a Wishbone-slave SPI master optimized for software-friendly, deterministic transfers up to 128 bits per frame. It exposes a 32-bit bus with a simple address map (four RX readback words, CTRL, DIVIDER, and SS) and implements registered data-out, single-cycle acknowledge, and no bus error generation. Internally, it wraps two submodules: spi\_clgen, a 16-bit programmable SCK generator that produces launch/capture edge strobes and supports a max-rate special case (divider=0), and spi\_shift, a 128-bit TX/RX shifter supporting MSB- or LSB-first, selectable launch/sample edges, per-byte enables, and a length field where 0 encodes 128 bits. Safe programming is enforced by accepting control, divider, SS, and TX writes only while idle (not tip) to prevent mid-transfer glitches. Transfers are initiated by a start/go control bit, run with precise edge timing determined by tx\_negedge/rx\_negedge, and auto-clear the busy/start state at the final bit. Chip-selects are flexible: up to 8 active-low outputs are driven either continuously by a manual mask or automatically asserted only during an active transfer. An optional interrupt is raised at end-of-frame and is cleared by any acknowledged Wishbone access. Overall, spi\_top provides a wide-frame SPI master with accurate clocking, configurable bit ordering and timing, simple software sequencing, and robust completion signaling.

### -- Key Features --

- Wishbone slave with 32-bit, word-aligned register map (RX[0..3] @0x0–0xC, CTRL @0x10, DIVIDER @0x14, SS @0x18); byte enables; registered read data; single-cycle acknowledge; no bus-error generation.
- Programmable SPI clock via 16-bit divider with exact frequency control (SCK = clk\_in / (2\*(divider+1))); special divider==0 handling for maximum rate; one-cycle-early pos/neg edge strobes for precise timing.
- Flexible framing: 1–128-bit transfers (len==0 encodes 128); MSB-first or LSB-first; independent TX/RX edge selection for CPHA-like mode support.
- Robust transfer control: explicit start (go), transfer-in-progress (tip) tracking, deterministic first-bit preload, automatic end-of-frame busy/start-bit clear.
- End-of-transfer interrupt: asserted on final bit when IE=1; cleared by any acknowledged Wishbone access.
- Safe programming model: CTRL/DIVIDER/SS writable only when idle (!tip); TX buffer updates blocked during active transfers; sticky lower-byte behavior on CTRL to prevent inadvertent clearing.
- Chip-select management: up to 8 active-low CS outputs; manual mode mirrors SS; auto mode asserts CS only during transfers (ASS=1).
- Modular datapath: 128-bit shifter with 32-bit lane writes (per-byte enables) and concurrent RX capture; clean separation of clock generation and shift engine.
- Deterministic edge shaping: optional last\_clk gating to control leading/trailing SCK edges and levels across transfers.
- Observability: full RX frame readable via four words; tip provides busy indication.

### -- Design Goals --

Design an easy-to-integrate Wishbone SPI master (32-bit slave) with deterministic bus behavior (registered read data, single-cycle ack, no errors). Enforce safe, glitch-free operation by accepting configuration and TX writes only when idle and isolating TX/RX buffers during transfers. Offer protocol flexibility: 1–128-bit frames, MSB/LSB selectable, independent TX/RX edge selection with precise per-bit strobes. Support up to 8 active-low chip-selects with manual and auto-assert modes. Provide deterministic completion: busy/start auto-clear at the last bit and an interrupt-on-done cleared by any acknowledged access. Deliver broad timing control: 16-bit SCK divider (max rate at 0), last-edge/level gating, no spurious toggles at frame end. Streamline software use: word-aligned address map, byte enables, consistent 128-bit RX readback, visible busy status, and zero bus errors.

### -- Block Diagram and Components --

## Overview

- spi\_top integrates five sub-blocks: Wishbone slave/register file, SPI clock generator (spi\_clgen), 128-bit shift engine (spi\_shift), chip-select driver, and interrupt logic. All logic is synchronous to wb\_clk\_i; the reset wb\_rst\_i is active-high and asynchronous.

## Sub-blocks

### 1) Wishbone slave and register file

- Role: Exposes configuration, TX/RX data, and status to the host via a Wishbone classic slave; provides control and data to internal blocks.
- Address decode (wb\_adr\_i[4:2]):
  - 0x0, 0x4, 0x8, 0xC: RX readback lanes for the 128-bit shift buffer (32-bit each)
  - 0x10: CTRL (control/status bits: go, char\_len[6:0], lsb, tx\_negedge, rx\_negedge, ie, ass; zero-extended on read)
  - 0x14: DIVIDER (SCK divider[15:0])
  - 0x18: SS (chip-select mask[7:0], active-low)
- Write/Read behavior:
  - CTRL/DIVIDER/SS and TX data writes are accepted only when !tip to avoid mid-transfer glitches
  - wb\_sel\_i honored for lower bytes (e.g., DIVIDER); CTRL lower byte merge makes go sticky until HW clears at end-of-transfer
  - Reads always allowed; wb\_dat\_o is registered (one-cycle latency)
  - Wishbone handshake: wb\_ack\_o pulses per accepted access; wb\_err\_o is constant 0
- Outputs to internal logic: go, char\_len[6:0], lsb, tx\_negedge, rx\_negedge, ie, ass; divider[15:0]; ss\_reg[7:0]; TX write bus (latch[3:0], byte\_sel[3:0], p\_in[31:0])
- Inputs from internal logic: p\_out[31:0] (RX read slice), tip (for write gating and CTRL auto-clear)
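The address decode listed above can be sketched as a lookup on `wb_adr_i[4:2]`. This is an illustrative Python model, not the RTL: the names `REGISTER_MAP` and `decode_register` are hypothetical, and treating word index 7 (offset 0x1C) as unmapped is an assumption based on the map defining only offsets 0x00 to 0x18.

```python
# Hypothetical helper: maps a byte address to the register selected by wb_adr_i[4:2].
REGISTER_MAP = {
    0: "RX0",      # 0x00
    1: "RX1",      # 0x04
    2: "RX2",      # 0x08
    3: "RX3",      # 0x0C
    4: "CTRL",     # 0x10
    5: "DIVIDER",  # 0x14
    6: "SS",       # 0x18
}


def decode_register(byte_addr):
    """Return the register name selected by address bits [4:2]."""
    word_index = (byte_addr >> 2) & 0x7  # only bits [4:2] take part in the decode
    return REGISTER_MAP.get(word_index, "UNMAPPED")


if __name__ == "__main__":
    for addr in (0x00, 0x10, 0x14, 0x18, 0x1C):
        print(f"0x{addr:02X} -> {decode_register(addr)}")
```

Because only three address bits are examined in this sketch, addresses above 0x1F alias onto the same eight words; whether the surrounding interconnect prevents that is a system-level decision.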

### 2) SPI clock generator (spi\_clgen)

- Role: Generates SCK and single-cycle pos\_edge/neg\_edge strobes in the system clock domain for precise launch/capture timing
- Inputs: clk\_in=wb\_clk\_i, rst=wb\_rst\_i, enable, divider[15:0], go, last\_clk
- Outputs: clk\_out (SCK), pos\_edge, neg\_edge
- Behavior:
  - SCK half-period = divider + 1 system clocks; divider==0 yields maximum rate with synthesized strobes
  - pos\_edge/neg\_edge are one-cycle strobes indicating intended SCK edges
  - last\_clk gating suppresses the initial rising edge at transfer start and shapes the final edge and idle level at transfer stop
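The half-period rule above gives the SCK frequency directly. The following Python sketch evaluates it and picks a divider for a target rate; the function names `sck_frequency` and `divider_for` are hypothetical helpers, and the divider==0 result is the nominal value (the document describes that case as approximately half the system clock).

```python
import math


def sck_frequency(f_clk_hz, divider):
    """SCK frequency for a 16-bit divider: f_clk / (2 * (divider + 1))."""
    if not 0 <= divider <= 0xFFFF:
        raise ValueError("divider must fit in 16 bits")
    return f_clk_hz / (2 * (divider + 1))


def divider_for(f_clk_hz, f_sck_max_hz):
    """Smallest divider whose SCK does not exceed f_sck_max_hz."""
    divider = max(0, math.ceil(f_clk_hz / (2 * f_sck_max_hz)) - 1)
    if divider > 0xFFFF:
        raise ValueError("requested SCK is too slow for a 16-bit divider")
    return divider


if __name__ == "__main__":
    d = divider_for(50e6, 10e6)
    print(d, sck_frequency(50e6, d))
```

Rounding the divider up keeps SCK at or below the slave's maximum clock rate rather than slightly above it.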

### 3) Shift engine (spi\_shift)

- Role: 128-bit parallel-to-serial/serial-to-parallel engine; manages bit counting, order, and edge phasing
- Inputs: clk=wb\_clk\_i, rst=wb\_rst\_i, go, len[6:0] (char\_len; 0 encodes 128 bits), lsb, tx\_negedge, rx\_negedge, pos\_edge, neg\_edge, s\_in (MISO), s\_clk (SCK), latch[3:0], byte\_sel[3:0], p\_in[31:0]
- Outputs: s\_out (MOSI), p\_out[31:0] (read slice), tip (transfer-in-progress), last\_bit (asserts on final bit time)
- Behavior: Counts on pos\_edge, supports MSB/LSB-first, launches/captures on selected edges, blocks host writes while tip=1, produces last\_bit for clean stop

### 4) Chip-select driver

- Role: Drives active-low SS[7:0] based on SS register and auto-select control
- Inputs: ss\_reg[7:0], ass, tip
- Output: ss\_pad\_o[7:0]
- Behavior: Manual mode (ass=0) reflects ss\_reg continuously; Auto mode (ass=1) asserts SS only while tip=1
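The two chip-select modes can be written as a small combinational function. This is a sketch that follows the polarity described in this document (a 0 bit in the SS register asserts the corresponding line); the function name `ss_pad_o` mirrors the port name for readability, and integrators should confirm the polarity against the RTL.

```python
def ss_pad_o(ss_reg, ass, tip):
    """Model of the 8 active-low chip-select outputs.

    ss_reg: 8-bit SS register value (0 bit = asserted, per this document)
    ass:    auto slave-select mode enable
    tip:    transfer-in-progress flag
    """
    ss_reg &= 0xFF
    if ass and not tip:
        return 0xFF  # auto mode while idle: every chip-select is held high
    return ss_reg    # manual mode, or auto mode during an active transfer


if __name__ == "__main__":
    print(hex(ss_pad_o(0xFE, ass=False, tip=False)))  # manual: follows SS
    print(hex(ss_pad_o(0xFE, ass=True, tip=False)))   # auto, idle
    print(hex(ss_pad_o(0xFE, ass=True, tip=True)))    # auto, transferring
```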

### 5) Interrupt logic

- Role: Signals transfer completion to the host
- Assertion: wb\_int\_o = 1 when (ie && tip && last\_bit && pos\_edge)
- Clear: wb\_int\_o cleared on any acknowledged Wishbone access (wb\_ack\_o)

## Top-level interconnect and data/control flow

- CTRL.go/char\_len/lsb/tx\_negedge/rx\_negedge drive spi\_shift control inputs
- DIVIDER register drives spi\_clgen.divider; top derives spi\_clgen.enable and last\_clk from transfer state (go/tip/last\_bit)
- spi\_clgen outputs: clk\_out drives external SCK pad and spi\_shift.s\_clk; pos\_edge/neg\_edge feed spi\_shift timing
- Serial data path: spi\_shift.s\_out -> MOSI pad; MISO pad -> spi\_shift.s\_in
- RX readback: spi\_shift.p\_out (32-bit slice) -> Wishbone read mux -> wb\_dat\_o (registered)
- TX write path: From Wishbone to spi\_shift via latch/byte\_sel/p\_in, accepted only when !tip
- tip and last\_bit feed: write gating, CTRL auto-clear on end-of-transfer (tip && last\_bit && pos\_edge), SS auto mode, and interrupt generation

## External interfaces

- Wishbone: wb\_clk\_i, wb\_rst\_i, wb\_cyc\_i, wb\_stb\_i, wb\_we\_i, wb\_adr\_i, wb\_sel\_i, wb\_dat\_i, wb\_dat\_o, wb\_ack\_o, wb\_err\_o
- SPI pads: sck\_pad\_o (from spi\_clgen.clk\_out), mosi\_pad\_o (from spi\_shift.s\_out), miso\_pad\_i (to spi\_shift.s\_in), ss\_pad\_o[7:0] (from chip-select driver)
- Interrupt: wb\_int\_o

### -- Role within System --

spi\_top is the system's memory-mapped SPI master peripheral on a 32-bit Wishbone bus, bridging CPU/software register accesses to the SPI core to control timing, chip-selects, and data movement. It exposes a compact register map for configuration (CTRL, DIVIDER, SS) and data (128-bit TX/RX via 32-bit slices), provides standard Wishbone slave behavior (registered read data, acknowledge handshaking, no bus error generation), and arbitrates safely between bus writes and on-going transfers. Operationally, it generates SCK and edge strobes, serializes MOSI/captures MISO for variable frame lengths and bit order, and manages up to 8 active-low chip-selects in manual or auto modes. For system integration, it raises a "transfer done" interrupt (wb\_int\_o) on completion when enabled and clears it on any acknowledged bus access, delivering a deterministic, software-programmable interface to external SPI devices.

### -- Assumptions and Dependencies --

- Single synchronous domain: spi\_top, spi\_clgen, and spi\_shift share one system clock; submodule resets are asynchronous, active-high.

- Wishbone bus dependency: Requires a 32-bit Wishbone master with word-aligned accesses (`wb_adr_i[4:2]` used for decode). Byte enables must be driven; multi-byte registers honor `wb_sel_i` (e.g., DIVIDER uses `wb_sel_i[1:0]`). No error signaling (`wb_err_o=0`).
- Wishbone timing: Read data is registered and returned with a single-cycle `wb_ack_o` pulse; one ack per access. Any acknowledged Wishbone access clears the interrupt.
- Register write gating: CTRL, DIVIDER, and SS writes are accepted only when idle (`tip=0`). Hardware blocks or ignores mid-transfer writes; software must poll `tip/busy` before writing.
- TX data write dependency: Parallel TX buffer updates are permitted only when idle (`tip=0`); software must not modify TX lanes during an active transfer.
- Address map assumption: RX data occupies 0x00–0x0C (4 words), CTRL at 0x10, DIVIDER at 0x14, SS at 0x18. Reads return zero-extended values. Integration should confirm or override these addresses.
- RX read timing: Reading RX mid-transfer may yield partially updated data; software should read RX after `tip=0`.
- Interrupt behavior: `wb_int_o` asserts at end-of-transfer when `IE=1` (`ie && tip && last_bit && pos_edge`). Cleared by any acknowledged Wishbone access; no separate clear register.
- CTRL start semantics: The start/go bit ORs with software writes and cannot be cleared while active; it auto-clears at end-of-frame (on `last_bit` at `pos_edge`) together with busy.
- SPI clock generation: SCK is derived from `spi_clgen` using `divider[15:0]`. For `divider > 0`, SCK half-period = `divider + 1` system clocks ($f_{sck} = f_{clk} / (2 \times (\text{divider}+1))$). For `divider == 0`, $f_{sck} \approx f_{clk}/2$ via special strobe synthesis. System must choose divider values compatible with slave timing.
- Edge strobes: `pos_edge` and `neg_edge` are one-clock pulses that precede the actual SCK toggle by one system cycle when `divider > 0`. Shifter launch/capture and end-of-transfer detection depend on these strobes.
- Bit order/phase dependency: `lsb`, `tx_negedge`, and `rx_negedge` configure serialization order and sample/launch phase (CPHA-like). No explicit CPOL control; idle SCK level and first-edge behavior follow the `spi_clgen` `last_clk` gating. Connected slaves must be compatible with the selected phase and implicit idle polarity.
- Chip-select policy: `ss_pad_o` is active-low. Manual mode (`ass=0`) drives SS continuously from the SS register; auto mode (`ass=1`) asserts SS only while `tip=1`. Up to 8 CS outputs are provided; board-level mapping must align SS bits to devices.
- External I/O timing: MOSI updates on the configured launch edge; MISO must meet setup/hold at the capture edge. No extra timing margin is inserted; divider selection must ensure compliance with slave timing.
- Divider special case: For `divider == 0`, `clgen` emits strobes each cycle, including a go-assisted first `pos_edge`; systems must not rely solely on enable gating for the first maximum-rate edge.
- Limits: Max frame length is 128 bits (`len==0` encodes 128). No DMA/burst; operation is programmed via register accesses. No bus error generation; masters must issue valid accesses.
- Reset assumptions: Reset is active-high and asserted asynchronously; registers default to known values (assumed 0), with SS deasserted (all-high). Integrators should confirm reset defaults.
- Integration dependencies: All logic remains in a single clock domain; if used otherwise, CDC must be added externally. Divider range is 0..65535. Provide timing margins per target system clock and slave requirements. Unused chip-selects may be left unconnected or tied off.

## IO PORTS

### -- Clock and Reset --

- System clocking
  - Single synchronous clock domain driven by wb\_clk\_i. All internal logic (spi\_top, Wishbone slave interface, spi\_clgen, spi\_shift) operates in this domain.
  - No internal clock-domain crossings; SPI serial timing uses one-cycle strobes (pos\_edge/neg\_edge) generated in the wb\_clk\_i domain.
- SPI SCK generation
  - SCK is derived inside spi\_clgen from wb\_clk\_i via a programmable 16-bit divider; SCK is not used as a clock for internal logic.
  - For divider > 0, pos\_edge/neg\_edge assert one wb\_clk\_i cycle before the corresponding SCK transitions, giving launch/capture lead time.
  - For divider == 0 (maximum rate), SCK toggles on every wb\_clk\_i cycle when enabled; a strobe is produced each cycle to maintain launch/capture timing.
  - When enable == 0, SCK holds its current level, the counter reloads to the programmed divider for deterministic restart, and timing strobes are suppressed.
  - last\_clk gating suppresses the next low-to-high SCK transition to shape the final clock of a transfer and enforce a defined idle level between transactions.
- Reset
  - Top-level reset wb\_rst\_i is active-high and distributed asynchronously to submodules.
  - spi\_clgen on reset: divider counter initializes to 16'hFFFF; SCK (clk\_out) = 0; pos\_edge = 0; neg\_edge = 0.
  - spi\_shift on reset: tip (transfer-in-progress) = 0; bit counter cleared; s\_out = 0; 128-bit data buffer cleared; no shifting while reset is asserted.
  - Configuration registers (e.g., CTRL, DIVIDER, SS) reset to 0; wb\_int\_o deasserted.
  - After reset release, SCK idles low until enable is asserted; the first enable reloads the divider and resumes normal cadence.
- Wishbone bus timing under clock/reset
  - All Wishbone handshake signals are synchronous to wb\_clk\_i; wb\_ack\_o asserts for one cycle per valid transfer; wb\_dat\_o is registered.
  - No bus error generation (wb\_err\_o = 0); outputs remain deasserted during reset.
- Design guidance
  - Treat pos\_edge/neg\_edge as timing strobes in the wb\_clk\_i domain, not as separate clocks.
  - Ensure system timing meets divider == 0 operation (SCK at f\_sys/2) if used.
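The SCK generation and strobe timing described above can be explored with a cycle-level model. This Python sketch covers only the `divider > 0` case and makes several assumptions: the strobes are modeled as registered outputs (which is what makes them appear exactly one clock before the SCK toggle), `last_clk` gating and the divider==0 special case are not modeled, and the function name `simulate_clgen` is hypothetical.

```python
def simulate_clgen(divider, enables):
    """Return a list of (clk_out, pos_edge, neg_edge) tuples, one per wb_clk_i cycle.

    enables: iterable of booleans, the value of `enable` in each cycle.
    """
    if not 0 < divider <= 0xFFFF:
        raise ValueError("this sketch only models 0 < divider <= 0xFFFF")
    # Reset state as described in this document: counter all ones, outputs low.
    cnt, clk_out, pos_edge, neg_edge = 0xFFFF, 0, 0, 0
    trace = []
    for enable in enables:
        trace.append((clk_out, pos_edge, neg_edge))  # values visible this cycle
        cnt_zero = cnt == 0
        cnt_one = cnt == 1
        # Strobes are computed when cnt == 1 and registered (assumption).
        next_pos = int(bool(enable) and clk_out == 0 and cnt_one)
        next_neg = int(bool(enable) and clk_out == 1 and cnt_one)
        # SCK toggles when the counter reaches zero while enabled.
        next_clk = clk_out ^ 1 if (enable and cnt_zero) else clk_out
        # The counter reloads when disabled or when it reaches zero.
        next_cnt = divider if (not enable or cnt_zero) else cnt - 1
        cnt, clk_out, pos_edge, neg_edge = next_cnt, next_clk, next_pos, next_neg
    return trace


if __name__ == "__main__":
    for cycle, state in enumerate(simulate_clgen(1, [False] + [True] * 6)):
        print(cycle, state)
```

With `divider=1`, the printed trace shows SCK staying at each level for two system clocks (divider + 1) and each strobe appearing in the cycle immediately before the matching SCK transition.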

### -- Wishbone Slave Interface --

Wishbone Classic 32-bit synchronous slave interface.

- Signals: inputs wb\_clk\_i, wb\_rst\_i, wb\_cyc\_i, wb\_stb\_i, wb\_we\_i, wb\_sel\_i[3:0], wb\_adr\_i[4:2] (word-aligned), wb\_dat\_i[31:0]; outputs wb\_dat\_o[31:0], wb\_ack\_o, wb\_err\_o (tied low), wb\_int\_o.
- Handshake: single-cycle acknowledgement with one-cycle latency; when wb\_cyc\_i & wb\_stb\_i are asserted, wb\_ack\_o pulses for exactly one clock in the next cycle (no back-pressure).
- Read timing: wb\_dat\_o is registered and valid in the cycle wb\_ack\_o is asserted.
- Write timing: data is sampled on the wb\_ack\_o cycle; byte enables wb\_sel\_i[3:0] are honored per-register rules.
- No stall or retry signals; all bus accesses receive an ACK; wb\_err\_o is never asserted.
- Addressing: word-aligned decode via wb\_adr\_i[4:2]; register map implements offsets 0x00..0x18. Undefined addresses' behavior is not specified, but the interface still acknowledges accesses per the unconditional ACK rule.
- Byte-enable usage: RX/TX data window uses all four byte lanes; DIVIDER uses only wb\_sel\_i[1:0] for the lower 16 bits; SS uses only the lower 8 bits; CTRL uses defined bits within 14-bit width, with special sticky-OR behavior for the start/busy bit in the lower byte.
- Write acceptance gating: for CTRL, DIVIDER, SS, and TX window, writes are internally accepted only when the SPI transfer-in-progress flag (tip) is 0 (idle); while tip=1, the slave still ACKs the bus access but the internal register write is ignored.
- Read semantics: side-effect free for register contents; zero-extension on fields narrower than 32 bits; any acknowledged bus access clears the level interrupt wb\_int\_o if set.
- Interrupt: wb\_int\_o asserted at end of transfer when enabled; cleared by any acknowledged Wishbone access regardless of address or direction.
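The acknowledge handshake described above can be sketched as a single registered bit. This Python model assumes the update rule "acknowledge a request unless already acknowledging", which matches the single-cycle pulse behavior described in this document; the function name `simulate_ack` is hypothetical.

```python
def simulate_ack(requests):
    """Return the wb_ack_o value seen in each clock cycle.

    requests: iterable of booleans, (wb_cyc_i and wb_stb_i) per clock cycle.
    """
    ack = False
    trace = []
    for req in requests:
        trace.append(ack)             # value visible during this clock
        ack = bool(req) and not ack   # registered update at the clock edge
    return trace


if __name__ == "__main__":
    # A two-cycle request produces one acknowledge pulse in the second cycle.
    print(simulate_ack([True, True, False, False]))
```

If a master keeps the strobe asserted, this model acknowledges every second cycle, which corresponds to one new access per acknowledge pulse.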

### -- SPI Interface (SCK, MOSI, MISO) --

- SCK generation: Produced by spi\_clgen.clk\_out from the system clock; rate set by 16-bit DIVIDER:  $f_{SCK} = f_{sys} / (2 \times (\text{DIVIDER} + 1))$ . DIVIDER=0 yields  $\sim f_{sys}/2$ .
- SCK activity: Toggles only while a transfer is in progress (spi\_shift.tip=1); otherwise holds a steady idle level.
- SCK edge shaping: A last-edge gate ensures a controlled end-of-transfer (final edge forced high→low) and can suppress the initial low→high transition, maintaining a consistent low idle after reset.
- SCK timing strobes: For DIVIDER>0, pos\_edge/neg\_edge assert one system clock before the actual SCK transition to allow MOSI setup; for DIVIDER=0, strobes are synthesized each system clock and SCK toggles every cycle when enabled.
- MOSI drive: Preloaded with the first TX bit while idle so data is valid before the first active SCK edge; thereafter updates on the selected edge (tx\_negedge=0 → rising, tx\_negedge=1 → falling).
- MOSI bit order: lsb=0 sends MSB-first; lsb=1 sends LSB-first.
- Frame length: len[6:0] encodes 1–127 bits directly; len=0 encodes 128 bits. MOSI updates stop after the final bit to avoid spurious toggles.
- MISO sampling: Captured on the selected edge aligned to SCK (rx\_negedge=0 → rising, rx\_negedge=1 → falling). Sampling is gated off after the last bit to prevent oversampling.
- Signal mapping: SCK=spi\_clgen.clk\_out; MOSI=spi\_shift.s\_out; MISO=spi\_shift.s\_in. Internal pos\_edge/neg\_edge strobes drive shifter timing.
- Mode coverage: Independent TX/RX edge selection supports common CPHA-style timing; idle polarity defaults low (no CPOL control is exposed). In auto-SS operation, SCK activity follows transfer enable/SS assertion.
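The frame-length encoding and bit order listed above can be illustrated with a short Python sketch. It assumes that an N-bit frame occupies bits [N-1:0] of the 128-bit buffer, so MSB-first starts at bit N-1; that placement is an assumption of the sketch, and the names `frame_bits` and `mosi_sequence` are hypothetical.

```python
def frame_bits(len_field):
    """Decode the 7-bit length field: 0 encodes 128 bits, otherwise 1..127."""
    len_field &= 0x7F
    return 128 if len_field == 0 else len_field


def mosi_sequence(tx_word, len_field, lsb):
    """Return the list of bits in the order they would appear on MOSI."""
    n = frame_bits(len_field)
    # LSB-first walks bit 0 upward; MSB-first walks bit n-1 downward (assumed).
    order = range(n) if lsb else range(n - 1, -1, -1)
    return [(tx_word >> i) & 1 for i in order]


if __name__ == "__main__":
    print(mosi_sequence(0b1011, 4, lsb=False))
    print(mosi_sequence(0b1011, 4, lsb=True))
```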

### -- Chip-Select Outputs --

ss\_pad\_o[7:0] are eight independent, active-low chip-select outputs. Polarity: low = asserted, high = deasserted (idle high). Control is via the 8-bit SS register at address 0x18 (reads are zero-extended); each SS bit directly drives its corresponding output, so a 0 in SS asserts that CS line and a 1 deasserts it. SS writes are accepted only when the SPI engine is idle (!tip) to prevent mid-transfer changes. Behavior is selected by the ASS bit in CTRL: Manual mode (ASS=0) drives ss\_pad\_o continuously from SS, allowing CS to be held across back-to-back frames; Auto mode (ASS=1) deasserts all CS lines when idle (ss\_pad\_o=8'hFF) and asserts only during an active transfer (tip=1), with ss\_pad\_o=SS from transfer start (tip rising) to end (tip falling). CS assertion in Auto mode precedes the first SCK toggle by at least one system cycle when a clock divider is used, and CS deassertion coincides with transfer completion (tip falling), potentially aligning with the transfer-done interrupt. Any subset of the eight CS lines can be asserted simultaneously by programming SS accordingly; unused lines remain high. Writes to SS and CTRL are gated by !tip for safety to avoid CS glitches.

### -- Interrupts --

wb\_int\_o is a level-latched, transfer-complete interrupt for the SPI master. Assertion occurs at the final bit of a frame when CTRL.ie=1, driven by (ie && tip && last\_bit && pos\_edge) from spi\_shift/spi\_clgen.

Scope is global across all chip-selects; one interrupt per completed transfer with no per-CS or per-word events. Clearing is solely by any acknowledged Wishbone access (`wb_ack_o`), including reads of RX data or writes to CTRL/DIVIDER/SS; starting a new transfer via a CTRL write also clears a prior interrupt through `wb_ack_o`. There is no event queue; `wb_int_o` remains asserted until serviced, and frequent polling that causes `wb_ack_o` will clear a pending interrupt. If CTRL.ie=0, no interrupt is generated and software must poll. Divider==0 does not alter behavior; end-of-transfer auto-clears the internal start/busy control independently of interrupt clearing.
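The set and clear rules for `wb_int_o` can be summarized in a small state model. This Python sketch is illustrative: the class name `InterruptModel` is hypothetical, and giving the set condition priority over the clear when both occur in the same cycle is an assumption that this document does not state.

```python
class InterruptModel:
    """Level-latched transfer-complete interrupt, one clock() call per wb_clk_i cycle."""

    def __init__(self):
        self.wb_int_o = False

    def clock(self, *, ie, tip, last_bit, pos_edge, wb_ack_o):
        if ie and tip and last_bit and pos_edge:
            self.wb_int_o = True    # end of frame with interrupts enabled
        elif wb_ack_o:
            self.wb_int_o = False   # any acknowledged access clears the flag
        return self.wb_int_o


if __name__ == "__main__":
    irq = InterruptModel()
    print(irq.clock(ie=True, tip=True, last_bit=True, pos_edge=True, wb_ack_o=False))
    print(irq.clock(ie=True, tip=False, last_bit=False, pos_edge=False, wb_ack_o=True))
```

The model makes the polling hazard visible: a status read that happens to be acknowledged after completion clears the interrupt even if no handler has run.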

### -- Signal Polarity and Levels --

- Reset: `rst` is asynchronous, active-high; propagated to submodules (`spi_clgen`, `spi_shift`). After reset, SCK idles low.
- Wishbone bus: `wb_clk_i` is a rising-edge clock. `wb_cyc_i`, `wb_stb_i`, `wb_we_i`, `wb_sel_i[3:0]` are active-high (only `sel[1:0]` affect the 16-bit DIVIDER). `wb_ack_o` is an active-high, single-cycle pulse on valid accesses (`cyc & stb`). `wb_err_o` is tied low. `wb_dat_o` is registered and valid in the cycle `wb_ack_o` is asserted. `wb_int_o` is an active-high level asserted at end-of-transfer when IE=1 and remains high until any Wishbone access is acknowledged.
- SPI clock (SCK from `spi_clgen`): Initializes low after reset and holds its level when disabled. While enabled, toggles with half-period = divider + 1 input clocks; divider==0 gives max rate (~clk/2). `pos_edge/neg_edge` are internal, active-high, single-cycle strobes (one clk early when divider>0; alternate each clk when divider==0). `last_clk=1` suppresses low-to-high transitions (permits only high-to-low) to shape the final edge/idle level.
- MOSI (`s_out`): Actively driven; preloaded with the first TX bit while idle so the first selected edge drives valid data.
- MISO (`s_in`): Input-only; sampled on the selected capture edge; never driven by this core.
- Edge selection: `tx_negedge=1` updates MOSI on SCK falling edge (else rising). `rx_negedge=1` samples MISO on SCK falling edge (else rising). No explicit CPOL register; default idle SCK is low.
- Chip-selects `ss_pad_o[7:0]`: Active-low outputs (0=asserted, 1=deasserted). Manual (ASS=0): `ss_pad_o` continuously reflects SS register (active-low). Auto (ASS=1): `ss_pad_o` deasserted (all 1s) when idle; asserted per SS only while TIP=1.
- Control/status bits: GO=1 starts a transfer; auto-clears to 0 at end-of-transfer. IE=1 enables `wb_int_o`. ASS=1 enables auto slave select behavior. LSB=1 selects LSB-first (0=MSB-first). TIP is active-high during a transfer. LEN=0 encodes 128 bits; otherwise LEN[6:0] encodes 1..127 bits.
- I/O drive: SCK, MOSI, and CS outputs are actively driven; MISO is input-only.

### -- Interface Timing and Handshake --

- Clock and reset: All interface timing is synchronous to the Wishbone clock. Reset is asynchronous, active high, and clears the SPI clock generator, shifter, and internal status.
- Wishbone request/ack: A request is present when `wb_cyc_i` and `wb_stb_i` are high. `wb_ack_o` is a single-cycle pulse per request when the core is not already acknowledging. `wb_err_o` is never asserted (constant 0).
- Read latency/data alignment: `wb_dat_o` is registered; read data is valid in the cycle `wb_ack_o` pulses (one-cycle latency from request).
- Byte enables: `wb_sel_i` are honored for multi-byte registers (e.g., DIVIDER[15:0]); partial writes update only the selected bytes.
- Safe-write gating: CTRL, DIVIDER, SS, and the 128-bit data buffer are writable only while idle (tip=0). Writes attempted during tip=1 are ignored, but the Wishbone handshake still completes (ack without error).
- Busy indication: tip is the transfer-in-progress flag used to gate writes and coordinate transactions. Software may poll tip via CTRL/status readback.
- Interrupt handshake: `wb_int_o` asserts synchronously when the final bit completes (tip and `last_bit` at the designated edge) and IE=1. `wb_int_o` clears on any acknowledged Wishbone access (`wb_ack_o`), whether reading or writing any register.
- SPI edge timing: SCK is generated by a divider; half-period equals divider+1 Wishbone clocks. For divider>0, pos\_edge/neg\_edge strobes assert one clock before the SCK toggle to lead launch/capture; for divider==0, strobes are synthesized each cycle to preserve per-edge timing.
- Edge selection: tx\_negedge selects the MOSI launch edge; rx\_negedge selects the MISO sample edge. On start when idle, the first TX bit is preloaded so the first SCK edge drives valid MOSI.
- Transfer termination: cnt decrements on pos\_edge; last\_bit ends the transfer (tip clears on last\_bit at pos\_edge). TX/RX clocks are gated by last to prevent extra toggles. last\_clk shapes the final SCK level for deterministic idle.
- Chip-select timing: Manual mode (ass=0) drives ss\_pad\_o directly from SS[7:0] (active-low) and may be changed only while idle. Auto mode (ass=1) asserts ss\_pad\_o for the duration of tip=1 and deasserts when idle, providing per-frame SS timing.
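From the software side, the handshake rules above suggest a simple sequence: program while idle, set go, poll until the start bit auto-clears, then read back. The Python sketch below runs that sequence against a mock bus. Everything here is illustrative: `MockSpiBus` and `transfer_word` are hypothetical, the mock loops TX word 0 straight back to RX word 0, it uses the go bit as a stand-in for tip, and only the go bit position (bit 0, as described for CTRL in this document) is assumed; other CTRL fields are passed through as an opaque value.

```python
GO_BIT = 1 << 0  # go/start bit, as described for CTRL in this document
TX_RX0, CTRL, DIVIDER = 0x00, 0x10, 0x14


class MockSpiBus:
    """Hypothetical stand-in for a Wishbone master talking to the core."""

    def __init__(self, busy_reads=3):
        self.regs = {TX_RX0: 0, CTRL: 0, DIVIDER: 0}
        self._busy_reads = busy_reads
        self._remaining = 0

    def write(self, addr, value):
        if self.regs[CTRL] & GO_BIT:
            return  # acknowledged but ignored while a transfer is active
        self.regs[addr] = value & 0xFFFFFFFF
        if addr == CTRL and value & GO_BIT:
            self._remaining = self._busy_reads

    def read(self, addr):
        value = self.regs[addr]
        if addr == CTRL and value & GO_BIT:
            self._remaining -= 1
            if self._remaining <= 0:
                self.regs[CTRL] &= ~GO_BIT  # models the end-of-frame auto-clear
        return value


def transfer_word(bus, tx_word, ctrl_fields, divider, max_polls=1000):
    """Program the core while idle, start a transfer, and poll for completion."""
    bus.write(DIVIDER, divider)
    bus.write(TX_RX0, tx_word)
    bus.write(CTRL, ctrl_fields | GO_BIT)
    for _ in range(max_polls):
        if not (bus.read(CTRL) & GO_BIT):
            return bus.read(TX_RX0)
    raise TimeoutError("transfer did not complete")


if __name__ == "__main__":
    print(hex(transfer_word(MockSpiBus(), 0xA5, ctrl_fields=0, divider=4)))
```

Note that each polling read is an acknowledged access, so a polled design of this kind would also clear a pending interrupt.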

## ARCHITECTURE

### -- Top-Level Block Overview --

spi\_top is a 32-bit Wishbone-slave SPI master that integrates two submodules: spi\_clgen (programmable SCK generator) and spi\_shift (128-bit shift engine). It exposes a single synchronous clock/reset domain (Wishbone clock), drives SCK/MOSI and captures MISO, and provides up to 8 active-low chip-select outputs. The top maps control/status and data onto a word-aligned register set decoded by wb\_adr\_i[4:2]: RX readback at 0x0..0xC (four 32-bit slices of the last 128-bit frame on reads; writes to the same offsets load the TX lanes when idle), CTRL at 0x10 (frame length, bit order, TX/RX edge selects, start/busy, interrupt enable, auto-SS), DIVIDER at 0x14 (SCK divisor), and SS at 0x18 (manual chip-select mask). Wishbone behavior includes a registered wb\_dat\_o (one-cycle read latency), wb\_ack\_o pulsing on cyc & stb, byte-lane writes honored, and wb\_err\_o tied low. To ensure safe operation, writes to CTRL/DIVIDER/SS are accepted only when idle (tip=0); CTRL lower-byte writes OR-merge the start/busy bit to prevent software from clearing it mid-transfer. spi\_clgen creates SCK and early pos\_edge/neg\_edge strobes from DIVIDER; spi\_shift uses those strobes and CTRL fields to serialize MOSI, capture MISO, track busy (tip) and signal last\_bit. The top-level auto-clears the start/busy bit at the end of a frame and asserts wb\_int\_o on the final bit when IE is set; any acknowledged Wishbone access clears the interrupt. Chip-selects can be driven manually via SS or automatically (ASS) for the duration of tip. Overall, the block arbitrates host access against active transfers, provides deterministic SCK generation across divider settings (including divider==0), and offers readback of the most recent 128-bit RX frame.

### -- Control Path and Register Decode --

- Wishbone register decode and timing
  - 32-bit word-aligned Wishbone slave; address decode uses wb\_adr\_i[4:2] (8 words total).
  - Data reads are registered: wb\_dat\_o is driven from an internal register, producing one-cycle read latency; wb\_ack\_o pulses for one cycle per qualified access (wb\_cyc\_i & wb\_stb\_i) and aligns with valid wb\_dat\_o. wb\_err\_o is tied low.
- Address map (read/write semantics)
  - 0x00, 0x04, 0x08, 0x0C: 128-bit RX/TX data window
    - Read: returns RX frame as four 32-bit words: [31:0], [63:32], [95:64], [127:96].
    - Write: when idle (!tip), loads the corresponding 32-bit TX lane; per-byte enables honored via wb\_sel\_i[3:0]. Writes while tip=1 are blocked.
  - 0x10: CTRL (14-bit effective)
    - Read: zero-extended to 32 bits.
    - Write: accepted only when !tip; lower byte honors wb\_sel\_i[0]. On lower-byte writes, bit0 (go/start) is merged by OR with existing value (software cannot clear a live start/busy). Hardware auto-clears bit0 at end-of-frame (tip && last\_bit && pos\_edge).
  - 0x14: DIVIDER (16-bit)
    - Read: zero-extended to 32 bits.
    - Write: accepted only when !tip; lower halfword honors wb\_sel\_i[1:0] (upper half ignored/reads as 0).
  - 0x18: SS (8-bit active-low chip-select mask)
    - Read: zero-extended to 32 bits.
    - Write: accepted only when !tip; lower byte honors wb\_sel\_i[0].
  - Unmapped addresses: reserved; behavior not specified (typical implementation reads 0 and ignores writes while acknowledging).

- Control path routing
  - CTRL fields (14-bit): go/start, char\_len, lsb, tx\_negedge, rx\_negedge, ie, ass; unused upper bits read as 0.
    - go/start -> spi\_shift.go (start transfer) and to spi\_clgen to kick edges when divider==0.
    - char\_len -> spi\_shift.len[6:0]; 0 encodes 128 bits, non-zero encodes 1..127 bits.
    - lsb -> spi\_shift.lsb (bit order select).
    - tx\_negedge, rx\_negedge -> spi\_shift TX/RX edge selection.
    - ie -> interrupt enable mask.
    - ass -> chip-select mode: 0=manual, 1=auto.
  - DIVIDER[15:0] -> spi\_clgen.divider (SCK half-period = divider + 1 input clocks); spi\_clgen drives pos\_edge/neg\_edge strobes to spi\_shift.
  - SS[7:0] -> chip-select drive:
    - Manual (ass=0): ss\_pad\_o mirrors SS continuously (active-low).
    - Auto (ass=1): ss\_pad\_o asserts according to SS only while tip=1; otherwise deasserted.

- Status gating and side-effects
  - tip (transfer-in-progress) from spi\_shift gates all register writes (!tip required), controls auto-SS, and qualifies end-of-transfer actions (CTRL auto-clear and interrupt set).
  - Interrupt: wb\_int\_o is asserted on (ie && tip && last\_bit && pos\_edge); it is cleared on any acknowledged Wishbone access (wb\_ack\_o).

- Readback and initialization
  - RX window returns the most recent captured 128-bit frame; CTRL/DIVIDER/SS return their stored values zero-extended to 32 bits.
  - Reset defaults are assumed to be 0 and should be confirmed by the integrator; software should initialize CTRL, DIVIDER, and SS before use.
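The write rules in this section (idle-only acceptance, byte enables, and the OR-merged start bit) can be sketched as a small register-file model. This Python sketch is an illustration under stated assumptions: registers start at 0, only the lower CTRL byte is modeled, and the class and method names are hypothetical rather than taken from the RTL.

```python
class RegisterFile:
    """Behavioral model of the CTRL low byte, DIVIDER, and SS write rules."""

    def __init__(self):
        self.ctrl = 0      # assumed reset value
        self.divider = 0   # assumed reset value
        self.ss = 0        # assumed reset value

    def write_divider(self, data, sel, tip):
        if tip:
            return  # bus access is still acknowledged, but the write is dropped
        if sel & 0b01:
            self.divider = (self.divider & 0xFF00) | (data & 0x00FF)
        if sel & 0b10:
            self.divider = (self.divider & 0x00FF) | (data & 0xFF00)

    def write_ss(self, data, sel, tip):
        if not tip and (sel & 0b01):
            self.ss = data & 0xFF

    def write_ctrl_low(self, data, sel, tip):
        if tip or not (sel & 0b01):
            return
        go = (self.ctrl | data) & 1  # OR-merge: software cannot clear a set go bit
        self.ctrl = (self.ctrl & ~0xFF) | (data & 0xFE) | go

    def end_of_frame(self):
        self.ctrl &= ~1  # hardware auto-clear of go at the final bit


if __name__ == "__main__":
    rf = RegisterFile()
    rf.write_divider(0x1234, sel=0b01, tip=False)
    print(hex(rf.divider))
```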

### -- Datapath: Shift Engine --

The shift engine (spi\_shift) is the serial datapath of spi\_top. It converts a 128-bit parallel buffer into MOSI and simultaneously captures MISO back into the same buffer, with programmable frame length, bit order, and launch/capture edge selection.

- Parallel buffer and host access: A single 128-bit data register serves as both TX source and RX destination. While idle (tip=0), software writes update selected 32-bit quarters via latch[3:0] with per-byte enables byte\_sel[3:0] from p\_in[31:0]. p\_out[31:0] reads a 32-bit slice of the 128-bit buffer (slice mapping is internal to this block). All host writes are blocked while tip=1.
- Start/stop and length: A rising go when idle asserts tip to begin a frame. char\_len (len[6:0]) encodes bit count: len=0 means 128 bits; len=1..127 selects that many bits. On start, cnt loads 8'h80 for 128-bit frames or {1'b0,len} otherwise. cnt is an 8-bit down-counter that decrements on each pos\_edge while tip=1; last is effectively (cnt==1). Transfer completes at (tip && last && pos\_edge), deasserting tip; spi\_top uses tip/last to auto-clear start and signal done.
- Bit timing and edges: The block consumes pos\_edge and neg\_edge strobes from spi\_clgen. For divider>0, strobes are presented one input-clock early; for divider==0, strobes are synthesized per-cycle to maintain per-edge timing. tx\_clk is a one-cycle strobe selecting rising or falling SCK via tx\_negedge and is gated with last to prevent an extra end-of-frame toggle. rx\_clk is a one-cycle strobe selecting sampling edge via rx\_negedge (aligned to s\_clk) and is gated with last to avoid oversampling the terminal bit.
- Serialization (MOSI): While idle, s\_out is preloaded with the first TX bit so the first SCK edge launches valid data. On each tx\_clk, s\_out updates from data[tx\_bit\_pos]. tx\_bit\_pos is computed from len, cnt, and lsb to support MSB-first or LSB-first ordering across arbitrary frame lengths.
- Deserialization (MISO): On each rx\_clk, data[rx\_bit\_pos] <= s\_in, capturing MISO into the shared buffer. rx\_bit\_pos is computed from len, cnt, lsb, and rx\_negedge so bits land in correct positions for the selected bit order and sampling phase.
- Bit order: lsb selects LSB-first vs MSB-first, applied consistently in both TX (tx\_bit\_pos) and RX (rx\_bit\_pos) index calculations for any frame length.
- Reset behavior: rst clears cnt, tip, and s\_out, and zeroes the 128-bit data register.
- Integration: spi\_top provides go, char\_len (len), lsb, tx\_negedge, rx\_negedge; and receives tip and last for auto-clear/interrupt. After completion, software reads the received 128-bit frame via spi\_top's RX address map. Chip-select timing and interrupt policy are managed by spi\_top; spi\_shift focuses on bit timing and data movement.
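
The length encoding and bit ordering described above can be sketched as a small behavioral model in Python. This is an illustration only, not the RTL: the function names are hypothetical, and the index order is an assumption (the active bits are taken to occupy positions [n-1:0], with MSB-first starting at bit n-1 and LSB-first starting at bit 0). The real tx\_bit\_pos/rx\_bit\_pos expressions should be checked against the RTL.

```python
def frame_bits(char_len: int) -> int:
    """Decode the 7-bit length field: 0 means 128 bits, 1..127 mean that many bits."""
    if not 0 <= char_len <= 0x7F:
        raise ValueError("char_len is a 7-bit field")
    return 128 if char_len == 0 else char_len


def initial_cnt(char_len: int) -> int:
    """Value loaded into the 8-bit counter at start: 8'h80 or {1'b0, len}."""
    return 0x80 if frame_bits(char_len) == 128 else char_len


def bit_order(char_len: int, lsb_first: bool) -> list[int]:
    """Assumed buffer index visited for each successive bit of the frame."""
    n = frame_bits(char_len)
    return list(range(n)) if lsb_first else list(range(n - 1, -1, -1))


def serialize(data: int, char_len: int, lsb_first: bool) -> list[int]:
    """MOSI bit sequence for one frame taken from the parallel buffer."""
    return [(data >> i) & 1 for i in bit_order(char_len, lsb_first)]


def deserialize(bits: list[int], char_len: int, lsb_first: bool, buf: int = 0) -> int:
    """Write received bits into the shared buffer; untouched bits keep their value."""
    for i, b in zip(bit_order(char_len, lsb_first), bits):
        buf = (buf & ~(1 << i)) | ((b & 1) << i)
    return buf
```

For example, `serialize(0b1011, 4, lsb_first=False)` yields the bits in the order 1, 0, 1, 1, while the LSB-first setting yields 1, 1, 0, 1.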

#### -- Datapath: Clock Generator (spi\_clgen) --

- Purpose: Derive SPI serial clock (SCK) from clk\_in and produce single-cycle edge-aligned strobes for TX/RX timing.
- Inputs: clk\_in, rst, enable, divider[15:0], last\_clk, go.
- Outputs: clk\_out (SCK), pos\_edge, neg\_edge (strobes in clk\_in domain; 1 clk\_in cycle wide).
- Core operation:
  - 16-bit down-counter cnt. While enable==1, cnt decrements each clk\_in cycle; when cnt==0, clk\_out toggles and cnt reloads to divider.
  - When enable==0, cnt reloads to divider and clk\_out holds its level; strobes suppressed (except divider==0 go-assisted start on re-enable).
  - Half-period = divider + 1 clk\_in cycles; SCK frequency = clk\_in / (2\*(divider + 1)).
  - Edge strobes:
    - divider > 0: strobes asserted one clk\_in cycle before the actual SCK toggle (cnt==1).
      - pos\_edge: enable==1 && clk\_out==0 && cnt==1 (precedes rising edge).
      - neg\_edge: enable==1 && clk\_out==1 && cnt==1 (precedes falling edge).
    - divider == 0 (fastest mode): clk\_out toggles every clk\_in when enable==1; generate strobes every cycle aligned to the next toggle.
      - pos\_edge aligns to the cycle preceding a low→high transition; a go term seeds an initial pos\_edge when re-enabling to guarantee a starting cadence.
      - neg\_edge aligns to the cycle preceding a high→low transition.
  - Last-edge gating:
    - last\_clk suppresses the next low→high transition near transfer end while still allowing the high→low transition, shaping the final observed edge/idle level.
  - Reset:
    - On rst: cnt=16'hFFFF; clk\_out=0; pos\_edge=0; neg\_edge=0.
  - Integration:
    - divider sourced from DIVIDER[15:0] register; writes accepted only when the core is idle (!tip) to avoid mid-transfer changes.
    - pos\_edge/neg\_edge feed the shift engine for TX/RX edge selection; pos\_edge provides the cadence for bit counting and end-of-frame detection.
    - clk\_out is driven to pads; toggling is gated by enable and last\_clk per transfer activity.
- Timing notes:
  - For divider>0, strobes are generated one clk\_in cycle ahead of the SCK toggle to provide launch/capture lead time.
  - divider==0 yields approximately clk\_out = clk\_in/2; ensure timing closure for this fastest configuration.
- Polarity:
  - CPOL is not explicitly modeled; idle-level behavior across frames is governed by enable deassertion and last\_clk gating.
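
The counter and strobe rules above can be illustrated with a cycle-level Python sketch for the divider > 0 case. It is a simplified model under stated assumptions, not the RTL: enable is held high, last\_clk gating and the divider == 0 special case are left out, cnt starts at the divider value (as if just reloaded while disabled), and the strobes are modeled as registered outputs, so each strobe is visible in the cycle immediately before clk\_out changes. The function name `simulate_clgen` is hypothetical.

```python
def simulate_clgen(divider: int, cycles: int) -> list[tuple[int, int, int]]:
    """Return (clk_out, pos_edge, neg_edge) per clk_in cycle for divider > 0."""
    if not 0 < divider <= 0xFFFF:
        raise ValueError("this sketch covers 1 <= divider <= 0xFFFF only")
    cnt = divider            # assumed: reloaded to divider while enable == 0
    clk_out = 0              # SCK comes out of reset low
    pos_edge = neg_edge = 0
    trace = []
    for _ in range(cycles):
        trace.append((clk_out, pos_edge, neg_edge))
        # Next-state values, all computed from the current state.
        next_pos = int(clk_out == 0 and cnt == 1)   # precedes a rising edge
        next_neg = int(clk_out == 1 and cnt == 1)   # precedes a falling edge
        next_clk = clk_out ^ 1 if cnt == 0 else clk_out
        next_cnt = divider if cnt == 0 else cnt - 1
        cnt, clk_out, pos_edge, neg_edge = next_cnt, next_clk, next_pos, next_neg
    return trace
```

With `divider=2` the model holds each SCK level for three clk\_in cycles (divider + 1), and each pos\_edge pulse appears in the cycle just before clk\_out goes high.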

#### -- Interrupt Generation and Clear Logic --

wb\_int\_o asserts at the end-of-transfer strobe, when IE=1 and (tip && last\_bit && pos\_edge) is true. Here, IE is the interrupt-enable bit of the CTRL register (spi\_top), tip and last\_bit come from spi\_shift, and pos\_edge is the final-bit strobe from spi\_clgen. The assertion is aligned to the final bit's pos\_edge; tip deasserts immediately after, yielding at most one interrupt per frame. pos\_edge is produced for all DIVIDER values, including 0, so interrupt timing is deterministic. If IE=0, no interrupt is generated.

wb\_int\_o is sticky and remains high until cleared by any acknowledged Wishbone access: a wb\_ack\_o pulse clears the interrupt regardless of IE state or SPI busy/idle. Reads or writes to any decoded register clear it; writes may be blocked while busy, but wb\_ack\_o still pulses and clears. wb\_err\_o is not used for clearing (always 0), and clearing via wb\_ack\_o does not modify IE. Typical software clears the interrupt by reading RX or any other register during the ISR.

Independently, the CTRL start/busy bit auto-clears at the same end-of-transfer event and does not depend on bus activity. Auto-SS (ASS) and chip-select timing are independent of interrupt generation and clearing.
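
The set/clear behavior described above can be modeled in a few lines of Python. This is a sketch, not the RTL: the class name `IrqModel` is hypothetical, and the priority when the end-of-transfer event and wb\_ack\_o fall in the same cycle is an assumption (set wins here), since the text does not state it.

```python
class IrqModel:
    """Sticky transfer-done interrupt: set at end of frame, cleared by any ack."""

    def __init__(self) -> None:
        self.wb_int_o = 0  # deasserted after reset

    def clock(self, ie: int, tip: int, last_bit: int, pos_edge: int, wb_ack_o: int) -> int:
        """Advance one core clock and return the new wb_int_o level."""
        if ie and tip and last_bit and pos_edge:
            self.wb_int_o = 1          # end-of-transfer event (assumed priority)
        elif wb_ack_o:
            self.wb_int_o = 0          # any acknowledged access clears it
        return self.wb_int_o
```

Calling `clock` with the end-of-transfer inputs raises the line, idle cycles leave it high, and the first cycle with `wb_ack_o=1` drops it again.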

#### -- Chip-Select Control (Manual/Auto) --

Up to 8 active-low chip-selects are driven on ss\_pad\_o[7:0]. Each bit of the SS register (0x18) maps 1:1 to a CS line; writing a 0 asserts that CS (low), writing a 1 deasserts it (high). The ASS bit in the CTRL register (0x10) selects the CS control mode. Manual mode (ASS=0): ss\_pad\_o continuously mirrors SS (active-low) regardless of transfer state; software is responsible for asserting/deasserting CS and may hold CS across multiple frames. Writes to SS are accepted only when the SPI is idle (tip=0) to prevent mid-transfer glitches; changes take effect immediately after acceptance. Multiple CS lines can be asserted simultaneously by programming multiple zeros in SS (use only if attached devices support it). Auto mode (ASS=1): when idle (tip=0) ss\_pad\_o is forced deasserted (all 1s). On transfer start (tip 0→1 after go), ss\_pad\_o asserts per SS and remains asserted for the entire frame; on transfer completion (tip 1→0 after the final bit), ss\_pad\_o is immediately deasserted (all 1s), producing exactly one CS pulse per frame. SS must be preprogrammed while idle and is not changed during an active transfer. Interrupts do not affect chip-select timing; assertion/deassertion is driven solely by transfer progress (tip transitions).
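
As a compact restatement of the two modes, the pad value can be written as a pure function of the SS register, the ASS bit, and tip. The following Python sketch follows the active-low register convention described in this section; the function name `ss_pad_o` is used here only for illustration.

```python
def ss_pad_o(ss: int, ass: bool, tip: bool) -> int:
    """Chip-select pad levels (active-low, 8 bits) for the given mode and state."""
    ss &= 0xFF
    if ass and not tip:
        return 0xFF   # auto mode, idle: every CS line deasserted
    return ss         # manual mode, or auto mode during a transfer
```

With `ss=0xFE` (line 0 selected), manual mode drives 0xFE at all times, while auto mode drives 0xFF when idle and 0xFE only while tip=1.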

#### -- Clocking and Timing Domains --

Single core clock domain: All internal logic (Wishbone register file, control, interrupt, spi\_clgen, spi\_shift) runs synchronously on a single system/bus clock (wb\_clk\_i/clk\_in). wb\_dat\_o is registered (one-cycle read latency); wb\_ack\_o and wb\_int\_o are generated synchronously to this clock.

SPI clock and strobes: SCK is derived from the core clock via a reloadable divider. With last\_clk gating inactive, SCK toggles every (divider + 1) core cycles ( $f_{SCK} = f_{core} / (2 \times (\text{divider} + 1))$ ). pos\_edge/neg\_edge are one-core-cycle pulses generated in the core domain to mark SCK edges: for divider > 0 they lead the actual SCK toggle by one core cycle; for divider == 0 they occur every cycle without the one-cycle lead. last\_clk can suppress the next rising transition to control the final edge/idle level; SCK initializes low after reset.

Shifter timing domain: The shifter builds tx\_clk (launch) and rx\_clk (capture) from pos\_edge/neg\_edge according to tx\_negedge/rx\_negedge and bit order. These strobes are core-synchronous and align with intended SCK edges. The internal bit counter decrements on pos\_edge while tip=1; transfer completes on (pos\_edge && last\_bit). External MISO is captured in the core clock domain on rx\_clk; board-level timing must ensure the SCK→MISO return path meets setup/hold relative to rx\_clk.

Chip-select timing: In auto mode (ASS=1), ss\_pad\_o asserts with tip and deasserts synchronously at the final pos\_edge. In manual mode (ASS=0), ss\_pad\_o follows the programmed SS register synchronously to the core clock and is independent of SCK.

Reset behavior: Asynchronous, active-high assertion; deassertion should be synchronized to the core clock. spi\_clgen uses this async reset; spi\_shift and top-level state share the same reset. SCK comes out of reset low.

Clock domain crossings: There are no internal asynchronous CDCs. SCK is an output derived from the core clock; datapath uses core-synchronous strobes rather than sampling on SCK. Special case at divider==0 (SCK ≈ fcore/2) reduces launch/capture margin to core-cycle granularity; ensure external device and PCB timing support this.

#### -- Data Widths and Buffers --

Wishbone data bus is 32 bits; all register reads are zero-extended to 32 bits. wb\_sel\_i is 4 bits; narrow registers ignore unused upper write bits, and DIVIDER supports per-byte writes via wb\_sel\_i[1:0]. wb\_dat\_o is a registered output with one-cycle latency after decode. The sole data buffer is a single 128-bit frame register in spi\_shift shared by TX/RX; software accesses it as four 32-bit words at 0x00, 0x04, 0x08, 0x0C, with RX readback mapping 0x00→[31:0], 0x04→[63:32], 0x08→[95:64], 0x0C→[127:96]. Host writes to the 128-bit buffer are allowed only when idle (tip=0). There are no FIFOs; transfers are single-shot and software-managed. Parallel data lanes p\_in/p\_out are 32 bits; lane selection uses latch[3:0] and per-byte enables use byte\_sel[3:0]. Serial I/O widths: MOSI (s\_out) 1 bit, MISO (s\_in) 1 bit, SCK (clk\_out) 1 bit. Chip-select bus is 8 bits (active-low) driving ss\_pad\_o[7:0]. Interrupt output wb\_int\_o is 1 bit. Edge qualifier signals pos\_edge and neg\_edge are single-cycle 1-bit pulses in the clk domain.

## OPERATION

#### -- Reset and Initialization --

Reset polarity and domains: A single active-high rst is distributed to all submodules. spi\_clgen treats rst asynchronously; spi\_shift clears internal state on rst.

Post-reset top-level state:

- Core idle: tip=0; SPI clock (SCK/clk\_out)=0; pos\_edge=0; neg\_edge=0; MOSI (s\_out)=0; RX/TX buffers cleared.
- Interrupt: wb\_int\_o deasserted; it asserts only at end of transfer when IE=1 and is cleared by any acknowledged Wishbone access.
- Wishbone: wb\_dat\_o is registered with unspecified reset value; wb\_ack\_o acknowledges per cyc/stb; no specific reset defaults are defined.

Submodule reset specifics:

- spi\_clgen: cnt=16'hFFFF; clk\_out=0; pos\_edge=0; neg\_edge=0. When not enabled, cnt reloads DIVIDER and SCK holds; no toggles occur until enable is asserted.

- spi\_shift: counters clear; tip=0; s\_out=0; 128-bit shift buffer zeroed. While idle, the first TX bit is preloaded on MOSI before the first SCK edge after start.

Initialization sequence (required before starting transfers):

1) Confirm tip=0 (idle).

2) Program DIVIDER[15:0] to the desired SCK rate.

3) Program CTRL fields (char\_len/len, lsb, tx\_negedge, rx\_negedge, ie, ass) with go=0.

4) Program SS[7:0] (active-low chip-select mask) for the target device(s).

5) Load TX data lanes while !tip.

6) Assert go in CTRL to start; the shifter preloads MOSI and the clock generator begins strobes/SCK per DIVIDER and edge settings.

7) On completion, the busy/start bit auto-clears; wb\_int\_o asserts if IE=1 and is cleared by any acknowledged Wishbone access.

Special/unspecified behaviors and cautions:

- DIVIDER==0 generates maximum-rate strobes; fully program CTRL and SS before asserting go to avoid unintended edges.

- CTRL lower-byte write applies a sticky OR to bit0 (go); do not rely on clearing it during an active transfer.

- Auto SS mode (ass=1) deasserts CS when idle; manual SS mode (ass=0) drives ss\_pad\_o directly from SS—program SS before start to keep CS inactive during initialization.

- last\_clk input reset/default behavior is unspecified; software should not assume a defined value at reset.

#### -- Bus Handshake and Access Timing --

- Interface: Wishbone Classic slave, zero-wait-state.

- Acknowledge: wb\_ack\_o pulses for exactly one cycle when wb\_cyc\_i & wb\_stb\_i are asserted; internal one-shot prevents multi-cycle acknowledges while a request is held.

- Errors: wb\_err\_o is permanently 0; no error/retry signaling.

- Read timing: Address decode occurs in the request cycle; wb\_dat\_o is registered and valid in the same cycle as wb\_ack\_o. Reads complete in one cycle.

- Write timing: Writes are acknowledged in one cycle with the same wb\_ack\_o behavior. Byte enables (wb\_sel\_i) are honored; DIVIDER[15:0] supports partial updates via wb\_sel\_i[1:0]. CTRL lower byte uses a sticky OR merge on bit0 (start/busy) to prevent software from clearing an active transfer.

- Commit gating: CTRL, DIVIDER, and SS writes are bus-acknowledged immediately but only take effect when !tip (idle). TX shifter lane writes are blocked while tip=1; the bus still returns ack without delay.

- Interrupt interaction: When enabled, wb\_int\_o asserts at end of transfer and is cleared by any acknowledged Wishbone access; it deasserts in the same cycle as wb\_ack\_o for that access.

- Addressing and visibility: Address decode uses wb\_adr\_i[4:2]; accesses are 32-bit word-aligned. RX reads (0x0/0x4/0x8/0xC) return the most recent completed 128-bit frame in 32-bit slices; CTRL, DIVIDER, and SS read back current values.

- Stall behavior: No stall/wait-state mechanism; every access (read or write) completes deterministically in one cycle.

#### -- Programming Sequence --

1) Reset and verify idle: After reset, ensure the core is idle (tip=0). Software may poll CTRL.go/busy; hardware auto-clears go/busy at the end of a transfer and software cannot clear go while busy (lower-byte OR-merge).

2) Configure clock and mode (idle-only writes): Write DIVIDER at 0x14 (wb\_sel\_i[1:0] honored; divider=0 gives maximum SCK). Write CTRL at 0x10 without asserting go: set char\_len (1..127 bits; 0 encodes 128 bits), lsb (bit order), tx\_negedge/rx\_negedge (launch/capture edges), ie (interrupt enable), and ass (SS mode). Write SS at 0x18 to select active-low target device(s). Writes to DIVIDER/CTRL/SS are accepted only when tip=0.

3) Load transmit frame (idle-only writes): Write the 128-bit TX payload as four 32-bit words to 0x0, 0x4, 0x8, 0xC (byte enables honored). TX writes are blocked while tip=1. Reads at these addresses return the most recent RX data.

4) Start the transfer: While idle, assert CTRL.go=1 in a write to 0x10. In auto-SS mode (ass=1), ss\_pad\_o is driven only while tip=1; in manual mode (ass=0), ss\_pad\_o follows SS continuously.

5) Wait for completion: Either poll CTRL.go/busy until it clears (tip goes low) or enable IE and wait for wb\_int\_o to assert at end-of-frame.

6) Read results and clear interrupt: Read RX[127:0] from 0x0, 0x4, 0x8, 0xC. Any acknowledged Wishbone access (including these reads) clears wb\_int\_o.

7) Repeat transfers: While idle, update TX data and control fields as needed, then reassert go. Avoid writing DIVIDER/CTRL/SS while tip=1; such writes are gated/ignored.

Address map (wb\_adr\_i[4:2]): 0x0/0x4/0x8/0xC = RX/TX words, 0x10 = CTRL, 0x14 = DIVIDER, 0x18 = SS.
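
The sequence above can be sketched as a polled driver routine in Python. Everything except the register offsets and the go bit position (bit0, as described for CTRL) is an assumption made for illustration: `FakeSpi` is a hypothetical stand-in for the Wishbone bus that loops TX data back as RX data and reports busy for a fixed number of CTRL reads, and `ctrl_cfg` is an opaque caller-supplied value because the positions of the other CTRL fields are not given in this document.

```python
CTRL, DIVIDER, SS = 0x10, 0x14, 0x18
DATA = (0x00, 0x04, 0x08, 0x0C)
GO = 1 << 0  # start/busy bit; bit0 per the CTRL description


class FakeSpi:
    """Hypothetical bus stand-in: loopback data, busy for `busy_reads` CTRL reads (>= 1)."""

    def __init__(self, busy_reads: int = 2) -> None:
        self.regs = {addr: 0 for addr in DATA + (CTRL, DIVIDER, SS)}
        self.busy_reads = busy_reads
        self._busy_left = 0

    def write(self, addr: int, value: int) -> None:
        if self._busy_left:               # tip=1: acknowledged but ignored
            return
        self.regs[addr] = value & 0xFFFFFFFF
        if addr == CTRL and value & GO:
            self._busy_left = self.busy_reads

    def read(self, addr: int) -> int:
        value = self.regs[addr]
        if addr == CTRL and self._busy_left:
            self._busy_left -= 1
            if self._busy_left == 0:
                self.regs[CTRL] &= ~GO    # models the hardware auto-clear
        return value


def spi_transfer(bus, tx: int, divider: int, ctrl_cfg: int, ss_mask: int,
                 max_polls: int = 1000) -> int:
    """Run one polled frame following steps 1-6 and return the 128-bit RX value."""
    if bus.read(CTRL) & GO:                        # 1) verify idle
        raise RuntimeError("core is busy")
    bus.write(DIVIDER, divider & 0xFFFF)           # 2) clock, mode, chip select
    bus.write(CTRL, ctrl_cfg & ~GO)
    bus.write(SS, ss_mask & 0xFF)
    for i, addr in enumerate(DATA):                # 3) load TX words
        bus.write(addr, (tx >> (32 * i)) & 0xFFFFFFFF)
    bus.write(CTRL, ctrl_cfg | GO)                 # 4) start
    for _ in range(max_polls):                     # 5) poll go/busy
        if not (bus.read(CTRL) & GO):
            break
    else:
        raise TimeoutError("transfer did not complete")
    return sum(bus.read(a) << (32 * i) for i, a in enumerate(DATA))  # 6) read RX
```

On real hardware the `bus` object would be replaced by whatever memory-mapped access layer the platform provides; the bounded poll loop is a design choice to avoid hanging if the core never clears go.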

#### -- Transfer Lifecycle (Start/Busy/Complete) --

Lifecycle phases and behavior:

- Idle/Program: While tip=0, software may preload TX data, program DIVIDER, CTRL (including len/order/edge/ass/IE), and SS. The start control bit (go) resides in CTRL.
- Start: Writing CTRL with go=1 when tip=0 begins a transfer. The shifter asserts tip=1 and preloads the first MOSI bit before the first SCK edge. The clock generator starts issuing pos\_edge/neg\_edge strobes; at divider==0 it synthesizes the initial strobe from go to start cleanly at maximum rate.
- Busy/Active: While tip=1, bit timing proceeds. The shifter decrements its bit counter on pos\_edge and shifts/samples on the selected edges (tx\_negedge/rx\_negedge) in MSB- or LSB-first order. Host writes to CTRL, DIVIDER, SS, and TX lanes are blocked to preserve timing. The start/busy bit in CTRL is sticky on writes (OR semantics) and cannot be cleared mid-transfer; busy can be polled via CTRL read.
- Complete: On the final bit boundary (tip && last\_bit && pos\_edge), the shifter deasserts tip, hardware auto-clears CTRL start/busy, and the clock generator is gated (last\_clk) so SCK stops cleanly. If IE=1, wb\_int\_o is asserted at this event and is cleared by any Wishbone access that yields wb\_ack\_o. In auto-SS mode (ass=1), ss\_pad\_o deasserts when tip goes low. RX registers then hold the completed frame.
- Determinism/Safety: Frame length is set by CTRL.len (0 encodes 128 bits). No mid-transfer reprogramming or software abort is supported. End-of-frame auto-clear provides a single-shot start per transaction and a consistent busy/done handshake for polling or interrupt-driven use.

#### -- Edge Selection and Bit Timing --

Edge timing is coordinated by spi\_clgen (SCK generation) and spi\_shift (TX/RX alignment) using programmable edge strobes and selectable launch/capture edges. For divider > 0, single-cycle pos\_edge/neg\_edge strobes are generated one clk\_in cycle before the actual SCK rising/falling edges (pos\_edge when !clk\_out && cnt==1; neg\_edge when clk\_out && cnt==1), giving the datapath one cycle of setup margin. For divider == 0 (maximum rate), the one-cycle lead is removed and pos\_edge/neg\_edge are synthesized each clk\_in cycle, alternating with clk\_out to preserve per-bit timing at minimal margin.

The TX launch edge is selected by CTRL.tx\_negedge: 0 = rising (use pos\_edge), 1 = falling (use neg\_edge). TX updates are driven by the early strobes, so data is valid one clk\_in cycle before the physical SCK toggle when divider > 0; while idle, s\_out is preloaded with the first bit so the first SCK edge sees valid MOSI. The RX capture edge is selected by CTRL.rx\_negedge: 0 = rising, 1 = falling. RX sampling is aligned to the actual chosen SCK edge by combining s\_clk with the selected edge strobe; sampling is inhibited on the final bit to avoid oversampling.

The bit counter decrements on pos\_edge while tip=1; last\_bit detection, transfer completion (tip clear), and interrupt assertion occur on (last\_bit && pos\_edge), independent of launch/capture edge selections. TX/RX strobes are masked on the last bit to prevent extra updates/samples. SCK toggles only when enable=1 and cnt==0; last\_clk gating can suppress the final low→high transition to shape the end-of-transfer edge and idle level (no explicit CPOL control; idle depends on last toggle and last\_clk policy).

With divider > 0, expect one clk\_in cycle of setup before the SCK transition; with divider == 0, updates/samples occur every clk\_in cycle with minimal external timing margin. Changes to CTRL.tx\_negedge, CTRL.rx\_negedge, and divider are accepted only while idle (tip=0) to ensure deterministic edge timing.

#### -- Readback Behavior --

Reads are supported at all addresses with a one-cycle registered latency: wb\_dat\_o is valid in the cycle wb\_ack\_o is asserted, and wb\_err\_o is never asserted for reads. Any acknowledged Wishbone access (including reads of any address) clears the interrupt (wb\_int\_o); reads have no other side effects and do not stall an ongoing SPI transfer. Address map readbacks: 0x00, 0x04, 0x08, 0x0C return the 128-bit RX buffer in little-to-big 32-bit word order (rx[31:0], rx[63:32], rx[95:64], rx[127:96]); no byte reordering is performed by the core. 0x10 returns the current 14-bit CTRL value zero-extended to 32 bits; the start/busy bit is auto-cleared by hardware at end of transfer. 0x14 returns the current 16-bit DIVIDER value zero-extended to 32 bits (divider==0 reads back as 0). 0x18 returns the current 8-bit SS mask zero-extended to 32 bits. RX readback coherency: the RX buffer is updated as bits arrive and is exposed combinationally to the bus, so multi-word reads during an active transfer (tip=1) can tear and individual words may change between accesses. For a coherent frame, software should read RX only after transfer complete (tip=0 or after servicing the transfer-done interrupt). After reset, the RX buffer reads as 0 until the first transfer completes. Reads are not gated by tip; they always reflect the live internal state.

#### -- Chip-Select Modes and Behavior --

Chip-select outputs ss\_pad\_o[7:0] are active-low (0 = asserted, 1 = deasserted), with up to eight CS lines allowed and multiple lines assertable simultaneously. The SS register (address 0x18) holds the programmed CS mask and reads return the register value; writes to SS and the CTRL.ass mode bit are accepted only when the SPI engine is idle (!tip) to prevent CS glitches. Modes: Manual (ass=0) — ss\_pad\_o continuously equals SS; software is responsible for asserting/deasserting CS and may hold CS across one or more transfers. Auto (ass=1) — when a transfer is in progress (tip=1), ss\_pad\_o reflects SS; when idle (tip=0), all CS outputs are forced deasserted (all 1s) regardless of SS contents. On transfer completion (tip clears), CS deasserts immediately in auto mode, while in manual mode it remains as programmed. CS is guaranteed stable for the entire frame length (1–128 bits) due to the !tip write gating and tip-driven auto behavior. If SS is all 1s, no device is selected even during auto mode transfers (SCK/MOSI may toggle without a target). Reading SS always returns the register value; in auto mode while idle the pad state may not match SS. SS reset value is not specified here.

#### -- Interrupt Handling --

- Interrupt line: Single level-type output wb\_int\_o.
- Source: “Transfer done” only; no underrun/overflow/error interrupts; wb\_err\_o=0.
- Assert condition: Set on the final bit boundary when IE=1: (ie && tip && last\_bit && pos\_edge). Assertion is synchronous with pos\_edge; spi\_shift deasserts tip at the same boundary.
- Enable/disable: Controlled by IE in CTRL (address 0x10). CTRL writes (including IE changes) are accepted only while idle (!tip). If IE=0, no interrupt is generated; software should poll busy/done via CTRL readback (or wrapper STATUS).
- Clear mechanism: Cleared by any acknowledged Wishbone access (wb\_ack\_o), regardless of address or read/write direction. Reading any register (e.g., RX/CTRL/DIVIDER/SS) or performing any write clears a pending interrupt. Clearing does not require the SPI to be idle.

- Persistence: wb\_int\_o remains asserted after completion until a Wishbone transaction receives wb\_ack\_o. No throttling—each completed frame reasserts wb\_int\_o when IE=1.
- Timing details: pos\_edge from spi\_clgen is used for end-of-transfer detection and is valid even when divider==0. last\_bit is derived in spi\_shift from its down-counter; tip transitions low on (last && pos\_edge), aligning with interrupt set timing.
- Interaction notes: Auto chip-select (ASS) affects ss\_pad\_o timing only and does not influence interrupt generation. Wishbone acknowledgments occur for valid cyc/stb, making clear behavior deterministic.

#### -- Error Handling --

The core does not report errors. Wishbone wb\_err\_o is tied low and wb\_ack\_o asserts for all qualifying accesses, even if a write has no internal effect. While a transfer is in progress (tip=1), writes to CTRL, DIVIDER, SS, and TX are accepted on the bus but ignored internally; no error or retry is signaled. CTRL.go is OR-merged on lower-byte writes and cannot be cleared by software mid-transfer; hardware auto-clears it at frame end. The RX register always exposes the last completed frame and may be overwritten by a new transfer without any overrun/overflow indication. The interrupt (wb\_int\_o) indicates completion only; it is cleared by any acknowledged Wishbone read or write, and unintended accesses that clear it are not treated as errors. Programming bounds are permissive: DIVIDER=0 is valid (fastest SCK) and SS is 8-bit; upper-bit writes are ignored without error. An asynchronous reset aborts any active transfer and returns the core to idle with no error reporting; auto-SS deasserts with tip=0. Accesses outside the documented address map are not flagged as errors; returned data is undefined/zero depending on implementation. The core provides no protocol fault detection (e.g., MISO inactivity, SCK violations), parity/CRC, timeouts, or error counters. There are no FIFOs, hence no underrun/overrun flags.

## REGISTERS

#### -- Address Map --

32-bit Wishbone slave; word-aligned decode using wb\_adr\_i[4:2]. Offsets below are relative to the module base; byte enables are honored as noted. Writes to DATAx/CTRL/DIVIDER/SS take effect only when idle (tip=0); reads are always allowed.

- 0x00: DATA0 — RX[31:0] read; TX[31:0] write when idle. Per-byte write enables; writes ignored during tip=1.
- 0x04: DATA1 — RX[63:32] read; TX[63:32] write when idle. Per-byte write enables; writes ignored during tip=1.
- 0x08: DATA2 — RX[95:64] read; TX[95:64] write when idle. Per-byte write enables; writes ignored during tip=1.
- 0x0C: DATA3 — RX[127:96] read; TX[127:96] write when idle. Per-byte write enables; writes ignored during tip=1.
- 0x10: CTRL — 14-bit control, R/W when idle; reads are zero-extended to 32 bits. Lower byte writes are merged (start/go bit is sticky/ORed); hardware auto-clears start/busy at end of transfer.
- 0x14: DIVIDER — 16-bit SCK divider, R/W when idle. Writes use only lower 16 bits (honor wb\_sel\_i[1:0]); upper 16 bits ignored on write and read as zero.
- 0x18: SS — 8-bit chip-select mask, R/W when idle. Only lower byte writable; upper 24 bits read as zero.

- 0x1C: Reserved/unused — do not use; behavior is implementation-defined if accessed.

Notes: Read data is registered (appears one cycle after access). wb\_ack\_o pulses once per valid access; wb\_err\_o is always 0.

#### -- CTRL Register --

##### CTRL (Control) Register

- Offset: 0x10 (32-bit Wishbone word)
- Width: 32; active control bits [13:0]. Read returns zero-extended value; writes to [31:14] are ignored.
- Reset: 0x0000\_0000 (all fields deasserted)
- Access: Read/Write, but writes are accepted only when SPI is idle (!tip). Wishbone byte enables (wb\_sel\_i) are honored.

Fields (14-bit control payload; exact bit positions per RTL):

- go (start/busy): Write 1 (via lower-byte) when idle to start a transfer. Lower-byte writes OR-merge this bit (writing 0 cannot clear it). Hardware auto-clears it at end-of-transfer (tip && last\_bit && pos\_edge). Read reflects the current busy/start state.
- char\_len[6:0]: Frame length provided to the shifter. Value 0 encodes a 128-bit frame; values 1..127 encode N-bit frames.
- lsb: Bit order; 0 = MSB-first, 1 = LSB-first.
- tx\_negedge: TX launch edge; 0 = launch on rising SCK (pos\_edge), 1 = launch on falling SCK (neg\_edge).
- rx\_negedge: RX sample edge; 0 = sample on rising SCK (pos\_edge), 1 = sample on falling SCK (neg\_edge).
- ie: Interrupt enable for transfer completion. When set, wb\_int\_o asserts when the final bit completes. The interrupt is cleared by any acknowledged Wishbone access.
- ass: Auto slave-select; 0 = manual mode (ss\_pad\_o follows SS register continuously, active-low), 1 = automatic mode (ss\_pad\_o asserted only while a transfer is active, tip=1).
- reserved: Remaining bit(s); read as 0, write as 0.

Behavior and notes:

- Write gating: All CTRL field updates (including edge selects, length, bit order) are ignored while a transfer is active (tip=1) to prevent mid-frame glitches. Byte-lane behavior applies; only the lower-byte write OR-merges go.
- Readback: Returns the current latched control fields, zero-extended to 32 bits.
- End-of-transfer: Hardware clears go on the last bit boundary; if ie=1, wb\_int\_o is asserted at completion and is cleared by any Wishbone transaction that generates wb\_ack\_o.
- SS interaction: With ass=1, ss\_pad\_o asserts only during an active transfer; with ass=0, ss\_pad\_o reflects the SS register regardless of tip.
- Implementation note: CTRL contains 14 effective control bits; bit positions should be confirmed in the RTL. char\_len is 7 bits (len[6:0]).
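
The write rules for CTRL (idle-only gating, byte enables, and the OR-merge on go) can be summarized in a short Python model. This is a sketch under assumptions: it takes go at bit0 and the 14 implemented bits at [13:0] as stated in this document, assumes the second byte lane carries bits [13:8], and uses a hypothetical function name `ctrl_write`. Field positions other than go are deliberately not modeled.

```python
CTRL_MASK = 0x3FFF   # 14 implemented control bits
GO = 0x0001          # start/busy bit (bit0 per this document)


def ctrl_write(cur: int, wdata: int, sel: int, tip: bool) -> int:
    """Return the CTRL value after a bus write with byte enables `sel`."""
    if tip:
        return cur                         # writes are ignored while busy
    new = cur
    if sel & 0b0001:                       # low byte lane: replace, OR-merge go
        new = (new & ~0x00FF) | (wdata & 0x00FF) | (cur & GO)
    if sel & 0b0010:                       # second lane: bits [13:8] (assumed)
        new = (new & ~0x3F00) | (wdata & 0x3F00)
    return new & CTRL_MASK                 # bits [31:14] are not implemented
```

In this model a low-byte write of 0 leaves an already-set go bit in place, which mirrors the "writing 0 cannot clear it" rule.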

#### -- DIVIDER Register --

DIVIDER Register: 32-bit R/W at Wishbone word offset 0x14 within spi\_top. Bits [15:0] hold the divider value; bits [31:16] read as 0 and ignore writes. Writes are accepted only when the SPI is idle (tip == 0) to prevent mid-transfer rate changes. Byte-lane enables wb\_sel\_i[1:0] apply to the lower two bytes; upper lanes are ignored. Readback returns the zero-extended 16-bit value; accesses participate in the Wishbone handshake (wb\_dat\_o registered, wb\_ack\_o pulses) and any acknowledged access clears wb\_int\_o.

Function: Programs the SPI SCK half-period in units of clk\_in cycles for the clock generator. SCK half-period (cycles) = DIVIDER + 1, so $f_{SCK} = f_{clk\_in} / (2 \times (\text{DIVIDER} + 1))$; DIVIDER = 0 selects the fastest rate $\approx f_{clk\_in}/2$.

Timing/behavior: The clgen counter reloads to DIVIDER when enable == 0 (idle) or cnt == 0; SCK toggles at cnt == 0 when enabled. For DIVIDER > 0, pos\_edge/neg\_edge strobes assert one clk\_in cycle before the SCK toggle (cnt == 1), providing a one-cycle lead; for DIVIDER == 0, strobes are synthesized every clk\_in cycle to maintain per-edge pulses at the highest rate.

Update timing: A new DIVIDER value takes effect cleanly at the next enable/start of transfer; no mid-transfer changes occur.

Reset: The divider register default is unspecified in this context (clgen cnt resets to 0xFFFF).

Example: clk\_in = 100 MHz, DIVIDER = 4 → SCK = 100 MHz / (2 × (4 + 1)) = 10 MHz.
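
The frequency relation can be turned into two small Python helpers, one evaluating the SCK rate for a given DIVIDER and one choosing the smallest DIVIDER that keeps SCK at or below a target. The function names are hypothetical, and the selection policy (never exceed the requested rate) is an assumption about what software typically wants rather than something this document mandates.

```python
def sck_hz(f_clk_hz: int, divider: int) -> float:
    """SCK frequency from f_SCK = f_clk_in / (2 * (DIVIDER + 1))."""
    if not 0 <= divider <= 0xFFFF:
        raise ValueError("DIVIDER is a 16-bit field")
    return f_clk_hz / (2 * (divider + 1))


def divider_for(f_clk_hz: int, f_sck_max_hz: int) -> int:
    """Smallest DIVIDER whose SCK does not exceed f_sck_max_hz (integer Hz inputs)."""
    if f_clk_hz <= 0 or f_sck_max_hz <= 0:
        raise ValueError("frequencies must be positive")
    # ceil(f_clk / (2 * f_max)) - 1, computed with integer arithmetic
    divider = max(-(-f_clk_hz // (2 * f_sck_max_hz)) - 1, 0)
    if divider > 0xFFFF:
        raise ValueError("requested SCK is too slow for a 16-bit divider")
    return divider
```

For the example above, `divider_for(100_000_000, 10_000_000)` returns 4, and `sck_hz(100_000_000, 4)` evaluates to 10 MHz.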

#### -- SS Register --

Chip-Select Mask (SS) register. Address offset: 0x18 on 32-bit Wishbone. Width/Access: 8-bit implemented, read/write; reads return a 32-bit value with bits [31:8]=0. Writes use bits [7:0] only; wb\_dat\_i[31:8] and upper byte lanes are ignored (only wb\_sel\_i[0] is meaningful). Function: Each bit controls one active-low chip-select output ss\_pad\_o[7:0]; 0 asserts (low), 1 deasserts (high). Supports up to 8 chip-selects. Write timing: The SS latch updates only when the core is idle (tip=0). Writes during an active transfer (tip=1) are bus-acknowledged but do not modify the latched SS value, avoiding mid-transfer glitches. Read behavior: Non-destructive; returns the current SS latch, zero-extended. Interaction with CTRL.ASS (auto-slave-select): ASS=0 (manual) — ss\_pad\_o continuously reflects SS (active-low). ASS=1 (auto) — ss\_pad\_o are asserted only while a transfer is active (tip=1); when idle, all ss\_pad\_o are forced high regardless of SS. Usage: Multiple devices can be selected by writing multiple 0s. To keep selection between transfers, use manual mode; in auto mode all SS deassert between transfers. Reset: Not specified in this context.

#### -- RX Data Windows --

RX Data Windows expose the 128-bit receive buffer as four read-only 32-bit words on the Wishbone bus at word-aligned offsets: 0x0 -> bits [31:0], 0x4 -> [63:32], 0x8 -> [95:64], 0xC -> [127:96] (wb\_adr\_i[4:2] selects the lane). Read data is registered: wb\_dat\_o presents the selected 32-bit word with a one-cycle latency. Any acknowledged read of an RX window clears the transfer-done interrupt (wb\_int\_o). The same addresses may be used for TX writes when idle, but reads at these addresses always return RX content.

Coherency is not enforced across all four words; there is no read-freeze. Software should read after the transfer completes (tip=0 or upon interrupt) to avoid partial/tearing observations; reads during an active transfer may mix old and new data. For frame lengths < 128, only the received bit positions are updated; untouched bits retain their prior values (all zeros after reset). Bit placement follows the lsb control: lsb=1 (LSB-first) packs earlier bits into lower indices; lsb=0 (MSB-first) packs toward higher indices within the active length. Reset clears the entire buffer to 0, so all windows read back as 0 until a transfer populates them. The rx\_negedge setting only selects the sampling edge and does not alter address or bit mapping.
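
Reassembling a frame from the four windows is a matter of shifting each word into place. The Python sketch below shows one way to do it; the function name `assemble_rx` is hypothetical, and masking to the low `nbits` positions assumes the active bits of a short frame occupy indices [nbits-1:0], which should be confirmed against the RTL.

```python
def assemble_rx(words: list[int], nbits: int = 128) -> int:
    """Combine the words read from 0x0, 0x4, 0x8, 0xC into one frame value."""
    if len(words) != 4:
        raise ValueError("expected four 32-bit words")
    if not 1 <= nbits <= 128:
        raise ValueError("frame length must be 1..128 bits")
    frame = 0
    for i, word in enumerate(words):
        frame |= (word & 0xFFFFFFFF) << (32 * i)   # word i holds bits [32i+31:32i]
    return frame & ((1 << nbits) - 1)              # drop stale bits above the frame
```

For a 16-bit frame, `assemble_rx([0xDEADBEEF, 0, 0, 0], 16)` keeps only 0xBEEF and discards the upper half of the first word, which may still hold data from an earlier, longer frame.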

#### -- Register Access Rules --

- Address map and widths (32-bit word-aligned):
  - 0x00–0x0C: Data window (4 x 32-bit lanes) for RX readback and TX loading.
  - 0x10: CTRL (effective 14 bits; read zero-extended to 32 bits).
  - 0x14: DIVIDER (16 bits; read zero-extended to 32 bits).
  - 0x18: SS (8-bit mask; read zero-extended to 32 bits).
- Read rules:
  - Reads are always accepted; no busy gating.
  - RX window (0x00/04/08/0C) returns slices of the most recent 128-bit received frame: [31:0], [63:32], [95:64], [127:96].
  - CTRL, DIVIDER, and SS read back current values; unused upper bits are zero.
  - CTRL.go reflects actual busy state (auto-clears when transfer completes; not software-cleared).
- Write rules and busy gating:
  - Effectful writes are honored only when idle (!tip): TX data (0x00–0x0C), CTRL (0x10), DIVIDER (0x14), SS (0x18).
  - While busy (tip=1), writes are acknowledged but have no effect on internal state; no bus error is reported.
- Byte-enable handling (wb\_sel\_i[3:0]):
  - Honored for multi-byte registers and TX lanes.
  - TX data lanes accept per-byte updates when idle; partial writes update only selected bytes of the 128-bit TX buffer.
  - DIVIDER: only wb\_sel\_i[1:0] affect DIVIDER[15:0]; upper lanes are ignored on write.
  - SS: only the low byte is writable; higher bytes are ignored on write.
- CTRL.go special semantics:
  - Writing 1 to bit0 when idle starts a transfer (sets busy/tip).
  - Lower-byte writes use OR semantics for bit0: writing 0 cannot clear it.
  - Hardware auto-clears go/busy at end of transfer; software must not rely on write-0 to clear.
- Side effects:
  - Any acknowledged Wishbone access (read or write at any address) clears wb\_int\_o.
- Error signaling:
  - wb\_err\_o is never asserted for any access; software must obey the idle-only write rule.
- Notes:
  - DIVIDER accepts any 16-bit value, including 0; the clock generator handles the 0 case internally.
  - TX window shares addresses with RX readback; host writes are blocked by busy gating as above.
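The gating and side-effect rules above can be sketched as a toy Python model. This is not cycle-accurate and makes several assumptions: the class and attribute names are hypothetical, registers start at 0 only for convenience (the specification leaves their reset values unspecified), writes are full-word, and the data window and go semantics are left out.

```python
class SpiRegsModel:
    """Toy model of idle-only write gating and interrupt clearing."""
    CTRL, DIVIDER, SS = 0x10, 0x14, 0x18
    WIDTH_MASK = {CTRL: 0x3FFF, DIVIDER: 0xFFFF, SS: 0xFF}

    def __init__(self):
        # Start values are arbitrary here; the spec does not define them.
        self.regs = {self.CTRL: 0, self.DIVIDER: 0, self.SS: 0}
        self.tip = False   # transfer in progress
        self.irq = False   # stands in for wb_int_o

    def write(self, addr: int, data: int) -> bool:
        """Full-word write; always acknowledged, never a bus error."""
        self.irq = False                      # any acknowledged access clears the interrupt
        if not self.tip and addr in self.regs:
            self.regs[addr] = data & self.WIDTH_MASK[addr]
        return True                           # ack even when the write was ignored

    def read(self, addr: int) -> int:
        """Reads are never gated; narrow registers read back zero-extended."""
        self.irq = False
        return self.regs.get(addr, 0)
```

The point of the model is that a write issued while `tip` is set returns normally yet leaves the register unchanged, so software cannot detect a dropped write from the bus response alone.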

#### -- Byte-Enable and Alignment Semantics --

- Address alignment: 32-bit Wishbone data bus with word-aligned accesses; wb\_adr\_i[4:2] selects the 32-bit word, and address bits [1:0] are ignored.
- RX frame mapping: 128-bit RX is exposed as four consecutive 32-bit words: 0x0 -> rx[31:0], 0x4 -> rx[63:32], 0x8 -> rx[95:64], 0xC -> rx[127:96].
- Endianness and lane mapping: wb\_sel\_i[0..3] map to bus byte lanes [7:0], [15:8], [23:16], [31:24] respectively; narrow fields are placed in least-significant bytes.
- Read semantics: wb\_sel\_i is ignored; a full 32-bit word is returned. Narrow registers are zero-extended (CTRL 14-bit in low 16 bits, DIVIDER 16-bit, SS 8-bit).
- Write semantics (byte enables): writes are honored only when the SPI is idle (!tip); while tip=1, host writes do not update contents. Per-byte enables apply as follows:
  - TX buffer (128-bit): address selects the 32-bit quarter; wb\_sel\_i[3:0] select which bytes within that 32-bit word are updated.
  - CTRL (14-bit at 0x10 in low 16 bits): wb\_sel\_i[0] updates bits [7:0]; wb\_sel\_i[1] updates bits [13:8]; wb\_sel\_i[2] and [3] are ignored. Bit0 of the lower byte is write-OR (sticky) and is auto-cleared by hardware at end of transfer.
  - DIVIDER (16-bit at 0x14): wb\_sel\_i[0] updates bits [7:0]; wb\_sel\_i[1] updates bits [15:8]; wb\_sel\_i[2] and [3] are ignored.
  - SS (8-bit at 0x18): wb\_sel\_i[0] updates bits [7:0]; wb\_sel\_i[1..3] are ignored.
- Invalid/unused lanes: writes to non-implemented byte lanes are benign no-ops; no bus error is generated.
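The per-lane update rule can be written as a short Python helper. It is a sketch of the merge behavior only: `merge_bytes` is a hypothetical name, the idle-only gating is assumed to be checked by the caller, and the write-OR behavior of the CTRL start bit is not included.

```python
def merge_bytes(old: int, data: int, sel: int, width: int = 32) -> int:
    """Apply wb_sel_i-style byte enables to a register of `width` bits.

    sel bit n enables bus byte lane [8n+7:8n]; lanes that fall outside the
    implemented width are dropped, which makes such writes no-ops.
    """
    new = old
    for lane in range(4):
        if sel & (1 << lane):
            mask = 0xFF << (8 * lane)
            new = (new & ~mask) | (data & mask)
    return new & ((1 << width) - 1)
```

For example, with `width=16` (DIVIDER) a write using only `sel=0b1100` leaves the register unchanged, while `sel=0b0001` replaces just the low byte.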

#### -- Reset Values and Sticky Bits --

Reset values: At reset, spi\_top deasserts wb\_int\_o and drives wb\_err\_o to 0. Program-visible registers CTRL[13:0], DIVIDER[15:0], and SS[7:0] have unspecified/implementation-defined reset values; firmware must initialize them before starting any transfer. spi\_clgen initializes cnt[15:0] to 0xFFFF, clk\_out (SCK) to 0, and pos\_edge/neg\_edge to 0; the divider must be programmed before use. spi\_shift initializes cnt to 0, tip (transfer-in-progress) to 0, s\_out (MOSI) to 0, and its 128-bit RX/TX buffer to 0, ensuring a clean idle state. Chip-select outputs (ss\_pad\_o) depend on SS, ASS, and activity (tip); because SS/ASS reset values are unspecified, firmware should program them explicitly to avoid unintended assertions.

Sticky and auto-clear behaviors: CTRL.go is sticky-on-write in the lower byte (new\_bit0 = old\_bit0 OR write\_bit0), preventing software from clearing it with lower-byte writes while active; hardware auto-clears it at end of frame when the final bit completes (tip && last\_bit && pos\_edge). The interrupt (wb\_int\_o) sets at the same transfer-complete event when IE=1 and remains asserted (sticky) until any acknowledged Wishbone access occurs; it is cleared on wb\_ack\_o, not by writing a specific bit. Write gating: writes to CTRL, DIVIDER, and SS are accepted only when idle (!tip); while tip=1 these registers cannot be modified, effectively holding their values through the active transfer.

## CORE CONFIGURATION

#### -- Frame Length Configuration --

Frame length is programmed via CTRL (0x10) field  $\text{char\_len}[6:0]$ . Encoding:  $\text{char\_len}=0$  selects a 128-bit frame;  $\text{char\_len}=N$  (1..127) selects an N-bit frame. Writes to CTRL (including  $\text{char\_len}$ ) are accepted only while the SPI master is idle ( $\text{tip}=0$ ); the programmed length is latched into the shifter when a transfer is started (go asserted while idle). On transfer start, the shifter loads an 8-bit down-counter with  $\{0,\text{char\_len}\}$  for nonzero lengths or 0x80 for  $\text{char\_len}=0$ ; it decrements on each  $\text{pos\_edge}$  strobe, and the final bit occurs when the counter reaches 1. At the final bit, hardware ends the transfer ( $\text{tip}$  deasserts), auto-clears the start/busy control bit, and asserts an interrupt if  $\text{IE}=1$ . For frames shorter than 128 bits, only the selected N bit positions are shifted/sampled; bits outside the programmed frame are not modified in the 128-bit RX/TX buffer. Reading CTRL returns the programmed  $\text{char\_len}$ ; a read value of 0 indicates the 128-bit setting. Notes: CTRL is 14-bit;  $\text{char\_len}$  is a 7-bit field forwarded to  $\text{spi\_shift.len}[6:0]$ . The sticky lower-byte write behavior applies only to start/busy, not  $\text{char\_len}$ . Software should poll  $\text{tip}=0$  before writing  $\text{char\_len}$ , then assert go; frame length can be reprogrammed between transfers without side effects.
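The char\_len encoding can be captured in two small Python helpers. The function names are hypothetical; the sketch covers only the 0-means-128 encoding described above and says nothing about where the field sits inside CTRL.

```python
def encode_char_len(nbits: int) -> int:
    """Map a frame length of 1..128 bits to the 7-bit char_len field."""
    if not 1 <= nbits <= 128:
        raise ValueError("frame length must be 1..128 bits")
    return nbits & 0x7F  # 128 wraps to 0, which encodes the full frame

def frame_bits(char_len: int) -> int:
    """Map a 7-bit char_len value back to the number of bits transferred."""
    if not 0 <= char_len <= 0x7F:
        raise ValueError("char_len is a 7-bit field")
    return 128 if char_len == 0 else char_len
```

A driver would typically call `encode_char_len` when building the CTRL value and `frame_bits` when interpreting a CTRL readback.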

#### -- Bit Order Configuration --

Selects the serial bit order for each SPI frame. Field: CTRL.lsb. Values: 0 = MSB-first (most significant bit transmitted and received first), 1 = LSB-first (least significant bit first). Applies to both MOSI serialization and MISO placement inside  $\text{spi\_shift}$  so that the 128-bit parallel RX view reflects the chosen order. Works for any frame length 1–128 bits ( $\text{len}==0$  encodes 128); only the active frame-length bits are shifted/stored, other buffer bits remain unchanged. The shifter preloads the first bit per the selected order to ensure the first SCK edge launches valid data, and transfer completion is gated to stop exactly at the configured length regardless of order. Update rules: CTRL.lsb is writable only while idle ( $\text{tip}=0$ ); the value is latched when a transfer starts (go) and mid-transfer changes have no effect; readback returns the currently programmed value. Independent of clock polarity/phase ( $\text{tx\_negedge}$ ,  $\text{rx\_negedge}$ ) and SCK divider—only the serial bit traversal is affected. Software must pack TX data to align the intended MSB/LSB of the frame with the selected order; RX readback will match that logical ordering. Reset default for lsb is not specified here (see top-level reset defaults).
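As a worked illustration of the two orders, the Python sketch below lists the bits of a frame in wire order. It assumes the active frame occupies bit positions [nbits-1:0] of the value, which this section does not state explicitly; `serial_order` is a hypothetical helper, not part of the core.

```python
def serial_order(value: int, nbits: int, lsb_first: bool):
    """List the bits of an nbits-wide frame in the order they appear on the wire.

    Assumption: the frame is held in bits [nbits-1:0] of `value`.
    """
    if not 1 <= nbits <= 128:
        raise ValueError("frame length must be 1..128 bits")
    positions = range(nbits) if lsb_first else range(nbits - 1, -1, -1)
    return [(value >> p) & 1 for p in positions]
```

With a 4-bit frame holding `0b1101`, MSB-first sends 1, 1, 0, 1 and LSB-first sends 1, 0, 1, 1.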

#### -- Launch/Capture Edge Selection --

TX/RX edge selection is programmable per transfer via CTRL.tx\_negedge and CTRL.rx\_negedge (writable only when idle, !tip). A value of 0 selects the SCK rising edge; 1 selects the SCK falling edge. Edge strobes pos\_edge and neg\_edge originate in the clock generator: for divider > 0 they assert one clk\_in cycle before the actual SCK toggle to provide setup time; for divider == 0 strobes are synthesized every cycle, with go creating the initial start strobe at maximum rate. The shifter builds tx\_clk from the selected strobe (pos\_edge or neg\_edge) to launch MOSI; the first transmit bit is preloaded while idle so the first selected SCK edge sees valid data. tx\_clk is gated by the last-bit indicator to prevent any MOSI update after the final bit. Similarly, rx\_clk is built from the selected strobe and aligned to SCK so MISO is sampled exactly at the chosen edge, and it is gated by the last-bit indicator to avoid oversampling. Bit accounting is independent of selection: the counter decrements on pos\_edge while tip=1; tip deasserts on last && pos\_edge and the interrupt uses (ie && tip && last\_bit && pos\_edge). Independent TX/RX edge selection enables CPHA-like behavior (launch on one edge, capture on the opposite); CPOL is not explicitly controlled, but clock-generator last\_clk gating can shape leading/trailing SCK edges across transfers for mode compliance.

#### -- SCK Rate Configuration --

- Purpose: Selects the SPI SCK rate by programming the SCK half-period in units of the internal clk\_in cycles.
- Register: DIVIDER[15:0] at offset 0x14. Writable only when the core is idle (no transfer in progress); writes during a transfer are ignored. Byte enables supported on wb\_sel\_i[1:0] (lower two bytes). Readback is zero-extended to 32 bits.
- Frequency relation: For DIVIDER  $\geq 1$ , SCK half-period =  $(\text{DIVIDER} + 1) \times \text{clk\_in}$  cycles and  $f_{\text{sck}} = f_{\text{clk\_in}} / (2 \times (\text{DIVIDER} + 1))$ . For DIVIDER = 0 (maximum rate), SCK toggles every clk\_in cycle when enabled, so  $f_{\text{sck}} \approx f_{\text{clk\_in}}/2$ .
- Activity gating: SCK is generated only during an active transfer; between transfers the clock holds a stable idle level and does not free-run. On the first enable after idle, the counter reloads from DIVIDER to ensure a deterministic start.
- Timing strobes (internal): For DIVIDER  $> 0$ , pos\_edge/neg\_edge assert one clk\_in cycle before the corresponding SCK edge; for DIVIDER = 0, equivalent strobes are produced every cycle to preserve correct TX/RX timing at the fastest rate.
- End-of-transfer shaping: The final SCK transition can be gated so the last observed edge is well-defined; SCK then holds the idle level between frames.
- Range and granularity: DIVIDER is 16-bit (0–65535). Integer half-period control only; SCK rate is constant for the duration of a frame and should not be changed mid-transfer (writes while active are not taken). Reset initializes SCK low and the counter to 0xFFFF; the first enable reloads from DIVIDER.
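The frequency relation above can be turned into a small Python calculation. The helper names are hypothetical, clock rates are taken as integer Hz, and the DIVIDER = 0 case is treated with the same formula, which matches the approximate clk\_in/2 figure given above.

```python
def sck_hz(clk_in_hz: int, divider: int) -> float:
    """SCK frequency for a 16-bit DIVIDER value: clk_in / (2 * (DIVIDER + 1))."""
    if not 0 <= divider <= 0xFFFF:
        raise ValueError("DIVIDER is a 16-bit field")
    return clk_in_hz / (2 * (divider + 1))

def divider_for(clk_in_hz: int, max_sck_hz: int) -> int:
    """Smallest DIVIDER whose SCK does not exceed max_sck_hz."""
    if clk_in_hz <= 0 or max_sck_hz <= 0:
        raise ValueError("frequencies must be positive")
    ratio = -(-clk_in_hz // (2 * max_sck_hz))  # ceiling division on integers
    divider = max(ratio - 1, 0)
    if divider > 0xFFFF:
        raise ValueError("target rate is too slow for a 16-bit divider")
    return divider
```

For a 100 MHz clk\_in and a device limited to 12 MHz, `divider_for` returns 4, which gives a 10 MHz SCK; a DIVIDER of 3 would give 12.5 MHz and exceed the limit.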

#### -- Chip-Select Mode and Width --

Eight active-low chip-select outputs (ss\_pad\_o[7:0]) controlled by the 8-bit SS register and the ASS bit in CTRL. Width: 8 CS lines, each asserted with SS bit=0 and deasserted with SS bit=1; multiple CS lines may be asserted simultaneously. Modes: Manual (ASS=0) — ss\_pad\_o mirrors SS continuously. Auto (ASS=1) — ss\_pad\_o mirrors SS only while a transfer is active (tip=1); when idle (tip=0), all CS lines are forced deasserted (all 1s). Access: SS is writable only when idle (tip=0) to prevent glitches; SS address is 0x18 (readback zero-extended). Recommendation: Leave unused CS lines deasserted by setting their SS bits to 1.
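The two modes reduce to a simple selection, sketched here in Python. This follows the polarity stated in this section (an SS bit of 0 asserts the line); `ss_pad_o` as a function name is only an illustration of the output behavior, not the RTL.

```python
def ss_pad_o(ss: int, ass: bool, tip: bool) -> int:
    """Chip-select pad levels (active low) for a given SS value, ASS bit, and tip."""
    ss &= 0xFF
    if ass and not tip:
        return 0xFF  # auto mode while idle: every line deasserted
    return ss        # manual mode, or auto mode during a transfer
```

With SS = 0xFE, manual mode holds line 0 low at all times, while auto mode drives it low only for the duration of each transfer.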

#### -- Maximum Frame Size and Buffering --

Maximum frame size per transfer is 128 bits. CTRL.char\_len is 7-bit: values 1–127 specify that exact number of bits; 0 encodes a full 128-bit frame. An internal 8-bit down-counter supports lengths up to 128 bits. Buffering is single-frame, single-deep: a 128-bit register in spi\_shift both preloads the TX payload and captures RX bits back into the same buffer; reset clears it. The buffer is exposed on the Wishbone bus as four 32-bit lanes, with RX readback at 0x0, 0x4, 0x8, and 0xC returning the most recent 128-bit frame (registered, one-cycle latency). TX data is written into 32-bit quarters with per-byte enables; all writes to CTRL/DIVIDER/SS and the TX buffer are allowed only while idle (tip=0) and are blocked during a transfer (tip=1). There is no FIFO: software must segment payloads >128 bits into multiple frames using load → start → wait for interrupt/end-of-frame → read. Bit-order (lsb) and edge controls (tx\_negedge/rx\_negedge) affect serialization and bit placement but not capacity.
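Segmenting a longer payload is a matter of splitting its length into frames of at most 128 bits, as in this Python sketch. `segment_frames` is a hypothetical helper; it computes only the per-frame bit counts and leaves data packing and chip-select handling to the caller.

```python
def segment_frames(total_bits: int, max_bits: int = 128):
    """Split a payload length into per-frame bit counts, largest frames first."""
    if total_bits < 1:
        raise ValueError("payload must contain at least one bit")
    frames = [max_bits] * (total_bits // max_bits)
    if total_bits % max_bits:
        frames.append(total_bits % max_bits)
    return frames
```

A 300-bit payload becomes frames of 128, 128, and 44 bits. If the target device requires chip-select to stay asserted across the whole payload, manual SS mode is the relevant choice, since auto mode deasserts between frames.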

#### -- Timing Strobes Usage --

Timing strobes are single-cycle pulses (pos\_edge, neg\_edge) generated by spi\_clgen in the clk\_in domain to announce upcoming SCK (clk\_out) edges. For divider > 0, each strobe occurs one clk\_in cycle before the corresponding SCK toggle, providing a one-cycle lead for launch/sample preparation. For divider == 0, strobes are synthesized every clk\_in cycle (alternating with clk\_out) and a go-derived kick produces the initial pos\_edge when starting at the fastest rate. Strobes are produced only while enable is asserted (during an active transfer); when disabled, the counter reloads and SCK holds. spi\_shift derives tx\_clk and rx\_clk by selecting pos\_edge or neg\_edge via tx\_negedge and rx\_negedge and gates these clocks with the last signal to suppress launches/samples after the final bit. The bit counter decrements on pos\_edge; last\_bit is detected from this countdown. Transfer termination keys off pos\_edge: tip deasserts on (last\_bit && pos\_edge), and spi\_top raises wb\_int\_o and auto-clears the start/busy CTRL bit on (ie && tip && last\_bit && pos\_edge). last\_clk gating in spi\_clgen can suppress the next rising SCK toggle to end the frame cleanly (e.g., on a falling edge), while strobes are still issued so that final launch/sample and termination proceed even without a physical SCK transition. All strobe generation, countdown, and termination decisions are synchronous to clk\_in; external SPI pins follow on the subsequent hardware toggle. Divider updates are blocked while tip = 1 to keep strobe cadence stable. In ASS mode, SS asserts with tip so the first strobe occurs while SS is active. Frames always count on pos\_edge regardless of selected launch/sample edges; the design does not provide termination on neg\_edge.

#### -- Constraints and Safe Programming Rules --

- Bus access discipline:
  - All registers are word-aligned; use wb\_sel\_i for byte/halfword writes where supported.
  - wb\_dat\_o is registered; expect one-cycle read latency and a single wb\_ack\_o pulse per access.
  - wb\_err\_o is always 0; do not rely on bus error signaling for protection.
  - Any acknowledged Wishbone access clears the interrupt; avoid unrelated bus transactions while an interrupt is pending to prevent spurious clears.
- Write gating (idle-only updates):
  - CTRL[13:0], DIVIDER[15:0], SS[7:0], and TX data lanes are writable only when tip=0 (shifter idle). Always read busy (CTRL or wrapper STATUS) and wait for tip=0 before writing.
  - CTRL lower byte write OR-merges with existing bit0 (start/busy); software cannot clear start while hardware is active. Hardware auto-clears start at end-of-frame.
  - DIVIDER honors wb\_sel\_i[1:0]; upper bytes are ignored on write and read as zero.
- Transfer framing and start/stop:
  - Program DIVIDER, CTRL fields (char\_len, lsb, tx\_negedge, rx\_negedge, ie, ass), SS, and preload TX while tip=0, then assert go.
  - char\_len is 7 bits: 0 encodes 128 bits; 1..127 encode that many bits. Do not program values >127.
  - Do not attempt to retrigger start mid-frame; hardware clears start/busy at end-of-frame.
- Data coherency:
  - Block parallel TX writes while tip=1; preload only when idle.
  - RX updates during shifting; for a coherent 128-bit read (four words at 0x0/0x4/0x8/0xC), read only when tip=0 (e.g., post-interrupt).
- Interrupt handling:
  - Transfer-done interrupt asserts at the final bit when IE=1.
  - ISR must perform at least one acknowledged Wishbone access to clear the interrupt; minimize unrelated accesses while the interrupt is pending.
- Chip-select behavior and safety:
  - SS bits are active-low.
  - Manual mode (ass=0): ss\_pad\_o follows SS continuously; do not modify SS mid-transfer (writes are blocked by tip) to avoid glitches.
  - Auto mode (ass=1): ss\_pad\_o asserts only while tip=1 and deasserts otherwise. If SS must remain asserted across frames, use manual mode and sequence accordingly.
- Clocking and edge timing (spi\_clgen):
  - divider is 16-bit; divider=0 yields maximum SCK ≈ clk\_in/2. Ensure external devices and timing closure can tolerate this rate.
  - For divider>0, pos\_edge/neg\_edge strobes lead the SCK toggle by one clk\_in cycle; do not change edge-select fields after go.
  - last\_clk gating can suppress leading/trailing edges; do not change related mode fields during a transfer.
- Edge selection and bit order (spi\_shift):
  - Choose tx\_negedge/rx\_negedge to match the target SPI mode; do not modify these fields while tip=1.
  - lsb selects LSB-first serialization; ensure both ends agree on bit order.
- Reset and initialization:
  - rst is asynchronous, active-high. After reset, reprogram DIVIDER, CTRL, SS, and preload TX before starting.
  - When enable=0 (idle), the clock generator counter reloads to divider for deterministic restart; do not rely on previous counter values across resets or disables.
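The rules above suggest a polled programming sequence, sketched below in Python. Everything outside the register offsets is an assumption: `bus` is a hypothetical object with `read(addr)` and `write(addr, data)` methods, `ctrl_cfg` is a caller-built CTRL value, and `go_mask` is the caller-supplied position of the start/busy bit, since this sketch does not assume a CTRL bit layout.

```python
CTRL, DIVIDER, SS = 0x10, 0x14, 0x18  # register offsets from the address map

def run_frame(bus, *, divider, ss_mask, ctrl_cfg, go_mask, tx_words,
              max_polls=100_000):
    """Run one polled transfer: wait idle, program, start, wait, read back.

    `bus`, `ctrl_cfg`, and `go_mask` are caller-supplied assumptions; this
    function only fixes the order of accesses.
    """
    def wait_idle():
        for _ in range(max_polls):
            if not bus.read(CTRL) & go_mask:
                return
        raise TimeoutError("SPI core stayed busy")

    wait_idle()                                 # writes are ignored while busy
    bus.write(DIVIDER, divider & 0xFFFF)
    bus.write(SS, ss_mask & 0xFF)
    for lane, word in enumerate(tx_words[:4]):  # preload up to four TX words
        bus.write(4 * lane, word & 0xFFFF_FFFF)
    bus.write(CTRL, ctrl_cfg & ~go_mask)        # configuration first, start bit clear
    bus.write(CTRL, ctrl_cfg | go_mask)         # then start the transfer
    wait_idle()                                 # hardware clears the start bit at end of frame
    return [bus.read(4 * lane) for lane in range(4)]
```

Because every acknowledged access clears the interrupt, the polling reads in this sketch would also clear a pending transfer-done interrupt; an interrupt-driven driver would replace the second `wait_idle` with a wait on the interrupt line.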