

# Single-Core Processor



## CPU

The active part of the computer that does all the work.

- Data manipulation.
- Decision making.

## Datapath

The portion of the processor that contains the hardware necessary to perform the operations required by the processor (the body).

## Control

The portion of the processor that tells the datapath what needs to be done (the brain).

# One-Instruction-Per-Cycle RISC-V Machine

## Stages of the Datapath

Break up the process of “executing an instruction” into stages, and then connect the stages to create the whole datapath.

- Stage 1: *Instruction Fetch (IF)*
- Stage 2: *Instruction Decode (ID)*
- Stage 3: *Execute (EX) - ALU (Arithmetic-Logic Unit)*
- Stage 4: *Memory Access (MEM)*
- Stage 5: *Write Back to Register (WB)*

All five stages are executed within a single clock cycle, which is why this design is called a one-instruction-per-cycle machine.



The register write takes effect at the next rising edge of the clock.

## Datapath components

### COMBINATIONAL

- Combinational elements



### STATE AND SEQUENCING

## Register

- Register
- Write Enable:
  - Low (or deasserted) (0): Data Out will not change
  - Asserted (1): Data Out will become Data In on positive edge of clock
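The write-enable behavior above can be sketched in a few lines of Python. This is only an illustrative model, not a hardware description: the class name `Register` and the method `rising_edge` are hypothetical names chosen for this sketch.

```python
class Register:
    """Illustrative model of an edge-triggered register with Write Enable."""

    def __init__(self, width=32):
        self.mask = (1 << width) - 1
        self.data_out = 0  # value currently visible on Data Out

    def rising_edge(self, data_in, write_enable):
        # Data Out can only change on a positive clock edge,
        # and only when Write Enable is asserted (1).
        if write_enable:
            self.data_out = data_in & self.mask
        return self.data_out


reg = Register()
reg.rising_edge(data_in=42, write_enable=0)
held = reg.data_out      # Write Enable was 0, so Data Out kept its old value
reg.rising_edge(data_in=42, write_enable=1)
captured = reg.data_out  # Write Enable was 1, so Data Out took Data In
```

Calling `rising_edge` stands in for one positive clock edge; between calls, `data_out` never changes.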



## Register File

- Register file (regfile, RF) consists of 32 registers:
  - Two 32-bit output busses: busA and busB
  - One 32-bit input bus: busW
- Register is selected by:
  - RA (number) selects the register to put on busA (data)
  - RB (number) selects the register to put on busB (data)
  - RW (number) selects the register to be written via busW (data) when Write Enable is 1
- Clock input (Clk)
  - Clk input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    - RA or RB valid  $\Rightarrow$  busA or busB valid after “access time.”
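A minimal Python sketch of the register file described above, assuming the names `RegFile`, `read`, and `rising_edge` (all hypothetical). Reads take no clock argument because they behave combinationally; writes happen only in the method that models a clock edge. The sketch also keeps register 0 at zero, which comes from the RISC-V ISA (x0 is hardwired to zero) rather than from these notes.

```python
class RegFile:
    """Illustrative model of a 32 x 32-bit register file."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Combinational read: RA and RB select what appears on busA and busB.
        return self.regs[ra], self.regs[rb]

    def rising_edge(self, rw, bus_w, write_enable):
        # Clocked write: RW selects the register written from busW,
        # but only when Write Enable is 1. x0 stays zero (RISC-V rule).
        if write_enable and rw != 0:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegFile()
rf.rising_edge(rw=5, bus_w=123, write_enable=1)
rf.rising_edge(rw=6, bus_w=999, write_enable=0)  # ignored: Write Enable is 0
bus_a, bus_b = rf.read(ra=5, rb=6)
```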



## Memory

- “Magic” Memory
  - One input bus: Data In
  - One output bus: Data Out
- Memory word is found by:
  - For Read: Address selects the word to put on Data Out
  - For Write: Set Write Enable = 1: address selects the memory word to be written via the Data In bus
- Clock input (CLK)
  - CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block: Address valid  $\Rightarrow$  Data Out valid after “access time”



A **read** operation does **not** need to wait for the clock: supply the register number or memory address, and the data appears on the output after the access time.

A **write** operation takes effect only at the rising edge of the clock.

Each instruction, during execution, reads and updates the state of:

1. Registers
2. PC
3. Memory

## Datapath

### R-Format Datapath



## 2. PC +=4

### DATAPATH FOR ADD



## DATAPATH FOR SUB/ADD

- `sub` is almost the same as `add`, except that now the operands must be subtracted.
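As a rough sketch of what the ALU does for these two instructions, the function below adds or subtracts and wraps the result to 32 bits. The function name `alu` and the control input `subtract` are hypothetical names for this sketch; in the RV32I encoding, `add` and `sub` differ only in instruction bit 30, which is one natural source for such a control signal.

```python
MASK32 = 0xFFFFFFFF


def alu(a, b, subtract):
    """Return a + b or a - b, wrapped to 32 bits like the hardware result."""
    if subtract:
        return (a - b) & MASK32
    return (a + b) & MASK32


sum_result = alu(5, 3, subtract=False)
diff_result = alu(3, 5, subtract=True)  # negative result in two's complement
```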



## I-Format Datapath

### Addi

# Datapath for add/sub



Berkeley  
UNIVERSITY OF CALIFORNIA

Garcia, Nikolic

RISC-V (30)

To support `addi`, add a mux at the ALU's second input so that it can take either Reg[rs2] or the immediate.

## Adding addi to Datapath




$$PC = PC + 4$$

$$Reg[rd] = Reg[rs1] + Imm$$
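The two state updates for `addi` can be sketched as follows. This is a simplified model under stated assumptions: `regs` is a plain Python list standing in for the register file, `sign_extend` and `exec_addi` are hypothetical helper names, and the write to register 0 is suppressed because x0 is hardwired to zero in RISC-V.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)


def exec_addi(regs, pc, rd, rs1, imm12):
    """Apply Reg[rd] = Reg[rs1] + Imm and return the next PC (PC + 4)."""
    imm = sign_extend(imm12, 12)
    if rd != 0:  # x0 always reads as zero
        regs[rd] = (regs[rs1] + imm) & 0xFFFFFFFF
    return pc + 4


regs = [0] * 32
regs[2] = 10
next_pc = exec_addi(regs, pc=0, rd=1, rs1=2, imm12=0xFFF)  # addi x1, x2, -1
```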








**load**

- RISC-V Assembly Instruction (I-type): `lw x14, 8(x2)`

| Bits 31:20 | Bits 19:15 | Bits 14:12 | Bits 11:7 | Bits 6:0 |
|------------|------------|------------|-----------|----------|
| imm[11:0] | rs1 | funct3 | rd | opcode |
| 12 bits: offset[11:0] | 5 bits: base | 3 bits: width | 5 bits: dest | 7 bits: LOAD |
| 000000001000 | 00010 | 010 | 01110 | 0000011 |
| imm=+8 | rs1=2 | lw | rd=14 | LOAD |

- The 12-bit signed immediate is added to the base address in register **rs1** to form the **memory address**
  - This is very similar to the add-immediate operation, but the sum is used as an address rather than as the final result
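To make the field layout concrete, the sketch below slices the 32-bit encoding of `lw x14, 8(x2)` into its I-type fields and forms the memory address. The helper name `decode_i_type` and the example base value `0x1000` for x2 are assumptions made only for illustration.

```python
def decode_i_type(inst):
    """Split a 32-bit I-type instruction word into its fields."""
    imm = (inst >> 20) & 0xFFF
    if imm & 0x800:          # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {
        "opcode": inst & 0x7F,
        "rd": (inst >> 7) & 0x1F,
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,
        "imm": imm,
    }


# Bit pattern for lw x14, 8(x2): imm | rs1 | funct3 | rd | opcode
inst = int("000000001000" "00010" "010" "01110" "0000011", 2)
fields = decode_i_type(inst)

base = 0x1000                    # hypothetical value held in x2
address = base + fields["imm"]   # address = Reg[rs1] + imm
```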

## R+I Arithmetic/Logic Datapath





| imm | rs1 | funct3 | rd | opcode | Instruction |
|-----------|-----|-----|----|---------|-----|
| imm[11:0] | rs1 | 000 | rd | 0000011 | lb  |
| imm[11:0] | rs1 | 001 | rd | 0000011 | lh  |
| imm[11:0] | rs1 | 010 | rd | 0000011 | lw  |
| imm[11:0] | rs1 | 100 | rd | 0000011 | lbu |
| imm[11:0] | rs1 | 101 | rd | 0000011 | lhu |

The funct3 field encodes the size and 'signedness' of the load data.

- Supporting the narrower loads requires additional logic to extract the correct byte/halfword from the value loaded from memory, and sign- or zero-extend the result to 32 bits before writing back to register file.
  - It is just a mux + a few gates
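The extraction and extension step might look like the following sketch. It assumes the memory returns a full 32-bit word, that `byte_offset` holds the low two bits of the address, that halfword accesses are aligned (offset 0 or 2), and little-endian byte order as in RISC-V. The name `load_extend` is hypothetical.

```python
def load_extend(word, byte_offset, funct3):
    """Select a byte/halfword/word from `word` and extend it to 32 bits."""
    size = funct3 & 0b011        # 0 = byte, 1 = halfword, 2 = word
    is_unsigned = funct3 & 0b100
    bits = 8 << size             # 8, 16, or 32
    # The "mux": shift the selected bytes down and mask off the rest.
    value = (word >> (8 * byte_offset)) & ((1 << bits) - 1)
    # The "few gates": replicate the sign bit unless the load is unsigned.
    if not is_unsigned and value & (1 << (bits - 1)):
        value -= 1 << bits
    return value & 0xFFFFFFFF


word = 0x1234F680
lb_result = load_extend(word, 0, 0b000)   # byte 0x80, sign-extended
lbu_result = load_extend(word, 0, 0b100)  # byte 0x80, zero-extended
lh_result = load_extend(word, 0, 0b001)   # halfword 0xF680, sign-extended
```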

## S-Format

### ADDING SW INSTRUCTION

- **sw**: Reads two registers, rs1 for the base memory address and rs2 for the data to be stored, as well as an immediate offset.

`sw x14, 8(x2)`



## Adding sw to Datapath






## I+S Immediate Generation



- Only a 5-bit mux is needed to select between the two positions where the low five bits of the immediate can reside in the instruction
- The other immediate bits are wired to fixed positions in the instruction
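A sketch of this immediate generator, assuming a hypothetical control input `is_store` that distinguishes S-format from I-format. The upper seven immediate bits come from instruction bits 31:25 in both formats; only the low five bits need the mux.

```python
def imm_gen_i_s(inst, is_store):
    """Build the sign-extended 12-bit immediate for I- or S-format."""
    upper = (inst >> 25) & 0x7F   # inst[31:25] -> imm[11:5] in both formats
    low_i = (inst >> 20) & 0x1F   # inst[24:20] -> imm[4:0] for I-format
    low_s = (inst >> 7) & 0x1F    # inst[11:7]  -> imm[4:0] for S-format
    low = low_s if is_store else low_i  # the 5-bit mux
    imm = (upper << 5) | low
    if imm & 0x800:               # sign-extend from bit 11
        imm -= 0x1000
    return imm


lw_imm = imm_gen_i_s(0x00812703, is_store=False)  # lw x14, 8(x2)
sw_imm = imm_gen_i_s(0x00E12423, is_store=True)   # sw x14, 8(x2)
```

Both example instructions carry the same offset of 8, but the low five bits sit in different instruction fields, which is exactly what the mux selects between.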

## B-Format



- B-format is mostly the same as S-format, with two register sources (**rs1/rs2**) and a 12-bit immediate **imm[12:1]**
- But now the immediate represents values from -4096 to +4094 in 2-byte increments
- The 12 immediate bits encode even 13-bit signed byte offsets (the lowest bit of the offset is always zero, so there is no need to store it)

- Need to compute **PC+IMMEDIATE** and to compare values of **rs1** and **rs2**
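The sketch below shows how a branch byte offset maps onto the B-format immediate bits and back, following the RISC-V bit placement (imm[12] in bit 31, imm[10:5] in bits 30:25, imm[4:1] in bits 11:8, imm[11] in bit 7). The function names are hypothetical, and only the immediate bits are handled, not the register, funct3, or opcode fields.

```python
def encode_b_imm(offset):
    """Place an even byte offset into the B-format immediate bit positions."""
    assert offset % 2 == 0 and -4096 <= offset <= 4094
    imm = offset & 0x1FFF                 # 13-bit two's complement
    return ((((imm >> 12) & 0x1) << 31) |
            (((imm >> 5) & 0x3F) << 25) |
            (((imm >> 1) & 0xF) << 8) |
            (((imm >> 11) & 0x1) << 7))


def decode_b_imm(inst):
    """Recover the signed byte offset from a B-format instruction word."""
    imm = ((((inst >> 31) & 0x1) << 12) |
           (((inst >> 7) & 0x1) << 11) |
           (((inst >> 25) & 0x3F) << 5) |
           (((inst >> 8) & 0xF) << 1))    # bit 0 is implicitly zero
    if imm & 0x1000:                      # sign-extend from bit 12
        imm -= 0x2000
    return imm


pc = 0x100
offset = decode_b_imm(0xFE000EE3)  # beq x0, x0, -4
target = pc + offset               # PC + immediate when the branch is taken
```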




# Branch Comparator



**BrEq** = 1, if $A = B$

**BrLT** = 1, if $A < B$

**BrUn** = 1 selects unsigned comparison for **BrLT**; 0 = signed

**BGE** branch: $A \geq B$ holds if $\overline{A < B}$, where $\overline{A < B} = !(A < B)$
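A small Python model of the comparator, assuming 32-bit operands. The function names `branch_comp` and `bge_taken` are hypothetical; the outputs mirror the BrEq and BrLT signals, and BGE is derived by negating BrLT as described above.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Reinterpret a 32-bit pattern as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def branch_comp(a, b, br_un):
    """Return (BrEq, BrLT) for 32-bit inputs A and B."""
    a &= MASK32
    b &= MASK32
    br_eq = a == b
    if br_un:
        br_lt = a < b                            # unsigned comparison
    else:
        br_lt = to_signed32(a) < to_signed32(b)  # signed comparison
    return br_eq, br_lt


def bge_taken(a, b, br_un=False):
    """BGE is taken when A >= B, i.e. when BrLT is 0."""
    _, br_lt = branch_comp(a, b, br_un)
    return not br_lt


signed_case = branch_comp(0xFFFFFFFF, 1, br_un=False)   # -1 vs 1
unsigned_case = branch_comp(0xFFFFFFFF, 1, br_un=True)  # 4294967295 vs 1
```

The same bit pattern compares as less-than in the signed case and as greater-than in the unsigned case, which is why the BrUn input is needed.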


12-bit immediate encodes PC-relative offset of -4096 to +4094 bytes in multiples of 2 bytes

Standard approach: Treat immediate as in range -2048..+2047, then shift left by 1 bit to multiply by 2 for branches



With this approach, each instruction immediate bit can appear in one of two places in the output immediate value, so one 2-way mux is needed per bit.

# Lighting Up Branch Path

