

# CS 110

## Computer Architecture

### Lecture 10:

### *Datapath*

Instructors:

Sören Schwertfeger & Chundong Wang

<https://robotics.shanghaitech.edu.cn/courses/ca/20s/>

School of Information Science and Technology SIST

ShanghaiTech University

Slides based on UC Berkeley's CS61C



# Admin

- Project 1.1 due very soon!
- Start early with project 1.2...
- Be careful not to publicly post your HW or project code on gitlab!
- Do not make merge requests to the framework...

# Review

- Timing constraints for Finite State Machines
  - Setup time, Hold Time, Clock to Q time
- Use muxes to select among inputs
  - S control bits select among $2^S$ inputs
  - Each input can be n-bits wide, independent of S
  - Can implement muxes hierarchically
- ALU can be implemented using a mux
  - Coupled with basic block elements
  - Adder/Subtractor, AND, OR, shift

# Components of a Computer



# The CPU

- Processor (CPU): the active part of the computer that does all the work (data manipulation and decision-making)
- Datapath: portion of the processor that contains hardware necessary to perform operations required by the processor
- Control: portion of the processor (also in hardware) that tells the datapath what needs to be done

# One-Instruction-Per-Cycle RISC-V Machine



- One clock tick => one instruction
- Current state outputs => inputs to combinational logic => outputs settle at the next-state values before the next clock edge
- Rising clock edge:
  - all state elements are updated with combinational logic outputs
  - execution moves to next clock cycle

What is special about Instruction Memory?

# Datapath and Control

- Datapath designed to support data transfers required by instructions
- Controller causes correct transfers to happen



# Stages of the Datapath : Overview

- Problem: a single, “monolithic” block that “executes an instruction” (performs all necessary operations beginning with fetching the instruction) would be too bulky and inefficient
- Solution: break up the process of “executing an instruction” into stages, and then connect the stages to create the whole datapath
  - smaller stages are easier to design
  - easy to optimize (change) one stage without touching the others (modularity)

# Five Stages of Instruction Execution

- Stage 1: Instruction Fetch (IF)
- Stage 2: Instruction Decode (ID)
- Stage 3: Execute (EX): ALU (Arithmetic-Logic Unit)
- Stage 4: Memory Access (MEM)
- Stage 5: Register Write (WB)

# Stages of Execution on Datapath



# Stages of Execution (1/5)

- There is a wide variety of RISC-V instructions: so what general steps do they have in common?
- Stage 1: Instruction Fetch
  - no matter what the instruction, the 32-bit instruction word must first be fetched from memory (the cache-memory hierarchy)
  - also, this is where we Increment PC (that is,  $PC = PC + 4$ , to point to the next instruction: byte addressing so + 4)

# Stages of Execution (2/5)

- Stage 2: Instruction Decode
  - upon fetching the instruction, we next gather data from the fields (decode all necessary instruction data)
  - first, read the opcode to determine instruction type and field lengths
  - second, (at the same time!) read in data from all necessary registers
    - for add, read two registers
    - for addi, read one register
  - third, generate the immediates
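The field-gathering step can be sketched in a few lines of Python. This is only an illustrative software model, not the hardware itself; the function name `decode_fields` and the dictionary layout are assumptions made for this example. The bit positions are the standard RV32I ones.

```python
def decode_fields(inst: int) -> dict:
    """Slice a 32-bit RISC-V instruction word into its fixed-position fields."""
    return {
        "opcode": inst & 0x7F,          # inst[6:0]
        "rd": (inst >> 7) & 0x1F,       # inst[11:7]
        "funct3": (inst >> 12) & 0x7,   # inst[14:12]
        "rs1": (inst >> 15) & 0x1F,     # inst[19:15]
        "rs2": (inst >> 20) & 0x1F,     # inst[24:20]
        "funct7": (inst >> 25) & 0x7F,  # inst[31:25]
    }


# 0x002081B3 is the encoding of "add x3, x1, x2"
fields = decode_fields(0x002081B3)
```

Because the register numbers always sit in the same bit positions, the register file can be read at the same time as the opcode is examined, which is the "at the same time!" remark above.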

# Stages of Execution (3/5)

- Stage 3: ALU (Arithmetic-Logic Unit)
  - the real work of most instructions is done here:  
arithmetic (+, -, \*, /), shifting, logic (&, |)
  - what about loads and stores?
    - `lw t0, 40(t1)`
    - the address we are accessing in memory = the value in `t1` PLUS the value 40
    - so we do this addition in this stage
  - also does stuff for other instructions...

# Stages of Execution (4/5)

- Stage 4: Memory Access
  - actually only the load and store instructions do anything during this stage; the others remain idle during this stage or skip it altogether
  - since these instructions have a unique step, we need this extra stage to account for them
  - as a result of the cache system, this stage is expected to be fast

# Stages of Execution (5/5)

- Stage 5: Register Write
  - most instructions write the result of some computation into a register
  - examples: arithmetic, logical, shifts, loads, jumps
  - what about stores, branches?
    - don't write anything into a register at the end
    - these remain idle during this fifth stage or skip it altogether

# Stages of Execution on Datapath



# Datapath Components: Combinational

- Combinational Elements



**Adder**



**Multiplexer**



**ALU**

- Storage Elements + Clocking Methodology
- Building Blocks

# Datapath Elements: State and Sequencing (1/3)

- Register



- Write Enable:
  - Negated (or deasserted) (0): Data Out will not change
  - Asserted (1): Data Out will become Data In on positive edge of clock

# Datapath Elements: State and Sequencing (2/3)

- Register file (regfile, RF) consists of 32 registers
  - Two 32-bit output busses: busA and busB
  - One 32-bit input bus: busW
  - In one clock cycle can read two registers and write another!



- Register is selected by:
  - RA (number) selects the register to put on busA (data)
  - RB (number) selects the register to put on busB (data)
  - RW (number) selects the register to be written via busW (data) when Write Enable is 1
- Clock input (clk)
  - Clk input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block:
    - RA or RB valid  $\Rightarrow$  busA or busB valid after “access time.”
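As a rough behavioral sketch of the register file described above (a toy Python model; the class name `RegFile` and the method `clock_edge` are hypothetical names chosen for this example), reads are modeled as plain lookups and writes only take effect when the clock-edge method is called with Write Enable set:

```python
class RegFile:
    """Toy model of a 32 x 32-bit register file: two read ports, one write port."""

    def __init__(self):
        self.regs = [0] * 32

    def read(self, ra, rb):
        # Reads behave like combinational logic: no clock involved.
        return self.regs[ra], self.regs[rb]

    def clock_edge(self, rw, bus_w, write_enable):
        # Writes happen only on the clock edge, and only when Write Enable is 1.
        # Register x0 stays 0, as RV32I requires.
        if write_enable and rw != 0:
            self.regs[rw] = bus_w & 0xFFFFFFFF


rf = RegFile()
rf.clock_edge(rw=5, bus_w=42, write_enable=True)
rf.clock_edge(rw=6, bus_w=99, write_enable=False)  # ignored: Write Enable is 0
bus_a, bus_b = rf.read(5, 6)
```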

**Memory Size of Register File?**

# Datapath Elements: State and Sequencing (3/3)

- “Magic” Memory
  - One input bus: Data In
  - One output bus: Data Out
- Memory word is found by:
  - For Read: Address selects the word to put on Data Out
  - For Write: Set Write Enable = 1: address selects the memory word to be written via the Data In bus
- Clock input (CLK)
  - CLK input is a factor ONLY during write operation
  - During read operation, behaves as a combinational logic block: Address valid  $\Rightarrow$  Data Out valid after “access time”



# State Required by RV32I ISA

Each instruction reads and updates this state during execution:

- Registers (**x0 . . x31**)
  - Register file (*regfile*) **Reg** holds 32 registers x 32 bits/register: **Reg [0] . . Reg [31]**
  - First register read specified by *rs1* field in instruction
  - Second register read specified by *rs2* field in instruction
  - Write register (destination) specified by *rd* field in instruction
  - **x0** is always 0 (writes to **Reg [0]** are ignored)
- Program Counter (**PC**)
  - Holds address of current instruction
- Memory (**MEM**)
  - Holds both instructions & data, in one 32-bit byte-addressed memory space
  - We'll use separate memories for instructions (**IMEM**) and data (**DMEM**)
    - *These are placeholders for instruction and data caches*
  - Instructions are read (*fetched*) from instruction memory (assume **IMEM** read-only)
  - Load/store instructions access data memory

# Review: Complete RV32I ISA

A blank cell means the field to its left extends across that column.

| 31:25                    | 24:20 | 19:15 | 14:12 | 11:7         | 6:0     | Instruction |
|--------------------------|-------|-------|-------|--------------|---------|-------------|
| imm[31:12]               |       |       |       | rd           | 0110111 | LUI         |
| imm[31:12]               |       |       |       | rd           | 0010111 | AUIPC       |
| imm[20\|10:1\|11\|19:12] |       |       |       | rd           | 1101111 | JAL         |
| imm[11:0]                |       | rs1   | 000   | rd           | 1100111 | JALR        |
| imm[12\|10:5]            | rs2   | rs1   | 000   | imm[4:1\|11] | 1100011 | BEQ         |
| imm[12\|10:5]            | rs2   | rs1   | 001   | imm[4:1\|11] | 1100011 | BNE         |
| imm[12\|10:5]            | rs2   | rs1   | 100   | imm[4:1\|11] | 1100011 | BLT         |
| imm[12\|10:5]            | rs2   | rs1   | 101   | imm[4:1\|11] | 1100011 | BGE         |
| imm[12\|10:5]            | rs2   | rs1   | 110   | imm[4:1\|11] | 1100011 | BLTU        |
| imm[12\|10:5]            | rs2   | rs1   | 111   | imm[4:1\|11] | 1100011 | BGEU        |
| imm[11:0]                |       | rs1   | 000   | rd           | 0000011 | LB          |
| imm[11:0]                |       | rs1   | 001   | rd           | 0000011 | LH          |
| imm[11:0]                |       | rs1   | 010   | rd           | 0000011 | LW          |
| imm[11:0]                |       | rs1   | 100   | rd           | 0000011 | LBU         |
| imm[11:0]                |       | rs1   | 101   | rd           | 0000011 | LHU         |
| imm[11:5]                | rs2   | rs1   | 000   | imm[4:0]     | 0100011 | SB          |
| imm[11:5]                | rs2   | rs1   | 001   | imm[4:0]     | 0100011 | SH          |
| imm[11:5]                | rs2   | rs1   | 010   | imm[4:0]     | 0100011 | SW          |
| imm[11:0]                |       | rs1   | 000   | rd           | 0010011 | ADDI        |
| imm[11:0]                |       | rs1   | 010   | rd           | 0010011 | SLTI        |
| imm[11:0]                |       | rs1   | 011   | rd           | 0010011 | SLTIU       |
| imm[11:0]                |       | rs1   | 100   | rd           | 0010011 | XORI        |
| imm[11:0]                |       | rs1   | 110   | rd           | 0010011 | ORI         |
| imm[11:0]                |       | rs1   | 111   | rd           | 0010011 | ANDI        |

| 31:25          | 24:20 | 19:15 | 14:12 | 11:7  | 6:0     | Instruction |
|----------------|-------|-------|-------|-------|---------|-------------|
| 0000000        | shamt | rs1   | 001   | rd    | 0010011 | SLLI        |
| 0000000        | shamt | rs1   | 101   | rd    | 0010011 | SRLI        |
| 0100000        | shamt | rs1   | 101   | rd    | 0010011 | SRAI        |
| 0000000        | rs2   | rs1   | 000   | rd    | 0110011 | ADD         |
| 0100000        | rs2   | rs1   | 000   | rd    | 0110011 | SUB         |
| 0000000        | rs2   | rs1   | 001   | rd    | 0110011 | SLL         |
| 0000000        | rs2   | rs1   | 010   | rd    | 0110011 | SLT         |
| 0000000        | rs2   | rs1   | 011   | rd    | 0110011 | SLTU        |
| 0000000        | rs2   | rs1   | 100   | rd    | 0110011 | XOR         |
| 0000000        | rs2   | rs1   | 101   | rd    | 0110011 | SRL         |
| 0100000        | rs2   | rs1   | 101   | rd    | 0110011 | SRA         |
| 0000000        | rs2   | rs1   | 110   | rd    | 0110011 | OR          |
| 0000000        | rs2   | rs1   | 111   | rd    | 0110011 | AND         |
| 0000 pred succ |       | 00000 | 000   | 00000 | 0001111 | FENCE       |
| 0000 0000 0000 |       | 00000 | 001   | 00000 | 0001111 | FENCE.I     |
| 000000000000   |       | 00000 | 000   | 00000 | 1110011 | ECALL       |
| 000000000001   |       | 00000 | 000   | 00000 | 1110011 | EBREAK      |
| csr            |       | rs1   | 001   | rd    | 1110011 | CSRRW       |
| csr            |       | rs1   | 010   | rd    | 1110011 | CSRRS       |
| csr            |       | rs1   | 011   | rd    | 1110011 | CSRRC       |
| csr            |       | zimm  | 101   | rd    | 1110011 | CSRRWI      |
| csr            |       | zimm  | 110   | rd    | 1110011 | CSRRSI      |
| csr            |       | zimm  | 111   | rd    | 1110011 | CSRRCI      |

Not in CA: the last ten rows (FENCE, FENCE.I, ECALL, EBREAK, and the CSR instructions)

- Need datapath and control to implement these instructions

# Implementing the **add** instruction



**add rd, rs1, rs2**

- Instruction makes two changes to machine's state:
  - **Reg[rd] = Reg[rs1] + Reg[rs2]**
  - **PC = PC + 4**
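The two state changes can be written out as a small Python sketch. It is a software analogy only: `execute_add` is a name invented for this example, the register file is modeled as a plain list, and the write to `x0` is suppressed because RV32I keeps `x0` at zero.

```python
MASK32 = 0xFFFFFFFF


def execute_add(reg, pc, rd, rs1, rs2):
    """Return the next (register list, PC) after executing add rd, rs1, rs2."""
    new_reg = list(reg)
    if rd != 0:  # x0 is hardwired to zero
        new_reg[rd] = (reg[rs1] + reg[rs2]) & MASK32  # Reg[rd] = Reg[rs1] + Reg[rs2]
    return new_reg, (pc + 4) & MASK32                 # PC = PC + 4


reg = [0] * 32
reg[1], reg[2] = 5, 7
new_reg, new_pc = execute_add(reg, 0x100, rd=3, rs1=1, rs2=2)
```

Both results are computed from the old state and take effect together, which mirrors how all state elements update on the same rising clock edge.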

# Datapath for add

$$PC = PC + 4$$

$$Reg[rd] = Reg[rs1] + Reg[rs2]$$



# Timing Diagram for add



# Implementing the **sub** instruction

| 31:25   | 24:20 | 19:15 | 14:12 | 11:7 | 6:0     |     |
|---------|-------|-------|-------|------|---------|-----|
| 0000000 | rs2   | rs1   | 000   | rd   | 0110011 | add |
| 0100000 | rs2   | rs1   | 000   | rd   | 0110011 | sub |

**sub rd, rs1, rs2**

- Almost the same as add, except now have to subtract operands instead of adding them
- **inst[30]** selects between add and subtract

# Datapath for add/sub



## Implementing other R-Format instructions

|         |            |            |     |           |         |             |
|---------|------------|------------|-----|-----------|---------|-------------|
| 0000000 | <b>rs2</b> | <b>rs1</b> | 000 | <b>rd</b> | 0110011 | <b>add</b>  |
| 0100000 | <b>rs2</b> | <b>rs1</b> | 000 | <b>rd</b> | 0110011 | <b>sub</b>  |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 001 | <b>rd</b> | 0110011 | <b>sll</b>  |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 010 | <b>rd</b> | 0110011 | <b>slt</b>  |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 011 | <b>rd</b> | 0110011 | <b>sltu</b> |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 100 | <b>rd</b> | 0110011 | <b>xor</b>  |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 101 | <b>rd</b> | 0110011 | <b>srl</b>  |
| 0100000 | <b>rs2</b> | <b>rs1</b> | 101 | <b>rd</b> | 0110011 | <b>sra</b>  |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 110 | <b>rd</b> | 0110011 | <b>or</b>   |
| 0000000 | <b>rs2</b> | <b>rs1</b> | 111 | <b>rd</b> | 0110011 | <b>and</b>  |

- All implemented by decoding funct3 and funct7 fields and selecting appropriate ALU function
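A minimal Python sketch of that selection follows. It models only the ALU function choice for the ten R-format instructions in the table; the names `alu` and `to_signed` are assumptions for this example, and real hardware would use a dedicated ALUSel signal produced by the controller rather than passing funct3 and funct7 directly.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def alu(a, b, funct3, funct7):
    """Select the R-format operation from funct3 and inst[30] (bit 5 of funct7)."""
    a &= MASK32
    b &= MASK32
    shamt = b & 0x1F           # shifts use only the low 5 bits of rs2
    alt = (funct7 >> 5) & 1    # inst[30]: add/sub and srl/sra selector
    if funct3 == 0b000:
        result = a - b if alt else a + b                          # sub / add
    elif funct3 == 0b001:
        result = a << shamt                                       # sll
    elif funct3 == 0b010:
        result = int(to_signed(a) < to_signed(b))                 # slt
    elif funct3 == 0b011:
        result = int(a < b)                                       # sltu
    elif funct3 == 0b100:
        result = a ^ b                                            # xor
    elif funct3 == 0b101:
        result = (to_signed(a) >> shamt) if alt else (a >> shamt)  # sra / srl
    elif funct3 == 0b110:
        result = a | b                                            # or
    else:
        result = a & b                                            # and
    return result & MASK32
```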



# TA Discussion

Video by Anqi Pang:

<https://robotics.shanghaitech.edu.cn/static/ca2020/Discussion_10_DatapathAnqiPan.mp4>



# Q & A




# Quiz

Piazza: "Online Lecture 10 Datapath Poll"

- Select the statements that are TRUE:
  - A. The Clk->Q delay is not important for the Datapath.
  - B. The Datapath for add and sub are identical – the only difference is that the controller is signaling the ALU which instruction to execute.
  - C. The result of an instruction is written into the destination register as soon as it is ready.
  - D. The controller is getting the instruction during the fetch stage.
  - E. The datapath introduced so far contains two adders.


### *Video 2: I & S*


# Implementing I-Format - addi instruction

- RISC-V Assembly Instruction:

**addi x15, x1, -50**



| imm[11:0]    | rs1   | funct3 | rd    | opcode  |
|--------------|-------|--------|-------|---------|
| 111111001110 | 00001 | 000    | 01111 | 0010011 |
| imm=-50      | rs1=1 | add    | rd=15 | OP-Imm  |

# Datapath for add/sub



# Adding addi to Datapath



# Adding addi to Datapath



# I-Format immediates



- High 12 bits of instruction (inst[31:20]) copied to low 12 bits of immediate (imm[11:0])
- Immediate is sign-extended by copying value of inst[31] to fill the upper 20 bits of the immediate value (imm[31:12])
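The two steps above can be checked with a short Python sketch. The function names `imm_i` and `as_signed` are made up for this example; the test word `0xFCE08793` is the encoding of `addi x15, x1, -50` from the earlier slide.

```python
def imm_i(inst):
    """Build the 32-bit I-format immediate from an instruction word."""
    imm = (inst >> 20) & 0xFFF   # inst[31:20] -> imm[11:0]
    if inst & 0x80000000:        # inst[31] is the sign bit
        imm |= 0xFFFFF000        # copy it into imm[31:12]
    return imm


def as_signed(x):
    """Read a 32-bit pattern as a two's-complement number."""
    return x - (1 << 32) if x & 0x80000000 else x


imm = imm_i(0xFCE08793)   # addi x15, x1, -50
value = as_signed(imm)
```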

# R+I Datapath



Works for all other I-format arithmetic instructions (*slti, sltiu, andi, ori, xori, slli, srli, srai*) just by changing ALUSel

# Question



- 1) Program counter is a register
- 2) We **should use the main ALU** to compute  $PC=PC+4$  in order to save some gates
- 3) The **ALU** is a synchronous state element

|    |     |
|----|-----|
|    | 123 |
| A: | FFF |
| B: | FFT |
| C: | FTF |
| D: | FTT |
| E: | TFF |
| F: | TFT |
| G: | TTF |
| H: | TTT |

# Add lw

- RISC-V Assembly Instruction (I-type): **lw x14, 8(x2)**



- The 12-bit signed immediate is added to the base address in register **rs1** to form the **memory** address
  - This is very similar to the add-immediate operation but used to create address not to create final result
- The value loaded from **memory** is stored in register **rd**

# Adding lw to Datapath



# All RV32 Load Instructions

|           |     |     |    |         |     |
|-----------|-----|-----|----|---------|-----|
| imm[11:0] | rs1 | 000 | rd | 0000011 | lb  |
| imm[11:0] | rs1 | 001 | rd | 0000011 | lh  |
| imm[11:0] | rs1 | 010 | rd | 0000011 | lw  |
| imm[11:0] | rs1 | 100 | rd | 0000011 | lbu |
| imm[11:0] | rs1 | 101 | rd | 0000011 | lhu |

funct3 field encodes size and  
'signedness' of load data

- Supporting the narrower loads requires additional logic to extract the correct byte/halfword from the value loaded from memory, and sign- or zero-extend the result to 32 bits before writing back to register file.
  - This needs only a small change: an extra mux on the data returned from memory
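One way to picture the extra load logic is the following Python sketch. It assumes a little-endian memory that returns the aligned 32-bit word containing the address, with `byte_offset` being the low two address bits; `load_extend` is a hypothetical name, and misaligned accesses are not handled.

```python
MASK32 = 0xFFFFFFFF


def load_extend(word, byte_offset, funct3):
    """Pick the byte/halfword/word selected by funct3 and extend it to 32 bits."""
    width = {0b00: 8, 0b01: 16, 0b10: 32}[funct3 & 0b011]  # size from funct3[1:0]
    zero_extend = bool(funct3 & 0b100)                      # funct3[2]: lbu / lhu
    value = (word >> (8 * byte_offset)) & ((1 << width) - 1)
    if not zero_extend and width < 32 and value >> (width - 1):
        value |= MASK32 & ~((1 << width) - 1)               # sign-extend upper bits
    return value


word = 0x1234F680
lb_result = load_extend(word, 0, 0b000)    # byte 0x80, sign-extended
lbu_result = load_extend(word, 0, 0b100)   # byte 0x80, zero-extended
```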

# Adding sw Instruction

- sw: Reads two registers, rs1 for the base memory address and rs2 for the data to be stored, as well as an immediate offset! **sw x14, 8(x2)**



# Datapath with lw



# Adding SW to Datapath



# I+S Immediate Generation



- Just need a 5-bit mux to select between two positions where low five bits of immediate can reside in instruction
- Other bits in immediate are wired to fixed positions in instruction
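The mux on the low five bits can be sketched in Python as follows. The function name `imm_gen_is` and the boolean `imm_sel_s` (standing in for an immediate-select control signal) are assumptions for this example; the two test words are the encodings of `lw x14, 8(x2)` and `sw x14, 8(x2)`.

```python
def imm_gen_is(inst, imm_sel_s):
    """Generate the 32-bit immediate for I-format (imm_sel_s=False) or S-format."""
    upper = (inst >> 25) & 0x7F           # inst[31:25] -> imm[11:5] in both formats
    low_i = (inst >> 20) & 0x1F           # inst[24:20]: low bits for I-format
    low_s = (inst >> 7) & 0x1F            # inst[11:7]:  low bits for S-format
    low = low_s if imm_sel_s else low_i   # the 5-bit 2-way mux
    imm = (upper << 5) | low
    if inst & 0x80000000:                 # sign-extend from inst[31]
        imm |= 0xFFFFF000
    return imm


lw_imm = imm_gen_is(0x00812703, imm_sel_s=False)  # lw x14, 8(x2)
sw_imm = imm_gen_is(0x00E12423, imm_sel_s=True)   # sw x14, 8(x2)
```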

# Datapath So Far




### *Video 3: Branches*


# Implementing Branches



- B-format is mostly same as S-Format, with two register sources (rs1/ rs2) and a 12-bit immediate
- But now immediate represents values -4096 to +4094 in 2-byte increments
- The 12 immediate bits encode *even* 13-bit signed byte offsets (lowest bit of offset is always zero, so no need to store it)
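To make the bit shuffling concrete, here is a hedged Python sketch that rebuilds the 13-bit byte offset from a B-format instruction word. The name `imm_b` is invented for this example; the bit positions follow the standard B-type layout (imm[12|10:5] in inst[31:25], imm[4:1|11] in inst[11:7]).

```python
def imm_b(inst):
    """Rebuild the sign-extended branch offset from a B-format instruction word."""
    imm = ((inst >> 8) & 0xF) << 1        # inst[11:8]  -> imm[4:1]
    imm |= ((inst >> 25) & 0x3F) << 5     # inst[30:25] -> imm[10:5]
    imm |= ((inst >> 7) & 0x1) << 11      # inst[7]     -> imm[11]
    if inst & 0x80000000:                 # inst[31] -> imm[12] and all upper bits
        imm |= 0xFFFFF000
    return imm                            # imm[0] is always 0


offset = imm_b(0x00208863)   # beq x1, x2, +16
```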

# RISC-V Immediate Encoding

Instruction encodings, $\text{inst}[31:0]$ (a blank cell means the field to its left extends across that column):

| 31:25             | 24:20      | 19:15      | 14:12         | 11:7                | 6:0           | Format |
|-------------------|------------|------------|---------------|---------------------|---------------|--------|
| <b>funct7</b>     | <b>rs2</b> | <b>rs1</b> | <b>funct3</b> | <b>rd</b>           | <b>opcode</b> | R-type |
| <b>imm[11:0]</b>  |            | <b>rs1</b> | <b>funct3</b> | <b>rd</b>           | <b>opcode</b> | I-type |
| <b>imm[11:5]</b>  | <b>rs2</b> | <b>rs1</b> | <b>funct3</b> | <b>imm[4:0]</b>     | <b>opcode</b> | S-type |
| <b>imm[12\|10:5]</b> | <b>rs2</b> | <b>rs1</b> | <b>funct3</b> | <b>imm[4:1\|11]</b> | <b>opcode</b> | B-type |

32-bit immediates produced, $\text{imm}[31:0]$:

| imm[31:12]      | imm[11]         | imm[10:5]          | imm[4:1]           | imm[0]          | Immediate |
|-----------------|-----------------|--------------------|--------------------|-----------------|-----------|
| <b>inst[31]</b> | <b>inst[31]</b> | <b>inst[30:25]</b> | <b>inst[24:21]</b> | <b>inst[20]</b> | I-imm.    |
| <b>inst[31]</b> | <b>inst[31]</b> | <b>inst[30:25]</b> | <b>inst[11:8]</b>  | <b>inst[7]</b>  | S-imm.    |
| <b>inst[31]</b> | <b>inst[7]</b>  | <b>inst[30:25]</b> | <b>inst[11:8]</b>  | 0               | B-imm.    |

Upper bits are always sign-extended from $\text{inst}[31]$.

Only bit 7 of the instruction changes role in the immediate between S and B.

Only one bit changes position between S and B, so only two single-bit 2-way muxes are needed!

# Datapath So Far



# Branches

- Different change to the state:
  - $\text{PC} = \begin{cases} \text{PC} + 4, & \text{branch not taken} \\ \text{PC} + \text{immediate}, & \text{branch taken} \end{cases}$
- Six branch instructions:  
BEQ, BNE, BLT, BGE, BLTU, BGEU
- Need to compute  $\text{PC} + \text{immediate}$  and to compare values of  $\text{rs1}$  and  $\text{rs2}$ 
  - But have only one ALU – need more hardware
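The PC update rule can be sketched in Python as below. This is an illustrative model: the comparator is written as a separate function to mirror the extra hardware, its outputs are named after the BrEq/BrLT signals and its mode input after BrUn, and the function names `branch_comparator` and `next_pc` are assumptions for this example.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def branch_comparator(a, b, br_un):
    """Return (BrEq, BrLT); br_un selects unsigned comparison for BrLT."""
    a &= MASK32
    b &= MASK32
    if not br_un:
        a, b = to_signed(a), to_signed(b)
    return a == b, a < b


def next_pc(pc, imm, funct3, rs1_val, rs2_val):
    """PC + immediate if the branch is taken, PC + 4 otherwise."""
    br_un = funct3 in (0b110, 0b111)           # BLTU, BGEU
    br_eq, br_lt = branch_comparator(rs1_val, rs2_val, br_un)
    taken = {
        0b000: br_eq,        # BEQ
        0b001: not br_eq,    # BNE
        0b100: br_lt,        # BLT
        0b101: not br_lt,    # BGE
        0b110: br_lt,        # BLTU
        0b111: not br_lt,    # BGEU
    }[funct3]
    return (pc + (imm if taken else 4)) & MASK32
```

Here the addition `pc + imm` stands in for the ALU, while the comparison is done by the separate comparator, so both can be evaluated in the same cycle.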

# Adding Branches



# Branch Comparator



- $\text{BrEq} = 1$ if $A = B$
- $\text{BrLT} = 1$ if $A < B$
- $\text{BrUn} = 1$ selects unsigned comparison for $\text{BrLT}$; $0$ selects signed
- BGE branch: $A \geq B$ holds when $\overline{A < B}$, that is, $!(A < B)$

# Let's Add JALR (I-Format)



- JALR rd, rs, immediate
- Two changes to the state
  - Writes PC+4 to rd (return address)
  - Sets PC = rs + immediate
  - Uses same immediates as arithmetic and loads
    - *no* multiplication by 2 bytes
    - LSB is ignored
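A small Python sketch of the two state changes made by JALR follows. The function name `jalr` is hypothetical, and `imm` is assumed to be the already sign-extended 32-bit I-format immediate.

```python
MASK32 = 0xFFFFFFFF


def jalr(pc, rs1_val, imm):
    """Return (value written to rd, new PC) for JALR."""
    return_addr = (pc + 4) & MASK32             # written to rd
    target = ((rs1_val + imm) & MASK32) & ~1    # LSB of the sum is cleared
    return return_addr, target


link, new_pc = jalr(pc=0x100, rs1_val=0x2001, imm=4)
```

Note that the immediate is used as a plain byte offset here, unlike branches, where the stored bits are shifted left by one position.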

# Datapath So Far, with Branches



# Adding JALR



# Adding JAL



- JAL saves PC+4 in register rd (the return address)
- Set PC = PC + offset (PC-relative jump)
- Target somewhere within  $\pm 2^{19}$  locations, 2 bytes apart
  - $\pm 2^{18}$  32-bit instructions
- Immediate encoding optimized similarly to branch instruction to reduce hardware cost
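For illustration, the J-format offset can be reassembled with a few shifts, as in this Python sketch. The name `imm_j` is invented for the example; the bit positions follow the standard J-type layout imm[20|10:1|11|19:12] in inst[31:12].

```python
def imm_j(inst):
    """Rebuild the sign-extended jump offset from a J-format instruction word."""
    imm = ((inst >> 21) & 0x3FF) << 1     # inst[30:21] -> imm[10:1]
    imm |= ((inst >> 20) & 0x1) << 11     # inst[20]    -> imm[11]
    imm |= ((inst >> 12) & 0xFF) << 12    # inst[19:12] -> imm[19:12]
    if inst & 0x80000000:                 # inst[31] -> imm[20] and all upper bits
        imm |= 0xFFF00000
    return imm                            # imm[0] is always 0


forward = imm_j(0x008000EF)    # jal x1, +8
backward = imm_j(0xFFDFF06F)   # jal x0, -4
```

Since the 21-bit offset is signed and always even, the reachable targets span $\pm 2^{20}$ bytes, which matches the $\pm 2^{19}$ two-byte locations stated above.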

# Datapath with JALR



# Adding JAL



# U-Format for “Upper Immediate” Instructions



- Has 20-bit immediate in upper 20 bits of 32-bit instruction word
- One destination register, rd
- Used for two instructions
  - LUI – Load Upper Immediate
  - AUIPC – Add Upper Immediate to PC
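The effect of the two U-format instructions can be sketched in Python as below. The helper names `imm_u`, `lui`, and `auipc` are chosen for this example only; each function returns the value that would be written to rd.

```python
MASK32 = 0xFFFFFFFF


def imm_u(inst):
    """U-format immediate: inst[31:12] in the upper 20 bits, low 12 bits zero."""
    return inst & 0xFFFFF000


def lui(inst):
    return imm_u(inst)                    # rd = immediate


def auipc(pc, inst):
    return (pc + imm_u(inst)) & MASK32    # rd = PC + immediate


lui_result = lui(0x123452B7)              # lui x5, 0x12345
auipc_result = auipc(0x1000, 0x12345297)  # auipc x5, 0x12345 at PC = 0x1000
```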

# Implementing LUI



# Implementing AUIPC



# Recap: Complete RV32I ISA

A blank cell means the field to its left extends across that column.

| 31:25                    | 24:20 | 19:15 | 14:12 | 11:7         | 6:0     | Instruction |
|--------------------------|-------|-------|-------|--------------|---------|-------------|
| imm[31:12]               |       |       |       | rd           | 0110111 | LUI         |
| imm[31:12]               |       |       |       | rd           | 0010111 | AUIPC       |
| imm[20\|10:1\|11\|19:12] |       |       |       | rd           | 1101111 | JAL         |
| imm[11:0]                |       | rs1   | 000   | rd           | 1100111 | JALR        |
| imm[12\|10:5]            | rs2   | rs1   | 000   | imm[4:1\|11] | 1100011 | BEQ         |
| imm[12\|10:5]            | rs2   | rs1   | 001   | imm[4:1\|11] | 1100011 | BNE         |
| imm[12\|10:5]            | rs2   | rs1   | 100   | imm[4:1\|11] | 1100011 | BLT         |
| imm[12\|10:5]            | rs2   | rs1   | 101   | imm[4:1\|11] | 1100011 | BGE         |
| imm[12\|10:5]            | rs2   | rs1   | 110   | imm[4:1\|11] | 1100011 | BLTU        |
| imm[12\|10:5]            | rs2   | rs1   | 111   | imm[4:1\|11] | 1100011 | BGEU        |
| imm[11:0]                |       | rs1   | 000   | rd           | 0000011 | LB          |
| imm[11:0]                |       | rs1   | 001   | rd           | 0000011 | LH          |
| imm[11:0]                |       | rs1   | 010   | rd           | 0000011 | LW          |
| imm[11:0]                |       | rs1   | 100   | rd           | 0000011 | LBU         |
| imm[11:0]                |       | rs1   | 101   | rd           | 0000011 | LHU         |
| imm[11:5]                | rs2   | rs1   | 000   | imm[4:0]     | 0100011 | SB          |
| imm[11:5]                | rs2   | rs1   | 001   | imm[4:0]     | 0100011 | SH          |
| imm[11:5]                | rs2   | rs1   | 010   | imm[4:0]     | 0100011 | SW          |
| imm[11:0]                |       | rs1   | 000   | rd           | 0010011 | ADDI        |
| imm[11:0]                |       | rs1   | 010   | rd           | 0010011 | SLTI        |
| imm[11:0]                |       | rs1   | 011   | rd           | 0010011 | SLTIU       |
| imm[11:0]                |       | rs1   | 100   | rd           | 0010011 | XORI        |
| imm[11:0]                |       | rs1   | 110   | rd           | 0010011 | ORI         |
| imm[11:0]                |       | rs1   | 111   | rd           | 0010011 | ANDI        |

| 31:25          | 24:20 | 19:15 | 14:12 | 11:7  | 6:0     | Instruction |
|----------------|-------|-------|-------|-------|---------|-------------|
| 0000000        | shamt | rs1   | 001   | rd    | 0010011 | SLLI        |
| 0000000        | shamt | rs1   | 101   | rd    | 0010011 | SRLI        |
| 0100000        | shamt | rs1   | 101   | rd    | 0010011 | SRAI        |
| 0000000        | rs2   | rs1   | 000   | rd    | 0110011 | ADD         |
| 0100000        | rs2   | rs1   | 000   | rd    | 0110011 | SUB         |
| 0000000        | rs2   | rs1   | 001   | rd    | 0110011 | SLL         |
| 0000000        | rs2   | rs1   | 010   | rd    | 0110011 | SLT         |
| 0000000        | rs2   | rs1   | 011   | rd    | 0110011 | SLTU        |
| 0000000        | rs2   | rs1   | 100   | rd    | 0110011 | XOR         |
| 0000000        | rs2   | rs1   | 101   | rd    | 0110011 | SRL         |
| 0100000        | rs2   | rs1   | 101   | rd    | 0110011 | SRA         |
| 0000000        | rs2   | rs1   | 110   | rd    | 0110011 | OR          |
| 0000000        | rs2   | rs1   | 111   | rd    | 0110011 | AND         |
| 0000 pred succ |       | 00000 | 000   | 00000 | 0001111 | FENCE       |
| 0000 0000 0000 |       | 00000 | 001   | 00000 | 0001111 | FENCE.I     |
| 000000000000   |       | 00000 | 000   | 00000 | 1110011 | ECALL       |
| 000000000001   |       | 00000 | 000   | 00000 | 1110011 | EBREAK      |
| csr            |       | rs1   | 001   | rd    | 1110011 | CSRRW       |
| csr            |       | rs1   | 010   | rd    | 1110011 | CSRRS       |
| csr            |       | rs1   | 011   | rd    | 1110011 | CSRRC       |
| csr            |       | zimm  | 101   | rd    | 1110011 | CSRRWI      |
| csr            |       | zimm  | 110   | rd    | 1110011 | CSRRSI      |
| csr            |       | zimm  | 111   | rd    | 1110011 | CSRRCI      |

Not in CA: the last ten rows (FENCE, FENCE.I, ECALL, EBREAK, and the CSR instructions)

- The table lists 47 instructions; 37 remain once the ten marked "Not in CA" are set aside
- Those 37 instructions are enough to run any C program

# Complete RV32I Datapath!



# “And In conclusion...”

- We have designed a complete datapath
  - Capable of executing all RISC-V instructions in one cycle each
  - Not all units (hardware) used by all instructions
- 5 Phases of execution
  - IF, ID, EX, MEM, WB
  - Not all instructions are active in all phases
- Controller specifies how to execute instructions
  - New instructions can be added with just control?

# Question

Piazza: "Lecture 10 Datapath Poll"

- Select the statements that are TRUE:
  - Instructions that don't need certain stages (e.g. Memory stage) can run with a higher clock speed.
  - Control signals are usually connected to a mux.
  - The datapath from this lecture is single cycle – so it only contains combinatorial logic elements.
  - For some instructions, certain control signals are undefined.
  - All I-type instructions sign-extend the immediate.