

# EECS151/251A Introduction to Digital Design and ICs

## Lecture 9: RISC-V Datapath and Control II

Sophia Shao



### AMD Acquires Xilinx

Over the next decade, high-performance computing will be at the center of nearly every major trend shaping the future. Whether in the cloud, at the edge or across the growing number of intelligent end devices — there is an increasing need to push the envelope to enable new experiences and services. While CPUs and GPUs will remain critical engines for those devices, in a world where algorithms are always advancing and new standards are continually emerging, we see demand growing for adaptive computing capabilities that will be critical to accelerate these emerging and evolving workloads.

AMD's acquisition of Xilinx creates the industry's high-performance and adaptive computing leader, combining a highly complementary set of products, customers and markets with differentiated IP and world-class talent. As we bring AMD and Xilinx together, there are considerable product, technology, market and financial benefits.



<https://www.amd.com/en/corporate/xilinx-acquisition>



- **RISC-V Datapath & Control**
  - R-type
  - I-type
  - S-type
  - B-type
  - J-type
  - U-type
  - **Control Logic**
- **RISC-V 5-stage Pipeline**

# JALR Instruction (I-Format)



- JALR rd, rs, immediate
  - $R[rd] = PC + 4$ ;  $PC = Reg[rs1] + imm$ ;
    - Writes PC+4 to rd (return address)
    - Sets  $PC = rs1 + immediate$  (and sets the LSB to 0)
  - Uses same immediates as arithmetic and loads
    - **no** multiplication by 2 bytes
    - In contrast to branches and JAL

# Datapath So Far, with Branches



# Adding JALR



# Adding JALR



# J-Format for Jump Instructions



- **JAL:**
  - $R[rd] = PC + 4$ ;  $PC = PC + imm$ ;
  - saves  $PC+4$  in register  $rd$  (the return address)
    - Assembler “`j`” jump is pseudo-instruction, uses JAL but sets  $rd=x0$  to discard return address
  - Set  $PC = PC + offset$  (PC-relative jump)
- Target somewhere within  $\pm 2^{19}$  locations, 2 bytes apart
  - $\pm 2^{18}$  32-bit instructions
- Immediate encoding optimized similarly to branch instruction to reduce hardware cost

# Datapath with JALR



# Adding JAL



# Adding JAL





- **RISC-V Datapath & Control**
  - R-type
  - I-type
  - S-type
  - B-type
  - J-type
  - **U-type**
  - **Control Logic**
- **RISC-V 5-stage Pipeline**

# U-Format for “Upper Immediate” Instructions



- Has 20-bit immediate in upper 20 bits of 32-bit instruction word
- One destination register, rd
- Used for two instructions
  - LUI – Load Upper Immediate
    - $\text{Reg}[rd] = \{\text{imm}, 12'b0\}$
  - AUIPC – Add Upper Immediate to PC
    - $\text{Reg}[rd] = \text{PC} + \{\text{imm}, 12'b0\}$

# Implementing LUI



# Implementing AUIPC



# Complete RV32I Datapath!



# Recap: Complete RV32I ISA

|                       |     |     |         |       |                  |       |      |       |     |         |         |
|-----------------------|-----|-----|---------|-------|------------------|-------|------|-------|-----|---------|---------|
| imm[31:12]            |     | rd  | 0110111 | LUI   | 0000000          | shamt | rs1  | 001   | rd  | 0010011 |         |
| imm[31:12]            |     | rd  | 0010111 | AUIPC | 0000000          | shamt | rs1  | 101   | rd  | 0010011 |         |
| imm[20 10:1 11 19:12] |     | rd  | 1101111 | JAL   | 0100000          | shamt | rs1  | 101   | rd  | 0010011 |         |
| imm[11:0]             | rs1 | 000 | rd      | JALR  | 0000000          | rs2   | rs1  | 000   | rd  | 0110011 |         |
| imm[12 10:5]          | rs2 | rs1 | 000     | BEQ   | 0100000          | rs2   | rs1  | 000   | rd  | 0110011 |         |
| imm[12 10:5]          | rs2 | rs1 | 001     | BNE   | 0000000          | rs2   | rs1  | 001   | rd  | 0110011 |         |
| imm[12 10:5]          | rs2 | rs1 | 100     | BLT   | 0000000          | rs2   | rs1  | 010   | rd  | 0110011 |         |
| imm[12 10:5]          | rs2 | rs1 | 101     | BGE   | 0000000          | rs2   | rs1  | 011   | rd  | 0110011 |         |
| imm[12 10:5]          | rs2 | rs1 | 110     | BLTU  | 0000000          | rs2   | rs1  | 100   | rd  | 0110011 |         |
| imm[12 10:5]          | rs2 | rs1 | 111     | BGEU  | 0000000          | rs2   | rs1  | 101   | rd  | 0110011 |         |
| imm[11:0]             |     | rs1 | 000     | LB    | 0100000          | rs2   | rs1  | 101   | rd  | 0110011 |         |
| imm[11:0]             |     | rs1 | 001     | LH    | 0000000          | rs2   | rs1  | 110   | rd  | 0110011 |         |
| imm[11:0]             |     | rs1 | 010     | LW    | 0000000          | rs2   | rs1  | 111   | rd  | 0110011 |         |
| imm[11:0]             |     | rs1 | 100     | LBU   | 0000             | pred  | succ | 00000 | 000 | 00000   | 0001111 |
| imm[11:0]             |     | rs1 | 101     | LHU   | 0000             | 0000  | 0000 | 00000 | 001 | 00000   | 0001111 |
| imm[11:5]             | rs2 | rs1 | 000     | SB    | 0000000000000000 |       |      | 00000 | 000 | 00000   | 1110011 |
| imm[11:5]             | rs2 | rs1 | 001     | SH    | 0000000000000001 |       |      | 00000 | 000 | 00000   | 1110011 |
| imm[11:5]             | rs2 | rs1 | 010     | SW    | csr              |       | rs1  | 001   | rd  | 1110011 |         |
| imm[11:0]             |     | rs1 | 000     | ADDI  | csr              |       | rs1  | 010   | rd  | 1110011 |         |
| imm[11:0]             |     | rs1 | 010     | SLTI  | csr              |       | rs1  | 011   | rd  | 1110011 |         |
| imm[11:0]             |     | rs1 | 011     | SLTIU | zimm             |       | zimm | 101   | rd  | 1110011 |         |
| imm[11:0]             |     | rs1 | 100     | XORI  | csr              |       | zimm | 110   | rd  | 1110011 |         |
| imm[11:0]             |     | rs1 | 110     | ORI   | csr              |       | zimm | 111   | rd  | 1110011 |         |
| imm[11:0]             |     | rs1 | 111     | ANDI  |                  |       |      |       |     |         |         |

|                  |       |      |       |     |         |         |
|------------------|-------|------|-------|-----|---------|---------|
| 0000000          | shamt | rs1  | 001   | rd  | 0010011 | SLLI    |
| 0000000          | shamt | rs1  | 101   | rd  | 0010011 | SRLI    |
| 0100000          | shamt | rs1  | 101   | rd  | 0010011 | SRAI    |
| 0000000          | rs2   | rs1  | 000   | rd  | 0110011 | ADD     |
| 0100000          | rs2   | rs1  | 000   | rd  | 0110011 | SUB     |
| 0000000          | rs2   | rs1  | 001   | rd  | 0110011 | SLL     |
| 0000000          | rs2   | rs1  | 010   | rd  | 0110011 | SLT     |
| 0000000          | rs2   | rs1  | 011   | rd  | 0110011 | SLTU    |
| 0000000          | rs2   | rs1  | 100   | rd  | 0110011 | XOR     |
| 0000000          | rs2   | rs1  | 101   | rd  | 0110011 | SRL     |
| 0100000          | rs2   | rs1  | 101   | rd  | 0110011 | SRA     |
| 0000000          | rs2   | rs1  | 110   | rd  | 0110011 | OR      |
| 0000000          | rs2   | rs1  | 111   | rd  | 0110011 | AND     |
| 0000             | pred  | succ | 00000 | 000 | 00000   | FENCE   |
| 0000             | 0000  | 0000 | 00000 | 001 | 00000   | FENCE.I |
| 0000000000000000 |       |      | 00000 | 000 | 00000   | ECALL   |
| 0000000000000001 |       |      | 00000 | 000 | 00000   | EBREAK  |
| csr              |       | rs1  | 001   | rd  | 1110011 | CSRRW   |
| csr              |       | rs1  | 010   | rd  | 1110011 | CSRRS   |
| csr              |       | rs1  | 011   | rd  | 1110011 | CSRRC   |
| zimm             |       | zimm | 101   | rd  | 1110011 | CSRRWI  |
| zimm             |       | zimm | 110   | rd  | 1110011 | CSRRSI  |
| zimm             |       | zimm | 111   | rd  | 1110011 | CSRRCI  |

- RV32I has 47 instructions
- 37 instructions are enough to run any C program

# Summary of RISC-V Instruction Formats

| 31               | 30         | 25 24      | 21 20 19 | 15 14 | 12 11  | 8 7 6       | 0      |        |
|------------------|------------|------------|----------|-------|--------|-------------|--------|--------|
|                  |            |            |          |       |        |             |        | R-type |
| funct7           |            | rs2        |          | rs1   | funct3 | rd          | opcode | I-type |
|                  | imm[11:0]  |            |          | rs1   | funct3 | rd          | opcode | S-type |
| imm[11:5]        |            | rs2        |          | rs1   | funct3 | imm[4:0]    | opcode | B-type |
| imm[12 10:5]     |            | rs2        |          | rs1   | funct3 | imm[4:1 11] | opcode | U-type |
|                  | imm[31:12] |            |          |       |        | rd          | opcode | J-type |
| imm[20 10:1 11]] |            | imm[19:12] |          |       |        | rd          | opcode |        |

# Administrivia

- No new lab this week.
- HW 3 due this week.
  - HW 4 out (last HW before midterm)
- Midterm 3/1 Tuesday
  - In class.
  - Cover up to pipelining (this week's lecture)
- Guest lecture on FPGA use cases next Thursday



- **RISC-V Datapath & Control**
  - R-type
  - I-type
  - S-type
  - B-type
  - J-type
  - U-type
  - **Control Logic**
- **RISC-V 5-stage Pipeline**

# Complete RV32I Datapath with Control



## Example: add



# add Execution



# Control Logic Truth Table

| Inst[31:0]          | BrEq | BrLT | PCSel | ImmSel | BrUn | ASel | BSel | ALUSel | MemR<br>W | RegWEn | WBSel |
|---------------------|------|------|-------|--------|------|------|------|--------|-----------|--------|-------|
| <b>add</b>          | *    | *    | +4    | *      | *    | Reg  | Reg  | Add    | Read      | 1      | ALU   |
| <b>sub</b>          | *    | *    | +4    | *      | *    | Reg  | Reg  | Sub    | Read      | 1      | ALU   |
| <i>(R-R<br/>Op)</i> | *    | *    | +4    | *      | *    | Reg  | Reg  | (Op)   | Read      | 1      | ALU   |
| <b>addi</b>         | *    | *    | +4    | I      | *    | Reg  | Imm  | Add    | Read      | 1      | ALU   |
| <b>lw</b>           | *    | *    | +4    | I      | *    | Reg  | Imm  | Add    | Read      | 1      | Mem   |
| <b>sw</b>           | *    | *    | +4    | S      | *    | Reg  | Imm  | Add    | Write     | 0      | *     |
| <b>beq</b>          | 0    | *    | +4    | B      | *    | PC   | Imm  | Add    | Read      | 0      | *     |
| <b>beq</b>          | 1    | *    | ALU   | B      | *    | PC   | Imm  | Add    | Read      | 0      | *     |
| <b>bne</b>          | 0    | *    | ALU   | B      | *    | PC   | Imm  | Add    | Read      | 0      | *     |
| <b>bne</b>          | 1    | *    | +4    | B      | *    | PC   | Imm  | Add    | Read      | 0      | *     |
| <b>blt</b>          | *    | 1    | ALU   | B      | 0    | PC   | Imm  | Add    | Read      | 0      | *     |
| <b>bltu</b>         | *    | 1    | ALU   | B      | 1    | PC   | Imm  | Add    | Read      | 0      | *     |
| <b>jalr</b>         | *    | *    | ALU   | I      | *    | Reg  | Imm  | Add    | Read      | 1      | PC+4  |
| <b>jal</b>          | *    | *    | ALU   | J      | *    | PC   | Imm  | Add    | Read      | 1      | PC+4  |
| <b>auipc</b>        | *    | *    | +4    | U      | *    | PC   | Imm  | Add    | Read      | 1      | ALU   |

# RV32I, a nine-bit ISA!

|                       |       |     |             |         |         |
|-----------------------|-------|-----|-------------|---------|---------|
| imm[31:12]            |       |     | rd          | 0110111 |         |
| imm[31:12]            |       |     | rd          | 0010111 |         |
| imm[20:10:1 11 19:12] |       |     | rd          | 1101111 |         |
| imm[11:0]             | rs1   | 000 | rd          | 1100111 |         |
| imm[12 10:5]          | rs2   | 000 | imm[4:1 11] | 1100011 |         |
| imm[12 10:5]          | rs2   | 001 | imm[4:1 11] | 1100011 |         |
| imm[12 10:5]          | rs2   | 100 | imm[4:1 11] | 1100011 |         |
| imm[12 10:5]          | rs2   | 101 | imm[4:1 11] | 1100011 |         |
| imm[12 10:5]          | rs2   | 110 | imm[4:1 11] | 1100011 |         |
| imm[12 10:5]          | rs2   | 111 | imm[4:1 11] | 1100011 |         |
| imm[11:0]             | rs1   | 000 | rd          | 0000011 |         |
| imm[11:0]             | rs1   | 001 | rd          | 0000011 |         |
| imm[11:0]             | rs1   | 010 | rd          | 0000011 |         |
| imm[11:0]             | rs1   | 100 | rd          | 0000011 |         |
| imm[11:0]             | rs1   | 101 | rd          | 0000011 |         |
| imm[11:5]             | rs2   | 000 | imm[4:0]    | 0100011 |         |
| imm[11:5]             | rs2   | 001 | imm[4:0]    | 0100011 |         |
| imm[11:5]             | rs2   | 010 | imm[4:0]    | 0100011 |         |
| imm[11:0]             | rs1   | 000 | rd          | 0010011 |         |
| imm[11:0]             | rs1   | 010 | rd          | 0010011 |         |
| imm[11:0]             | rs1   | 011 | rd          | 0010011 |         |
| imm[11:0]             | rs1   | 100 | rd          | 0010011 |         |
| imm[11:0]             | rs1   | 110 | rd          | 0010011 |         |
| imm[11:0]             | rs1   | 111 | rd          | 0010011 |         |
| 0000000               | shamt | rs1 | 001         | rd      | 0010011 |
| 0000000               | shamt | rs1 | 101         | rd      | 0010011 |
| 0100000               | shamt | rs1 | 101         | rd      | 0010011 |
| 0000000               | rs2   | rs1 | 000         | rd      | 0110011 |
| 0100000               | rs2   | rs1 | 000         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 001         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 010         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 011         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 100         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 101         | rd      | 0110011 |
| 0100000               | rs2   | rs1 | 101         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 110         | rd      | 0110011 |
| 0000000               | rs2   | rs1 | 111         | rd      | 0110011 |

LUI  
AUIPC  
JAL  
JALR  
BEQ  
BNE  
BLT  
BGE  
BLTU  
BGEU  
LB  
LH  
LW  
LBU  
LHU  
SB  
SH  
SW  
ADDI  
SLTI  
SLTIU  
XORI  
ORI  
ANDI  
SLLI  
SR LI  
SRAI  
ADD  
SUB  
SLL  
SLT  
SLTU  
XOR  
SRL  
SRA  
OR  
AND

inst[30]

inst[14:12]

inst[6:2]

Instruction type encoded using only  
9 bits inst[30],inst[14:12], inst[6:2]

# Control Realization Options

- ROM
  - “Read-Only Memory”
  - Regular structure
  - Can be easily reprogrammed
    - fix errors
    - add instructions
- Combinatorial Logic
  - Decoder is typically hierarchical
    - First decode opcode, and figure out instruction type
    - E.g. branches are  $\text{Inst}[6:2] = 11000$
    - Then determine the actual instruction
    - $\text{Inst}[30] + \text{Inst}[14:12]$
  - Modularity helps simplify and speed up logic
    - Narrow problem space for logic synthesis

# Combinational Logic Control

- Simplest example: BrUn

|              | inst[14:12] |     | inst[6:2] |             |         |      |
|--------------|-------------|-----|-----------|-------------|---------|------|
| imm[12:10:5] | rs2         | rs1 | 000       | imm[4:1 11] | 1100011 | BEQ  |
| imm[12:10:5] | rs2         | rs1 | 001       | imm[4:1 11] | 1100011 | BNE  |
| imm[12:10:5] | rs2         | rs1 | 100       | imm[4:1 11] | 1100011 | BLT  |
| imm[12:10:5] | rs2         | rs1 | 101       | imm[4:1 11] | 1100011 | BGE  |
| imm[12:10:5] | rs2         | rs1 | 110       | imm[4:1 11] | 1100011 | BLTU |
| imm[12:10:5] | rs2         | rs1 | 111       | imm[4:1 11] | 1100011 | BGEU |

| inst[14:13] | inst[12] | 00 | 01 | 11 | 10 |
|-------------|----------|----|----|----|----|
| 0           |          |    |    |    |    |
| 1           |          |    |    |    |    |

- How to decode whether BrUn is 1?

- Branch = Inst[6] • Inst[5] • !Inst[4] • !Inst[3] • !Inst[2]
- BrUn = Inst [13] • Branch

# Complete RV32I Datapath with Control





- **RISC-V Datapath & Control**
  - R-type
  - I-type
  - S-type
  - B-type
  - J-type
  - U-type
  - Control Logic
- **RISC-V 5-stage Pipeline**

# Critical Path

Critical path for an addi

$$R[rd] = R[rs1] + imm$$



- 1)  $t_{clk-q} + t_{Add} + t_{IMEM} + t_{Reg} + t_{BComp} + t_{ALU} + t_{DMEM} + t_{mux} + t_{Setup}$
- 2)  $t_{clk-q} + t_{IMEM} + \max\{t_{Reg}, t_{Imm}\} + t_{ALU} + 2t_{mux} + t_{Setup}$
- 3)  $t_{clk-q} + t_{IMEM} + \max\{t_{Reg}, t_{Imm}\} + t_{ALU} + 3t_{mux} + t_{DMEM} + t_{Setup}$
- 4) None of the above

# Pipelining with RISC-V

instruction sequence ↓

add t0, t1, t2  
or t3, t4, t5  
sll t6, t0, t3



|                                     | Single Cycle                                                         | Pipelining                                             |
|-------------------------------------|----------------------------------------------------------------------|--------------------------------------------------------|
| Timing                              | $t_{step} = 100 \dots 200 \text{ ps}$<br>Register access only 100 ps | $t_{cycle} = 200 \text{ ps}$<br>All cycles same length |
| Instruction time, $t_{instruction}$ | $= t_{cycle} = 800 \text{ ps}$                                       | <b>1000 ps</b>                                         |
| Clock rate, $f_s$                   | $1/800 \text{ ps} = 1.25 \text{ GHz}$                                | $1/200 \text{ ps} = 5 \text{ GHz}$                     |
| Relative speed                      | 1 x                                                                  | <b>4 x</b>                                             |

# Pipelining with RISC-V

instruction sequence

add t0, t1, t2  
or t3, t4, t5  
slt t6, t0, t3  
sw t0, 4(t3)  
lw t0, 8(t3)  
addi t2, t2, 1



# Complete RV32I Datapath with Control



# Pipelining RV32I Datapath



Must pipeline instruction along with data, so control operates correctly in each stage

# Different Instructions in Flight

lw t0, 8(t3)

sw t0, 4(t3)

slt t6, t0, t3

or t3, t4, t5

add t0,  
t1, t2



# Pipelined Control

- Control signals derived from instruction
  - As in single-cycle implementation
  - Information is stored in pipeline registers for use by later stages



# Summary

- We have covered the implementation of the base ISA for RV32I!!!
  - Get yourself familiar with the ISA Spec.
- Instruction type:
  - R-type
  - I-type
  - S-type
  - B-type
  - J-type
  - U-type
- Implementation suggested is straightforward, yet there are modalities in how to implement it well at gate level.
- Single-cycle datapath is slow – need to pipeline it