



UC Berkeley  
Teaching Professor  
Dan Garcia

# CS61C

## Great Ideas in Computer Architecture (a.k.a. Machine Structures)

### RISC-V Instruction Formats, Part II

# Disassembling the S-Format

- What is the corresponding assembly instruction?



S-Format Lookup table:

|           |     |     | funct3 | opcode   |         |    |
|-----------|-----|-----|--------|----------|---------|----|
| imm[11:5] | rs2 | rs1 | 000    | imm[4:0] | 0100011 | sb |
| imm[11:5] | rs2 | rs1 | 001    | imm[4:0] | 0100011 | sh |
| imm[11:5] | rs2 | rs1 | 010    | imm[4:0] | 0100011 | sw |

# Disassembling the S-Format

- What is the corresponding assembly instruction?



# Updating the Program Counter (PC)

- Updating the Program Counter (PC)
- B-Format Layout
- J-Format Layout
- U-Format: Long Immediates
- **jalr**: I-Format

# The Program Counter (PC)



**The Program Counter (PC) is a special internal register inside processor holding byte address of the instruction being executed.**

**It is separate from the 32-registers Register File!**

# $PC = PC + 4$ for most instructions



# Branches Update PC to "Jump"



Kris Kross - Jump (Official Video) -

## Unconditional Branches

```

0x0c | Loop: beq x19,x10,End
      add x18 x18,x10
      addi x19,x19,-1
      j Loop
      End: ...
  
```



## Conditional Branch (if branch taken)

→ 0x0c | Loop: beq x19,x10,End  
 add x18 x18,x10  
 addi x19,x19,-1  
 j Loop  
 End: ...



If branch not taken,  
then  $PC = PC + 4$

# PC-Relative Addressing

- PC-Relative Addressing: Supply a signed offset to update PC.

$$\text{PC} = \text{PC} + \text{byte\_offset}$$

*“Position-Independent Code”*: If all of code moves, *relative offsets* don’t change!

|      |                                      |
|------|--------------------------------------|
| 0x0c | Loop: <code>beq x19, x10, End</code> |
| 0x10 | <code>add x18, x18, x10</code>       |
| 0x14 | <code>addi x19, x19, -1</code>       |
| 0x18 | <code>j Loop</code>                  |
| 0x1c | End: ...                             |

Take branch:  
 $\text{PC} = \text{PC} + 16$   
Don’t take branch:  
 $\text{PC} = \text{PC} + 4$

|       |                                      |
|-------|--------------------------------------|
| 0x400 | Loop: <code>beq x19, x10, End</code> |
| 0x404 | <code>add x18, x18, x10</code>       |
| 0x408 | <code>addi x19, x19, -1</code>       |
| 0x40c | <code>j Loop</code>                  |
| 0x410 | End: ...                             |

Branches generally change the PC by a small amount, therefore in RISC-V instructions we encode relative offsets as *signed immediates*.

- Contrast with: Absolute Addressing

Supply new address to overwrite PC.  $\text{PC} = \text{new\_address}$

Use sparingly: Brittle to code movement/need to build 32-bit immediate  
(more later)

# B-Format Layout

- Updating the Program Counter (PC)
- B-Format Layout
- J-Format Layout
- U-Format: Long Immediates
- **jalr**: I-Format

# RISC-V Conditional Branches

- Example: **beq rs1, rs2, Label**
- Conditional branches need:
  - Two **read** registers, **no destination register** (like S-Format)
  - Some way to encode label (i.e., where to jump to)
- RISC-V uses **PC-relative addressing to encode labels**.
  - If we *don't* branch:  $\text{PC} = \text{PC} + 4 \text{ bytes}$
  - If we *do* branch:  $\text{PC} = \text{PC} + (\text{relative byte offset to Label})$
- B-Format Fields:
  - 7b (**opcode**) + 3b (**funct3**) + 5b (**rs1**) + 5b (**rs2**) = 20b
  - 12b **immediate field** to represent PC's relative offset.
    - Signed two's complement means we can represent  $\pm 2^{11}$  "units" from the PC.

# Scaling the Offset to Maximize Branch Range

The 12-bit immediate field for conditional branches is:

A. In units of 1 byte.

One branch reaches  $\pm 2^{11}$  bytes from PC  $\rightarrow \pm 2^9 \times 32$

Never branch into middle of instruction.

B. In units of 2 bytes (16-bit “half-words”).

One branch reaches  $\pm 2^{10} \times 32$

Multiply the offset by 2 before adding to the PC.

C. In units of 4 bytes (32-bit words).

One branch reaches  $\pm 2^{11} \times 32$

Multiply the offset by 4 before adding to the PC.



## Scaling the Offset to Maximize Branch Range

The 12-bit immediate field for conditional branches is:

**A. In units of 1 byte.**

- One branch reaches  $\pm 2^{11}$  bytes from PC  $\rightarrow \pm 2^9 \times 32\text{-bit instructions}$ .
- Never branch into middle of instruction.

**B. In units of 2 bytes.**

- One branch reaches  $\pm 2^{10} \times 32\text{-b instructions}$  from PC.
- Multiply the offset by 2 bytes before adding to the PC.

**C. In units of 4 bytes.**

- One branch reaches  $\pm 2^{11} \times 32\text{-b instructions}$  from PC.
- Multiply the offset by 4 bytes before adding to the PC.

(A) A

(B) B

(C) C

(D) None of the above

0%

0%

0%

0%

# Scaling the Offset to Maximize Branch Range

The 12-bit immediate field for conditional branches is:

- X In units of 1 byte.

One branch reaches  $\pm 2^{11}$  bytes from PC  $\rightarrow \pm 2^9 \times 32$

Never branch into middle of instruction.

*Wasteful!* Lower 2 bits would always be wasted (always 00).

- ✓ In units of 2 bytes (16-bit “half-words”).

One branch reaches  $\pm 2^{10} \times 32$

Multiply the offset by 2 before adding to the PC.

- X In units of 4 bytes (32-bit words).

One branch reaches  $\pm 2^{11} \times 32$

Multiply the offset by 4 before adding to the PC.

Does not support 16-bit compressed instructions!

Branch immediate range [-2048, +2047] represents PC relative offsets [-4096, +4094] in 2-byte increments.

# RISC-V Feature: 16-bit instructions

- RISC-V Base ISA for RV32, RV64, RV128 all have 32-bit wide instructions, but it supports extensions:
  - 16b compressed instructions
  - Variable-length instructions that are multiples of 16b in length
- RISC-V therefore *scales the branch offset by 2 bytes*, even when there are no 16b instructions!
- Implications for this class (which uses RISC-V processors that only support 32-bit instructions):
  - Half of the possible branch targets will be errors.
  - RISC-V Conditional Branches can only reach  $\pm 2^{10}$  32-bit instructions on either side of PC.

# B-Format Instruction Layout

- B-Format (textbook: SB-Type) close to S-Format:  
**opname      rs1, rs2 , Label**

All *conditional* branch instructions have opcode 1100011.



Immediate represents relative offset in increments of 2 bytes (“half-words”)

To compute new PC:  $PC = PC + \text{byte\_offset}$ .

12 immediate bits imply  $\pm 2^{10}$  32-bit instructions reachable:

1 bit: 2's complement (allow  $+\/-$  offset)

1 bit: half-word/16-b instruction support

(Lowest bit of offset is always zero, so no need to store in instruction.)



# B-Format Example (1/2)

```
Loop:  beq x19,x10,End  
       add x18,x18,x10  
       addi x19,x19,-1  
       j Loop  
  
End:   ... # target instr
```

How do we encode the  
beq instruction in this  
RISC-V code?

1. Fill in fields (other than immediate).



# B-Format Example (2/2)

+4  
instruct-  
ions {  
Loop:   **beq x19,x10,End**  
          add x18,x18,x10  
          addi x19,x19,-1  
          j Loop  
End:    ... # target instr

How do we encode the  
**beq** instruction in this  
RISC-V code?

2. Construct branch offset (in bytes), then encode immediate.

| 31           | 25 24 | 20 19 | 15     | 14          | 12      | 11 | 7 | 6 | 0 |
|--------------|-------|-------|--------|-------------|---------|----|---|---|---|
| imm[12 10:5] | rs2   | rs1   | funct3 | imm[4:1 11] | opcode  |    |   |   |   |
| 0000000      | 01010 | 10011 | 000    | 10000       | 1100011 |    |   |   |   |

+4 32-bit instructions = +16 byte offset

| 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 |
|----|----|----|---|---|---|---|---|---|---|---|---|---|
| 0  | 0  | 0  | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |

13-bit signed **byte\_offset** ("imm")

# Why Swirl Immediate Bits Around? HW Design!

- Recall: RISC-V register field positions consistent across instr. formats.
- RISC-V also tries to keep bit positions of immediates consistent:

|        | 31                | 25           | 24  | 20  | 19  | 15     | 14           | 12 | 11 | 7 | 6      | 0 |
|--------|-------------------|--------------|-----|-----|-----|--------|--------------|----|----|---|--------|---|
| I-type | imm[11:5]         | imm[4:0]     |     | rs1 |     | funct3 |              | rd |    |   | opcode |   |
|        | <b>yxxxxxxxxx</b> | <b>wwwwv</b> |     |     |     |        |              |    |    |   |        |   |
| S-type | imm[11:5]         |              | rs2 |     | rs1 | funct3 | imm[4:0]     |    |    |   | opcode |   |
|        | <b>yxxxxxxxxx</b> |              |     |     |     |        | <b>wwwwv</b> |    |    |   |        |   |
| B-type | imm[12 10:5]      |              | rs2 |     | rs1 | funct3 | imm[4:1 11]  |    |    |   | opcode |   |
|        | <b>zxxxxxxxxx</b> |              |     |     |     |        | <b>wwwwy</b> |    |    |   |        |   |

Instruction Bit 31 is always the *sign bit* (highest bit to sign extend in immediate)  
 B-type has a *13-bit* immediate encoding b/c of implicit zero in Imm bit 0.

Between S and B, only two bits change meaning:  
 Instruction Bit 7 → S: Imm Bit 0; B: Imm Bit 11  
 Instruction Bit 31 → S: Imm Bit 11; B: Imm Bit 12

RISC-V immediate bit encoding is optimized to reduce hardware cost.

# All six RV32 B-Format Instructions

| imm[12 10:5] | rs2 | rs1 | funct3 | imm[4:1 11] | opcode  |                  |
|--------------|-----|-----|--------|-------------|---------|------------------|
| imm[12 10:5] | rs2 | rs1 | 000    | imm[4:1 11] | 1100011 | b <sub>eq</sub>  |
| imm[12 10:5] | rs2 | rs1 | 001    | imm[4:1 11] | 1100011 | b <sub>nne</sub> |
| imm[12 10:5] | rs2 | rs1 | 100    | imm[4:1 11] | 1100011 | b <sub>lt</sub>  |
| imm[12 10:5] | rs2 | rs1 | 101    | imm[4:1 11] | 1100011 | b <sub>ge</sub>  |
| imm[12 10:5] | rs2 | rs1 | 110    | imm[4:1 11] | 1100011 | b <sub>ltu</sub> |
| imm[12 10:5] | rs2 | rs1 | 111    | imm[4:1 11] | 1100011 | b <sub>geu</sub> |

# Conditionally Branching Even Further

- Conditional Branches are typically used for if-else statements, for/while loops.

Usually pretty small (<50 lines of C code)

- B-Format has limited range:

$\pm 2^{10}$  32-bit instructions from current instruction

- What if destination is further away?

Enter the *unconditional jump*!

```
beq x10, x0, far  
# next instr
```

if equal

functionally equivalent

```
bne x10, x0, next  
j far  
next: # next instr
```

if equal

Garcia, Kao



# J-Format Layout

- Updating the Program Counter (PC)
- B-Format Layout
- J-Format Layout
- U-Format: Long Immediates
- **jalr**: I-Format

# The jal instruction and j pseudoinstruction

- **jal** does a **Jump And Link** (i.e., “link and jump”):

**jal rd, Label**

Jump to **Label**

Write address of the *following* instruction to **rd**.

- **Two use cases:**

1. Call a function (and simultaneously save return address in named register)

**jal ra, FuncLabel**

2. Unconditional branch

**j pseudoinstruction: jump and discard return address**

**j Label**



**jal x0, Label**

# The jal instruction and j pseudoinstruction

- jal does a Jump And Link (i.e., “link and jump”):

**jal rd, Label**

Jump to **Label** (i.e., add relative offset to PC)

Write address of the *following* instruction to **rd**.

$$\left. \begin{array}{l} \text{PC} = \text{PC} + \text{offset} \\ \text{rd} = \text{PC} + 4 \end{array} \right\}$$

- Two use cases:

1. Call a function (and simultaneously save return address in named register)

**jal ra, FuncLabel**

2. Unconditional branch

j pseudoinstruction: jump and discard return address

**j Label**

Often used to conditionally branch further than B-Types

**jal x0, Label**



# J-Format Instruction Layout

- J-Format (textbook: UJ-Type) used only for `jal`:  
`jal rd, Label`



Immediate represents relative offset in increments of 2 bytes.

To compute new PC:  $PC = PC + \text{byte\_offset}$ .

20 immediate bits imply  $\pm 2^{18}$  32-bit instructions reachable:

1 bit: 2's complement (allow  $+\/-$  offset), 1 bit: half-word/16-b instruction support

Immediate bit encoding optimized to reduce HW cost

What about jumping further?

Next: How to load 32-bit  
immediate into a register!

# U-Format: Long Immediates

- Updating the Program Counter (PC)
- B-Format Layout
- J-Format Layout
- U-Format: Long Immediates
- **jalr**: I-Format

# U-Format Instruction Layout

- “Upper Immediate” instructions:

**opname      rd, immed**

U-Format opcodes:  
lui    0110111  
auipc 0010111



Immediate represents *upper 20 bits* of a 32-bit immediate  
operand **imm** = **immed** << 12.



Long Immediate (“**imm**”)

# lui and addi creates Long Immediates

**lui rd, immed**

- The **lui** instruction, Load Upper Immediate:

Write a 20-bit immediate value into  
the upper 20-bits of register rd.

Clear the lower 12 bits.

$$\left. \begin{array}{l} \text{rd} = \text{immed} \ll 12 \\ \end{array} \right\}$$

- lui together with an addi (to set lower 12 bits) can create any 32-bit value in a register:**

```
lui x10, 0x87654      # x10 = 0x87654000
addi x10, x10, 0x321  # x10 = 0x87654321
```

The **li** pseudoinstruction  
(Load Immediate) resolves to  
**lui + addi** as needed,  
e.g., **li x10, 0x87654321**.

# lui Edge Case: addi Extends Sign

- How should we set the immediate 0xB0BACAFE?



Unfortunately:

```
lui x10, 0xB0BAC      # x10 = 0xB0BAC000  
addi x10, x10, 0xAF   # x10 = 0xB0BABAFE
```

Recall: **addi** sign-extends the 12-bit immediate.

If “sign bit” set, subtracts 1 from the upper 20-bits!



# lui Edge Case: addi Extends Sign

- How should we set the immediate 0xB0BACAFE?



Unfortunately:

```
lui x10, 0xB0BAC      # x10 = 0xB0BAC000
addi x10, x10, 0xAF   # x10 = 0xB0BABAFE
```

Recall: **addi** sign-extends the 12-bit immediate.

If “sign bit” set, subtracts 1 from the upper 20-bits!



Solution: If 12-bit immediate is negative,  
add 1 to the upper 20-bit load.

```
lui x10, 0xB0BAD      # x10 = 0xB0BAD000
addi x10, x10, 0xAF      # x10 = 0xB0BACAFE
```

Use pseudoinstructions!  
**li** automatically handles this edge case.

# auipc loads the PC into the Register File

**auipc rd, immed**

- The **auipc** instruction
  - Adds an Upper Immediate to the PC.
- Example:

**auipc x5, 0xABCD** #  $x5 = PC + 0xABCD000$

- In Practice:

Label: **auipc x5, 0** # puts address of Label in x5

Loads the PC into a register accessible by other instructions.

**auipc** is most often used together with **jalr** to do PC-relative addressing with super large offsets.

# jalr: I-Format

- Updating the Program Counter (PC)
- B-Format Layout
- J-Format Layout
- U-Format: Long Immediates
- **jalr: I-Format**

# The jalr instruction and jr pseudoinstruction

- **jalr does a Jump And Link Register:**

**jalr rd,rs1,imm**

Jump to **rs1 + imm**.

Write address of the *following* instruction to **rd**.

$$\left. \begin{array}{l} \text{PC} = \text{rs1} + \text{imm} \\ \text{rd} = \text{PC} + 4 \end{array} \right\}$$

- Use cases (so far):

1. Return to caller

**jr ra**  **jalr x0, ra, 0**

# I-Format Instruction Layout: jalr

- jalr uses I-Format:



Immediate

imm and rs1 are added together to update PC.

$$PC = rs1 + imm$$

⚠ Note I-Type!

Unlike B-type, J-type, jalr will not multiply imm by 2.

Immediate offset therefore must be written in *units of bytes*.

“Destination” Register  
gets “return address.”

$$rd = PC + 4$$

# The `jalr` instruction and `jr` pseudoinstruction

- `jalr` does a Jump And Link Register:

`jalr rd, rs1, imm`

Jump to `rs1 + imm`.

Write address of the *following* instruction to `rd`.

$$\left. \begin{array}{l} \text{PC} = \text{rs1} + \text{imm} \\ \text{rd} = \text{PC} + 4 \end{array} \right\}$$

- Use cases:

1. Return to caller

`jr ra`  `jalr x0, ra, 0`

Unlike `jal` (relative to PC), `jalr` addresses are relative to `rs1`, which is modifiable by arithmetic instructions. We can do bigger jumps!

2. Call a function (and save return address)  
at any 32-bit *absolute address*.
3. Jump *PC-relative* with a *32-bit offset*.

$$\left. \begin{array}{l} \text{lui x1 <hi20bits>} \\ \text{jalr ra, x1, <lo12bits>} \\ \\ \text{auipc x1 <hi20bits>} \\ \text{jalr x0, x1, <lo12bits>} \end{array} \right\}$$

# “And in Conclusion...”

- We've covered (almost) the entire RV32 ISA!
- Practice assembling and disassembling!

|                       |     |     |         |       |
|-----------------------|-----|-----|---------|-------|
| imm[31:12]            |     | rd  | 0110111 | LUI   |
| imm[31:12]            |     | rd  | 0010111 | AUIPC |
| imm[20 10:1 11 19:12] |     | rd  | 1101111 | JAL   |
| imm[11:0]             | rs1 | 000 | rd      | JALR  |
| imm[12 10:5]          | rs2 | rs1 | 000     | BEQ   |
| imm[12 10:5]          | rs2 | rs1 | 001     | BNE   |
| imm[12 10:5]          | rs2 | rs1 | 100     | BLT   |
| imm[12 10:5]          | rs2 | rs1 | 101     | BGE   |
| imm[12 10:5]          | rs2 | rs1 | 110     | BLTU  |
| imm[12 10:5]          | rs2 | rs1 | 111     | BGEU  |
| imm[11:0]             |     | rs1 | 000     | LB    |
| imm[11:0]             |     | rs1 | 001     | LH    |
| imm[11:0]             |     | rs1 | 010     | LW    |
| imm[11:0]             |     | rs1 | 100     | LBU   |
| imm[11:0]             |     | rs1 | 101     | LHU   |
| imm[11:5]             | rs2 | rs1 | 000     | SB    |
| imm[11:5]             | rs2 | rs1 | 001     | SH    |
| imm[11:5]             | rs2 | rs1 | 010     | SW    |
| imm[11:0]             |     | rs1 | 000     | ADDI  |
| imm[11:0]             |     | rs1 | 010     | SLTI  |
| imm[11:0]             |     | rs1 | 011     | SLTIU |
| imm[11:0]             |     | rs1 | 100     | XORI  |
| imm[11:0]             |     | rs1 | 110     | ORI   |
| imm[11:0]             |     | rs1 | 111     | ANDI  |

|                  |       |      |       |         |         |         |
|------------------|-------|------|-------|---------|---------|---------|
| 0000000          | shamt | rs1  | 001   | rd      | 0010011 | SLLI    |
| 0000000          | shamt | rs1  | 101   | rd      | 0010011 | SRLI    |
| 0100000          | shamt | rs1  | 101   | rd      | 0010011 | SRAI    |
| 0000000          | rs2   | rs1  | 000   | rd      | 0110011 | ADD     |
| 0100000          | rs2   | rs1  | 000   | rd      | 0110011 | SUB     |
| 0000000          | rs2   | rs1  | 001   | rd      | 0110011 | SLL     |
| 0000000          | rs2   | rs1  | 010   | rd      | 0110011 | SLT     |
| 0000000          | rs2   | rs1  | 011   | rd      | 0110011 | SLTU    |
| 0000000          | rs2   | rs1  | 100   | rd      | 0110011 | XOR     |
| 0000000          | rs2   | rs1  | 101   | rd      | 0110011 | SRL     |
| 0100000          | rs2   | rs1  | 101   | rd      | 0110011 | SRA     |
| 0000000          | rs2   | rs1  | 110   | rd      | 0110011 | OR      |
| 0000000          | rs2   | rs1  | 111   | rd      | 0110011 | AND     |
| 0000             | pred  | succ | 00000 | 000     | 00000   | FENCE   |
| 0000             | 0000  | 0000 | 00000 | 001     | 00000   | FENCE.I |
| 0000000000000000 |       |      | 00000 | 000     | 00000   | ECALL   |
| 0000000000000001 |       |      | 00000 | 000     | 00000   | EBREAK  |
| csr              | rs1   | 001  | rd    | 1110011 | CSRRW   |         |
| csr              | rs1   | 010  | rd    | 1110011 | CSRRS   |         |
| csr              | rs1   | 011  | rd    | 1110011 | CSRRC   |         |
| csr              | zimm  | 101  | rd    | 1110011 | CSRRWI  |         |
| csr              | zimm  | 110  | rd    | 1110011 | CSRRSI  |         |
| csr              | zimm  | 111  | rd    | 1110011 | CSRRCI  |         |

Not in CS61C

<https://riscv.org/technical/specifications/>