

## Chapter 2

Register → units of memory  
immediate use of CPU.

~~32 bits~~ 32 × 64 bit.

Smaller is faster.

$x_0 \rightarrow$  holds 0. Can be used to copy things.

$x_1 \rightarrow$  Return address.

$(\underbrace{a+b}_{1}) - (\underbrace{c+d}_{2}) \rightarrow \text{Temp}$   
 $\underbrace{\quad\quad\quad}_{3} \rightarrow \text{Temp}$   
 $\underbrace{\quad\quad\quad}_{3} \rightarrow \text{Saved}$

$x_{10-11} \rightarrow$  Function arguments / Return

$x_{12-17} \rightarrow$  " "

$x_{5-7} \rightarrow \text{Temp}$

$x_9 \rightarrow \text{Saved}$

$x_{18-27} \rightarrow$  "

$x_{28-31} \rightarrow \text{Temp}$ .

## operand

- Register operands
- memory operands
- Constant / immediates.

→ addi  $x_{28}, x_{29}, 5$   
↓  
Hard code  $\pi_0$   $f_5f_5$ .

Register: Arithmetic instruction use

register operands.

$$f \rightarrow x_{19}$$

:

:

:

$$j \Rightarrow x_{23}.$$

add  $x_5, x_{20}, x_{21}$

add  $x_6, x_{22}, x_{23}$

sub  $x_{19}, x_5, x_6$ .

## Memory operand

Risc-V is little endian.

memory is byte addressed.

$$8 \times \text{index} = \text{offset}$$

A[3]

$$\Rightarrow 8 \times 3$$

$$= 24 + 0000$$

$$= 24$$

$$A[\text{index}] = 8 * \text{index}$$

$$\text{Actual Address} = [\text{Base Add} + \text{offset}]$$

Memory operand.

words must start at address multiple of 4

double    "        "        "        "        "        "        "        8.

32 bit  $\longrightarrow$  #0004 ✓

#0003 ✗

64 bit  $\longrightarrow$  #0008 ✓

#0009 ✗

Risc-V  $\rightarrow$  No alignment restriction

Risc-V is little endian

The order in which memory stores data.

Big endian



why little endian

Load, store Instruction syntax.

load → load data from mem to reg

store → store data from mem to mem from reg



5:50

Array  $\rightarrow$  100 double words

ld  $x_5, 64[x_{22}]$   $\xrightarrow{\text{temp}}$   $\xleftarrow{\text{off}}$   $\xleftarrow{\text{Base reg}}$   $j = x_{20}$   
 $h = x_{21}$

add  $x_{20}, x_{21}, x_5$

Base of A =  $x_{22}$

$$j = h + \underline{A[8]}$$

memory  
location

$$\begin{aligned} \text{offset} &= 8 \times 8 \\ &= 64. \end{aligned}$$

Assume var h is associated with  $x_{21}$

base address of array A is  $x_{22}$ .

RISC-V assembly code for,

$$A[12] = h + A[8]$$

$$12 \times 8 = 96.$$

ld  $x_5, 64[x_{22}]$

add  $x_5, x_{21}, x_5$

sd  $x_5, 96[x_{22}]$

## Immediate Operation.

→ Avoids a load instruction (faster).

Addi  $x_{22}, x_{22}, 1$ .

Compiler cannot add/sub 2 integ.

Addi  $x_{22}, x_1, 2 \otimes$

$$x_{22} = x_{22} - 5$$

→ Addi  $x_{22}, x_{22}, -5$ .

## Immediate ope. 2.

→ avoids load instructions.

addi  $x_{22}, x_{21}, 5$ .

## Sign extension

LB → sign extension.

LBU → load byte (unsigned extension)

if signed:

replicate the sign bits to the left ( $\leftarrow$ )

else:

extend 0s to the left ( $\leftarrow$ )

$$(2)_{10} = (10)_2$$

#  $(2)_{10}$  in unsigned 8 bit.

$$(0000\ 0010)_2$$

0000 0010  
|  
|  
|  
|

LBU  $\times_{2^2}, 10 [x_{2^1}]$

24 zero.  
[ (0000 0010)

$$(+2)_{10} = (010)_2$$

$\Rightarrow \underbrace{0000\ 0010}$  (Copied sign bit)

$(-2)_{10} \Rightarrow$  2's complement of +2.

$$+2 \Rightarrow 0000\ 0010.$$

$$\begin{array}{r} \text{1's C-} 1111\ 1001 \\ +1 \quad (2\text{'s C}) \\ \hline \end{array}$$

$$\underline{(-2)_{10} = (1111\ 1110)_2}$$

$$\begin{aligned} \text{unsigned} &= 0 + 2^{n-1} - 1 \\ \text{signed} &= -2^{n-1} + 2^{n-1} - 1 \quad [2\text{'s comp}] \end{aligned}$$

left  
→ desire to keep ALL instructions "length same"  
vs "format same"

Risc V → Machine code.

Higher  
↓ Compiler

# Risc V → instructions take ex. 32 bit. Assembly

↓ Assembler  
Machine  
Language

# The layout of ins is called  
"Instruction format"



divide 32 bits into fields.

length.  
vs  
format

|      |      |              |             |
|------|------|--------------|-------------|
| Add  | $rd$ | $rs_1$       | $rs_2$      |
| Addi | $d$  | $x_{20}$     | $x_{21}$    |
| ld   | $s$  | $x_{20}$     | $s$         |
| sd   | $rs$ | $16[x_{21}]$ | $imm$       |
|      |      | $16[x_{21}]$ | Same Format |
|      |      | off.         | rs.         |

Risc V chooses to keep the length same. But different format.

# R-type instructions

Instruction formats exactly

- R ⇒ 9ns that use 3 registers (Add, sub, sll, xor, OR, AND)
- I ⇒
- S

1 → rd<sub>dest</sub>  
2 → rs<sub>src</sub>

## R-type

| func7 | rs2 | rs1 | func3 | rd | opcode       |
|-------|-----|-----|-------|----|--------------|
| 7b    | 5b  | 5b  | 3b    | 5b | 7b = 32 bits |

opcode = Denote the format of an instruction

opcode, func3, func7 ⇒ R-type ADD  
R-type ADD instruction  
(R-type ADD).

rd = destination.

rs1 = source1

rs2 = source2

Convert to Hex (Boundary ong) Add

|      |      |      |      |   |
|------|------|------|------|---|
| 0    | 1    | 4    | 5    | A |
| 0000 | 0001 | 0101 | 1010 |   |

|      |      |      |      |   |
|------|------|------|------|---|
| 0    | A    | 4    | 6    | A |
| 0000 | 1010 | 0110 | 1010 |   |

Hex: 015A0A6A

|                   |                   |                   |         |             |
|-------------------|-------------------|-------------------|---------|-------------|
| (21) <sub>2</sub> | (20) <sub>2</sub> | (20) <sub>2</sub> | 1101010 |             |
| 0000              | 10101             | 10100             | 000     | 10100 given |
| 000               |                   |                   |         |             |

func7      rs2      rs1      fun3      rd      OP.

X<sub>20</sub>   X<sub>20</sub>   X<sub>21</sub>  
rd      rs1      rs2

reg number  
not content

## Type

- 1 immediate value (offset)
- must be two reg involved (rd, rs)

| imm. | rs1 | fun3 | rd | opcode |
|------|-----|------|----|--------|
| 12b  | 5b  | 3b   | 5b | 7b     |

Immediate = constant / offset (2s comp)

rs1 = load source / Base Reg number



Why is rs/rd = 5 bit?

→ Because we just put the register numbers.  
we have 32 registers. So, we need 5 bits ( $2^5$ )

[ rs load or func3 → width + signedness ]

তত সংখ্যাকে  
তুর

sign status  
(1b/1bu)



16 8 4 2 1  
1 0 1 P 0

## S-type ins. 755357

How to distinguish.

→ store instruction (sd)

|             |   |      |
|-------------|---|------|
| SB → byte   | + | lost |
| SH → Half   |   |      |
| SW → word   |   |      |
| SD → double |   |      |

কোন মান (signed) Data?

signed কির unsigned ?

| imm[11:5] | rs2 | rs1 | func3 | imm[4:0] | opcode |
|-----------|-----|-----|-------|----------|--------|
| 7b        | 5b  | 5b  | 3b    | 5b       | 7b     |

↙      ↘  
Source      Base.

Why is imm val. divided in 7 and 5?

To make the instruction formats more common. Like with R-type and I-type. This is totally match with R-type.



|           |       |       |        |          |              |
|-----------|-------|-------|--------|----------|--------------|
| 0000 000  | 00101 | 10100 | xxx    | 10000    | xxxxx<br>xxx |
| 7(im11:5) | 5 rs2 | 5 rs1 | 3 func | 5(im4:0) | 7 (op)       |

## Shift operation.



$101\ 11100 \rightarrow$  Right shift  $\rightarrow 0101\ 1110$ .

|      |                                      |
|------|--------------------------------------|
| Rig  | left shift                           |
| SRR  | $S11 \rightarrow$ left shift logical |
| SRR1 | $S11i$ rd rs1 immediate.             |

I-type instruction.  $S11i \times_{19}, \times_{20}, 4$

$\hookrightarrow \times_{20}$  value  $\Rightarrow 4$  and  
left shift  $\Rightarrow \times_{19} =$

what'll happen if we shift 64 bits?

$\rightarrow$  সবুজ 0 হচ্ছে যাই। So, meaning them.

so Cap is **1-63**



### Right shift

2<sup>9</sup> fibo 1st.

$$\begin{aligned} 128 & 64 & 32 & 16 & 8 & 4 & 2 & 1 \\ 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ \downarrow & \downarrow \\ 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0 \end{aligned} \Rightarrow 48$$

$$\begin{aligned} 0 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\ \downarrow & \downarrow \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{aligned} \Rightarrow 24, \text{ srri } 1$$

$$\begin{aligned} 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ \downarrow & \downarrow \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \end{aligned} \Rightarrow 12, \text{ srri } 2$$

if we right shift odd number, we take the floor value.  $5 // 2^1 \Rightarrow 2$

|            |     |                                                       |      |    |        |
|------------|-----|-------------------------------------------------------|------|----|--------|
| 12 bit imm | 5   | 3                                                     | 5    | 7  | I-type |
| Fun6       | imm | rs1                                                   | fun3 | rd | op     |
| 6          | 6   | $\hookrightarrow 63. \text{ or } 6$ bit<br>$2^6 = 64$ |      |    | shift. |

$\Rightarrow (12)_{10}$

$\Rightarrow (24)_{10}, \text{ SLLI. } (12 \times 2^1)$

$\Rightarrow (48)_{10}, \text{ SLLI. } (12 \times 2^2)$

SLLI performs multiplication using  $2^1 \Rightarrow$  যে ক্ষয়ক্ষতি

shift হবে

what about  $\downarrow 12 \times 5$ ?  
 $(12 \times 2^2) + 12$

$$12 \times 7$$

$$\hookrightarrow (12 \times 2^3) - 12$$

Right shift → select some bits and clear others to zero.

something and 0  $\Rightarrow$  0.

something and 1  $\Rightarrow$  something.

and  $x_9, x_{10}, x_{11}$ . (R-type instruction)  
 rd rs1 rs2

যে প্রক্রিয়া bit গুলি intact রাখতে হবে তার

1.  $f(x)$  and করো. (7-4) (2-1).

$x_{10} = 000 \dots 0000 \quad 1100 \quad 1100$

$x_{11} = 000 \dots 0000 \quad 1111 - 0110 \xrightarrow{\text{Mask}}$

$x_9 = 000 \dots 0000 \quad 1100 \quad 0100$

Result  
 1100

and

② ~~f1~~ b rs1, rs2, L1

bge rs1, rs2, L1

## OR & XOR

OR → Incluse bit.

on  $X_9, X_{10}, X_{11}$ .

| A | B | $A+B$ |
|---|---|-------|
| 0 | 0 | 0     |
| 1 | 0 | 1     |
| 0 | 1 | 1     |
| 1 | 1 | 1     |

11 some. or 0 = som.  
some or 1 = 1  
~~265~~

$X_{10} = 0000 \dots 0000 \quad 1100 \quad 0000$

$X_{11} = 0000 \dots 0000 \quad 0011 \quad 0000$

$X_9 = 0000 \dots 0000 \quad 1111 \quad 0000$

XOR → Can work as Buffer / Not

XOR  $X_9, X_{10}, X_{11}$ .

in | out

| in | out |
|----|-----|
| 0  | 0   |
| 1  | 1   |

| A | B | $A \oplus B$ |
|---|---|--------------|
| 0 | 0 | 0            |
| 1 | 0 | 1            |

Something XOR 0 → copy (Buffer)

| A | B | $A \oplus B$ |
|---|---|--------------|
| 1 | 0 | 1            |

Something XOR 1 → Not

| A | B | $A \oplus B$ |
|---|---|--------------|
| 1 | 1 | 0            |

## Practice

$$A[3] = A[5] + 1$$

~~addi 24[x<sub>21</sub>], 40[x<sub>21</sub>], 1~~

Base  $\rightarrow x_{21}$

$$A[0] = x_{21}$$

A is double word array.

ld  $x_5, 40[x_{21}]$

addi  $x_5, x_5, 1$

sd  $x_5, 24[x_{21}]$

if  $A[9]$  variable is associated with  $x_{25}$

slli  $x_5, x_{25}, 3$  ① index multiplied with num of memory cells.

(ld  $x_6, \cancel{x_5[x_{21}]} \rightarrow x_{25} \times 2^3$ )

\* add  $x_6, x_{21}, x_5$  now offset

ld  $x_7, 0[x_6]$  (Base + offset)

$x_6$  is the actual address

Temp  $\rightarrow 5, 6, 7$

28, 29, 30, 31

## Decision making (if-else)

$\text{if } (i == j) :$   
    st - 1

else  
    st - 2



$$8 \times 4 = 32 \text{ bit}$$

assembler will always run sequentially

Risc v includes two de. making

① beg branch if equal.  
Cond. branching | beg, rs1, rs2, L1  
|  
| equal ? (Level 1) → jump  
| to L1

② bne (branch. not equal)

bne rs1, rs2, L1 (same as other)

# Cond. jumps

|    | Inst. | Syntax            | Operations |
|----|-------|-------------------|------------|
| =  | beq   | beq, rs1, rs2, L1 | rs1 == rs2 |
| != | bne   | bne, rs1, rs2, L2 | rs1 != rs2 |
| <  | blt   | blt, rs1, rs2, L3 | rs1 < rs2  |
| >= | bge   | bge, rs1, rs2, L4 | rs1 >= rs2 |

branch less than

branch greater than equal to.

## example

if ( $i == j$ ):

$f = g + h$

else:

$f = g - h$ .



অবশ্যই bne করার পেছনে করা।

bne  $X_{22}, X_{23}$ , Else

add  $X_{19}, X_{20}, X_{21}$ .

beq  $X_0, X_0$ , exit

Constructed a unconditional Jump.

Else:

sub  $X_{19}, X_{20}, X_{21}$

Exit:

$f = X_{19}$

$g = X_{20}$

$h = X_{21}$

$i = X_{22}$

$j = X_{23}$

my bne?

if (i == j):

f = gth.

beg X<sub>22</sub>, X<sub>23</sub>, IF

beg X<sub>0</sub>, X<sub>0</sub>, Exit

IF:

add X<sub>19</sub>, X<sub>20</sub>, X<sub>21</sub>

Exit:

bne X<sub>22</sub>, X<sub>23</sub>, Exit

odd X<sub>19</sub>, X<sub>20</sub>, X<sub>21</sub>

Exit:

## Procedure calling

let's assume procedure like a spy.

Execution process of spy:

- (I) leaves with a secret plan.
- (II) acquires resources (night vision)
- (III) performs the task (news নিয়ে আনবে)
- (IV) covers traces (যুক্ত হবে)
- (V) return to the point of origin with desired answer.

def calc (a, b, c):      Parameter.

d = calc (1, b2, 3)

↑ Arg.

Ex of procedure

- (I) Put param. where procedure can access (X<sub>10</sub> - X<sub>17</sub>)
- (II) Transfer cont to procedure
- (III) resource for proc.
- (IV) Perf the desired task
- (V) result value in a place so that the call program can access
- (VI) Return control to the point of origin.



## ② JAL (Jump and Link instruction)

JAL,  $X_1$ , procedure label.

↑ Fixed.

(কোথায় back  
করব প্রস্তা  
বন্ধ করবেন  
return address)



PC (Program Counter)

↳ Current instruction  
at memory location.

PC = 0. ← First instruction  
running.

PC = 4 ← second "

PC = 8 ← third "

PC = 12 ← fourth "



Machine  
Code  
32 bit  
 $\frac{32}{8} = 4$

JAL  $X_1$  Label.

PC = #0004 ↳ when executing

→ PC + 4 = 0008 (কাব্য 0004 executed.)  
পরের line এ চলো

JALR = Jump and link register. → function to  
main method.

JALR,  $X_0$ ,  $O[X_1]$  → আমার স্থান হো  
(Hard) ↳  $\uparrow$  PC + 4 → আমার স্থান হো  
distance

why  $X_0 \rightarrow$  function call (যেখানে main আছেন যাই,  
so কোন লাইনে  $\uparrow$  PC + 4, এবং দুর্দশ নাই).

Caller  $\Rightarrow$  main() method

Callee  $\Rightarrow$ অন্য function (calc).

JAL  $X_0$ , Label  $\leftarrow$  unconditional Jump.

# main():  $\Rightarrow X_9, X_{26}$ ,  
store  $X_9$  to main.

Calc():  $\Rightarrow X_9$  আসছে, এখন কি হতো?  
 $X_9 = 10$ .

main method ত্বরণ reg we কষে, এখন callee reg যদি  
callee we কষে, তাহলে callee তে যেই register এর  
value আপনা ফিল্ডের তাত্ত্বিক মান সংযোগ পাইবে.

↳ store করা stack এ.

Each reg 64 bit.  
= 8 location.

SP  $\Rightarrow 24$   
( $X_2$ )

$\therefore 3 \text{ reg} = 24 \text{ location.}$

$X_2$  = stack pointer.

we want to store 3 register

① decrease the stack pointer by the  
# locations you need.

e.g. addi  $X_2, X_2, -24$

② Then store the values in the stack.



Formatting a 32 bit immediate using LUI  
16 bits.

$X_{19} = \underline{1111} \quad \underline{1111} \quad \underline{1111} \quad \underline{000}$ . → Convert to decimal.

Addi  $X_{19}, X_0, 65520$ .

→ itype instruction. → immediate = 12 bit.

Find ekhane 16 bit immediate ~~JNEZI~~

what do we do?

Our savior **LUI** → Load upper immediate.  
LUI (rd, Constant) → Copies 20 bit data into rd's [31:12], ①  
→ Copy whatever you have in bit 31 to bits [63:32]  
→ Copy 0s from 11th to zeroth bit

load this 64 bit value into  $X_{19}$

0000 0000 0000 0000 0000 0000 0000 0000

0000 0000 0011 1101 0000 0101 0000 0000

31 → 976 → 12      1280

lui  $X_{19}, 976$ .

$X_{19} = \underline{0} \quad \underline{0} \quad \boxed{\text{Copy}} \quad \underline{0} \quad \underline{31} \quad \underline{12} \quad \underline{0} \quad \underline{11} \quad \underline{0} \quad \underline{0}$

Addi  $X_{19}, X_{19}, 1280$

LUI → Utype instruction

| Immediate | rd   | opcode |
|-----------|------|--------|
| 20bit     | 5bit | 7bit   |

SB-type.

SB type → beq, bne, blt, bge.

↳ Conditional jump.

if  $a == j$ , if  $a != b$ .

| 1b          | 6b      | 5b  | 5b  | 3bit  | 7                                       |
|-------------|---------|-----|-----|-------|-----------------------------------------|
| imm<br>[12] | im[0:5] | rs2 | rs1 | func3 | imm<br>[4:1]<br>imm<br>[11]      opcode |

beq, rs1, rs2, \_\_\_\_\_  
label.

\*

\* Rep. branch  
address in  
multiples of 2.

\* forward and  
one  $x_9, x_{24}$  exit backward moving  
exit  
\* unusual mapping  
of imm. design

|      |   |       |                            |  |
|------|---|-------|----------------------------|--|
| 8000 | 1 | SLti  | $X_{10}, X_{22}, 3$        |  |
| 8004 | 2 | add   | $X_{10}, X_{10}, X_{25}$   |  |
| 8008 | 3 | ld    | $X_9, 0[X_{10}]$           |  |
| 8012 | 4 | bne   | $X_9, X_{24}, \text{Exit}$ |  |
| 8016 | 5 | addi  | $X_{22}, X_{22}, -1$       |  |
| 8020 | 6 | beq   | $X_0, X_0, \text{loop}$    |  |
| 8024 | 7 | Exit: |                            |  |

Line 4 to line 7

PC relative addressing  $\rightarrow$  Base (2nd byte) + offset

4 (2nd exit e jump address)

so, line 4 to line 7. (3rd line)

যদি নিচে থেকে হিসাব যাব

loop.

$$5 \times 4 = 20 \rightarrow \text{নিচে } 2^{\text{nd}}$$

sign bit. & upore so, -20

0000 0 0001 0100.

0 0000 0001 0100 = +20

+1 1111 1110 1011 1's.

+1

1 1111 1110 1100 2's Compliment (-20)

0000 0000 1100.  
Positive (বাস্তু ফলোর)

0 0000 0000 1100 → Discard  
to make it multiple  
of 2.

12 11 10:5 4:1

→ box (215 immediate)

① Form the 12 bit number.

(12) (11) (10:5) (4:0)

② If negative do 2's complement again.

③ PC + (imm × 2)

UJ type instruction → JAL  
→ offset 32 jump.

|             |               |             |                |    |        |
|-------------|---------------|-------------|----------------|----|--------|
| 1           | 9             | 1           | 8              | 5  | 7      |
| imm<br>[20] | imm<br>[10:1] | imm<br>[11] | imm<br>[19:12] | rd | opcode |

Jal x0, 2000 → 20 bit immediate.