

USE YOUR IMAGINATION™

# Blue Book

EXAMINATION BOOK

BOX NO. \_\_\_\_\_

NAME RAMAN ICR JNA  
SUBJECT CSA  
CLASS N13866145  
SECTION A  
INSTRUCTOR AZEEZ B  
DATE 13/13/2025

11" x 8.5" 8 LEAVES 16 PAGES

- ① The RISC-V assembly program to add the contents of 2 registers  $x_3$ , and  $x_4$ .  
In 2's complement arithmetic, an overflow happens in basically two conditions—
- i) Adding two positive numbers yields a negative number.
  - ii) Adding two negative numbers yields a positive number.

overflow cannot occur if we add a positive and a negative number in the addition operation.

RISC-V being a reduced instruction set does not have a special overflow flag to check in the addition operation.

In this case, we must manually check this overflow using a sequence of instructions while performing the add operation in the RISC-V processor.

## Program —

This program explicitly follows the logic if the operand have same sign, check if the result have different sign

## # Registers used —

$X_3$ : first operand

$X_4$ : second operand

$X_5, X_6$ : stores the result ( $X_3 + X_4$ )

$X_6, X_7$ : Temporary registers

$X_{10}$ : overflow flag ( $1 = \text{overflow}$ ,  $0 = \text{No overflow}$ )

## # Start

add  $X_5, X_3, X_4$

# Perform the addition

XOR  $X_6, X_3, X_4$

# check if the operand have different signs.

blt  $X_6, \text{Zero}, \text{NO\_overflow}$  # if  $(X_3 \text{ } \Delta \text{ } X_4) < 0$   
signs are different. branch.

XOR  $X_7, X_3, X_5$

# check if the sum and one of the operand have different sign.

blt  $X_7, \text{Zero}, \text{overflow}$  # if  $(X_3 \text{ } \Delta \text{ } X_5) < 0$

NO\_overflow signs are different, overflow!

addi  $X_{10}, \text{Zero}, 0$  # Set overflow flag to 0.

overflow:

addi \$10, zero, 1 # set overflow flag to 1.

Done,

X5: It contains the sum which will be incorrect if the overflow occurred.

X10: contains the status of the overflow.

(2) This question explores the byte-addressable and the concept of endianness of a memory

Key information

\* word size: 32 bits: 4 bytes

\* Memory: byte addressable

\* Word-to-byte-mapping: 0th word is at address 0, and first word is at address 4.

\* Pattern: The byte address of an nth word is at address  $n \times 4$ .

- (2) (a) The pattern address =  $n \times 4$   
 $\rightarrow n = 44$  (since we start counting from 0,  
the 44th word is at index 44)  
 $\rightarrow$  Byte address =  $44 \times 4 = 176$   
in hexadecimal,  $176_{10} = B0_{16}$

Ans: The byte address of the 44th word is 176 (or 0XB0)

- (b) A 32-bit (4 byte) word starting at address 176 will occupy that address and the next three consecutive byte address.
- $\rightarrow$  1st byte: 176 address  
 $\rightarrow$  2nd byte: 177 address  
 $\rightarrow$  3rd byte: 178 address  
 $\rightarrow$  4th byte: 179 address

Ans: The byte addresses 176, 177, 178, 179 (or 0XB0, 0XB1, 0XB2, 0XB3)

② (c) Number: 0x FF 22 33 44

→ Bytes (from most to least significant)

→ Most significant bytes (MSB): FF

Byte 2: 22

Byte 3: 33

Least significant bytes (LSB) = 44

Memory addresses: 176, 177, 178, 179

Big-Endian

| Memory address        | Data byte |
|-----------------------|-----------|
| 176 (lowest address)  | FF (MSB)  |
| 177                   | 22        |
| 178                   | 33        |
| 179 (highest address) | 44 (LSB)  |

Little-Endian

| Memory address        | Data byte |
|-----------------------|-----------|
| 176 (lowest address)  | 44 (LSB)  |
| 177                   | 33        |
| 178                   | 22        |
| 179 (highest address) | FF (MSB)  |

③ The program provided in the question, is a simple loop that calculates the sum of the number from 0 to 9.  
→ The loop body executes 10 times (for  $i = 0, 1, \dots, 9$ )

Assumptions  
④ We use the standard cycle count for a multicycle datapath.

- R-type (add) : 4 cycles
- I-type (addi) : 4 cycles
- load (lw) : 5 cycles
- store (sw) : 4 cycles
- Branch (beq) : 3 cycles (whether taken or not taken)
- jump (j) : 3 cycles

Cycle calculation:

① Setup (outside loop)

- addi \$0, zero, 0 : 4 cycles
- addi \$1, zero, 0 : 4 cycles
- addi \$B, zero, 10 : 4 cycles

Setup total = 12 cycles

(b) Loop (10 iterations, from  $i=0$  to  $9$ )

- beg (Not taken) : 3 cycles
- add : 4 cycles
- add : 4 cycles
- add : 3 cycles
- Cycles per iteration: 14 cycles
- Loop total :  $14 \times 10 = 140$  cycles

(c) Final loop check ( $i=10$ )

- beg : 3 cycles

The loop then exits to L2

$$\boxed{\text{Total cycles} = 12 + 140 + 3 = 155 \text{ cycles}}$$

CPI calculation

⇒ total instructions

setup : 3

$$\text{Loop} \Rightarrow 10 \times 4 = 40$$

Final check : 1

$$\boxed{\text{total instructions} = 3 + 40 + 1 = 44}$$

$$\boxed{\text{CPI} = \frac{\text{Total cycles}}{\text{Total instructions}} = \frac{155}{44} = 3.523}$$

③(b)

## Pipelined RISC-V processor

### Stall calculation

$$\text{Total cycles} = \text{Total instructions} + \text{Total stall cycles} + (\text{Pipeline latency} - 1)$$

(a) Total instruction, as calculated before,  $I = 44$

(b) Total stall cycles  
Control hazard(jumps)  
 $= 10 \times 1 = 10$  stalls

Control hazard (branches)  
 $= 10 \times 1$  (10 times not taken,  
1 time taken)

Data hazard = 0

(These are all resolved

by standard forwarding.)

Total cycle stalls  $= 10 + 1 = 11$

(c) Pipeline latency  $= 5 - 1 = 4$

$$\textcircled{3} \textcircled{b} \quad \text{Total cycles} = 44 + 11 + 4 = 59$$

$$\text{CPI Calculation} = \frac{\text{Total cycles}}{\text{Total Instructions}}$$

$$\frac{59}{44} = 1.341$$

\textcircled{4} In this question, we must map the instructions to their pipeline stages for each cycle.

The key here is to identify the stalls first. The pipeline's state without stalls, which is what exists during cycle 5.

clock cycle (s1 + 1) 2 3 4 5

addi \$1, \$2, \$1 IF ID EX MEM - WB

addi \$0, \$1, \$0 IF ID EX MEM

lw \$3, (\$1 \* 4) IF ID EX

sw \$3, (\$2 \* 4) IF ID

xor \$2 ... IF

Based on the diagram, here is the activity in cycle 5:

→ addi \$1, zero, \$2 (WB stage)  
writing: The value \$2 to register \$1.

→ addi \$0, \$1, -4 (MEM stage)  
This is an ALU instruction, it is idle during the MEM stage. No register read or writes.

→ lw \$3, 16(\$0) (Ex stage):  
Reading: The value of register \$0, to calculate the memory address ( $$0 + 16$ )

→ sw \$3, 20(\$0) (ID stage):  
Reading: The value of register \$0 (for the base address) and register \$3 (the data to be stored)

→ xor \$2, \$0, \$3 (IF stage)  
The information is being fetched from the memory.  
No register reads or writes.

Summary for cycle 5:

Registers being written: S1

Registers being READ: S0 (read by two different instructions) and S3

(5)

### Hazard Analysis

1. Addi S0, S1, -4 (reads S1 from addi S1)

→ addi S1 (EX) → addi S0 (ID)

→ Handled by EX → EX forwarding,  
No stall.

2. lw S3, 16(S0) (reads S0 from addi S0):

→ addi S0 (EX) → lw S3 (ID)

→ Handled by EX → EX forwarding,  
No stall.

3. sw S3, 20(S0) (reads S3 from lw S3)

→ lw S3 (MEM) → sw S3 (ID)

→ This is a load-use hazard. The

sw instruction reads S3 in its ID stage.

But the lw instruction only has the data available after its MEM stage.

→ The pipeline must stall for 1 cycle.

The sw instruction waits

→ In cycle 7,  $\$w$  (in D) gets  $S3$  value  
from the lw's MEM/WB register,  
→ EX forwarding path.

4.  $xor\ S2, \$0, S3$  (reads  $S3$  from  $lw$  S3):  
→ This instruction is stalled by the  
sw in front of it.  
→ When it reaches its ID stage, the  
lw instruction now has already  
completed its WB stage (cycle 7).  
→ The value of  $S3$  is read normally from  
the register file. no stall.

5.  $or\ S2, S2, S3$  (reads  $S2$  from  
 $xor\ S2$ )  
→  $xor\ S2$  (EX) →  $or\ S2$  (ID)  
→ Handled by EX → EX forwarding  
. No stall.

### Pipeline diagram

| Instruction            | 1  | 2  | 3  | 4   | 5  |
|------------------------|----|----|----|-----|----|
| 1. addi \$1, zero, \$2 | IF | ID | EX | MEM | WB |
| 2. addi \$0, \$1, -4   | IF | ID | EX | MEM |    |
| 3. lw \$3, 16(\$0)     |    |    | IF | ID  | EX |
| 4. sw \$3, 20(\$0)     |    |    |    | IF  | ID |
| 5. xor \$2, \$0, \$3   |    |    |    |     | IF |
| 6. or \$2, \$2, \$3    |    |    |    |     |    |