

1. (a) · R-type / I-type :  $30 + 250 + 150 + 25 + 200 + 25 + 20 = 700 \text{ (ps)}$

- lw :  $30 + 250 + 150 + 25 + 200 + 250 + 25 + 20 = 950 \text{ (ps)}$ (longest path, so it sets the clock cycle)
- sw :  $30 + 250 + 150 + 200 + 25 + 250 = 905 \text{ (ps)}$
- beq :  $30 + 250 + 150 + 25 + 200 + 5 + 25 + 20 = 705 \text{ (ps)}$

↳ With the latency of the register file increased from 150 ps to 160 ps, every path grows by 10 ps:

- R/I-type : 710 ps
- lw : 960 ps (still the longest path)
- sw : 915 ps
- beq : 715 ps
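The path sums above can be re-checked with a short script. This is only a sketch: the terms are copied from the sums in the order they appear, the 150 ps term is treated as the register file (the only term the solution names), and the function and key names are illustrative.

```python
# Latency terms (ps) along each path, copied from the sums above.
# Only the register-file term is parameterised; the rest stay as plain numbers.
def path_latencies(reg_file_ps):
    return {
        "R/I-type": [30, 250, reg_file_ps, 25, 200, 25, 20],
        "lw": [30, 250, reg_file_ps, 25, 200, 250, 25, 20],
        "sw": [30, 250, reg_file_ps, 200, 25, 250],
        "branch": [30, 250, reg_file_ps, 25, 200, 5, 25, 20],
    }


def totals(reg_file_ps):
    """Sum the terms of every path for a given register-file latency."""
    return {name: sum(parts) for name, parts in path_latencies(reg_file_ps).items()}


before = totals(150)  # original register file
after = totals(160)   # slower register file

# The clock cycle is set by the slowest path.
cycle_before = max(before.values())
cycle_after = max(after.values())

print(before, cycle_before)
print(after, cycle_after)
```

In both cases the load path is the slowest one, so it determines the cycle time.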

$$12\% \times (25\% + 11\%) = 4.32\%$$

$$960 \times (100\% - 4.32\%) = 918.528$$

$\therefore 950 \text{ ps} \rightarrow 918.528 \text{ ps} : 950 / 918.528 \approx 1.034 \text{ times faster.}$
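The same speedup arithmetic, written out as a small sketch. It assumes, as the calculation above does, that execution time per original instruction scales with cycle time multiplied by the fraction of instructions that remain; the variable names are illustrative.

```python
old_cycle_ps = 950   # cycle time with the original register file
new_cycle_ps = 960   # cycle time with the slower register file

load_store_fraction = 0.25 + 0.11       # loads plus stores in the instruction mix
reduction = 0.12 * load_store_fraction  # 12% of those instructions are removed

old_time = old_cycle_ps                     # time per original instruction
new_time = new_cycle_ps * (1.0 - reduction) # fewer instructions, longer cycle
speedup = old_time / new_time

print(f"reduction = {reduction:.4f}")
print(f"new time  = {new_time:.3f} ps")
print(f"speedup   = {speedup:.3f}")
```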

1. (b)  $1000 + 200 + 4 \times 10 + 100 + 2 \times 30 + 2000 + 5 + 100 + 2 \times 1 + 2 \times 500$   
 $= 4507$

↳ 4707 ( $\because \text{cost of the register file } 200 \rightarrow 400$ )  
 $\therefore 4507 \rightarrow 4707 : 4707 / 4507 \approx 1.044 \text{ times more expensive.}$
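A minimal sketch of the cost comparison, using the terms of the sum above in order. The 200 term is taken to be the register file, since that is the only cost the solution says changes; `total_cost` is a name introduced here for illustration.

```python
def total_cost(reg_file_cost):
    """Sum the component costs, with the register-file cost as a parameter."""
    parts = [1000, reg_file_cost, 4 * 10, 100, 2 * 30, 2000, 5, 100, 2 * 1, 2 * 500]
    return sum(parts)


cost_before = total_cost(200)
cost_after = total_cost(400)
cost_ratio = cost_after / cost_before

print(cost_before, cost_after, round(cost_ratio, 3))
```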

1. (c) Cost and speed both increase, so whether the change is worthwhile depends on whether cost or speed matters more in the given situation.

2. (a)

| Instr. | Operands         | 1  | 2  | 3  | 4   | 5   | 6   | 7   | 8   | 9   | 10  | 11  | 12 |
|--------|------------------|----|----|----|-----|-----|-----|-----|-----|-----|-----|-----|----|
| sd     | \$s5, 12(\$s3)   | IF | ID | EX | MEM | WB  |     |     |     |     |     |     |    |
| ld     | \$s5, 8(\$s3)    |    | IF | ID | EX  | MEM | WB  |     |     |     |     |     |    |
| sub    | \$s4, \$s2, \$s1 |    |    | IF | ID  | EX  | MEM | WB  |     |     |     |     |    |
| beqz   | \$s4, label      |    |    |    | st  | st  | IF  | ID  | EX  | MEM | WB  |     |    |
| add    | \$s2, \$s0, \$s1 |    |    |    |     | st  | st  | IF  | ID  | EX  | MEM | WB  |    |
| sub    | \$s2, \$s6, \$s1 |    |    |    |     |     | st  | st  | IF  | ID  | EX  | MEM | WB |

Here `st` marks a stall cycle.
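The stall pattern can be reproduced with a small simulation. This is a sketch under stated assumptions: a five-stage pipeline with a single memory shared by instruction fetch and data access, MEM occurring three cycles after IF, and no other hazards. `fetch_cycles` is a hypothetical helper, not part of the exercise.

```python
def fetch_cycles(uses_data_mem):
    """Return the IF cycle of each instruction when IF and MEM share one memory.

    uses_data_mem[i] is True if instruction i accesses data memory in MEM.
    """
    fetch = []
    busy = set()  # cycles in which the memory is taken by a data access
    cycle = 1
    for uses in uses_data_mem:
        # Data accesses of older instructions win; the fetch waits.
        while cycle in busy:
            cycle += 1
        fetch.append(cycle)
        if uses:
            busy.add(cycle + 3)  # MEM is three stages after IF
        cycle += 1
    return fetch


# sd, ld, sub, beqz, add, sub: only the first two access data memory.
schedule = fetch_cycles([True, True, False, False, False, False])
last_wb = schedule[-1] + 4  # WB is four stages after IF
print(schedule, last_wb)
```

In this sketch the fourth instruction cannot be fetched in cycles 4 and 5, because the store and the load occupy the memory in those cycles, so it is fetched in cycle 6 and the last instruction finishes in cycle 12.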

2. (b)

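No. Every instruction must be fetched, so every data access still collides with some instruction fetch and causes a stall. Reordering only changes which instruction is delayed.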

2. (c)

No. NOPs must be fetched from instruction memory.

2. (d)

$$25\% + 11\% = 36\%$$

### 3.(a) (b)

| Instr. | Operands         | Pipeline stages (St = stall) |
|--------|------------------|------------------------------|
| ld     | \$s0, 0(\$s3)    | IF ID EX MEM WB              |
| ld     | \$s1, 8(\$s3)    | IF ID EX MEM WB              |
| add    | \$s2, \$s0, \$s1 | IF ID St EX MEM WB           |
| addi   | \$s3, \$s3, -16  | IF St ID EX MEM WB           |
| bnez   | \$s2, LOOP       | St IF ID EX MEM WB           |
| ld     | \$s0, 0(\$s3)    | IF ID EX MEM WB              |
| ld     | \$s1, 8(\$s3)    | IF ID EX MEM WB              |
| add    | \$s2, \$s0, \$s1 | IF ID St EX MEM WB           |
| addi   | \$s3, \$s3, -16  | IF St ID EX MEM WB           |
| bnez   | \$s2, LOOP       | St IF ID EX MEM WB           |

There are no cycles in which all five stages do useful work.
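The single stall per iteration comes from the `add` that immediately follows the load of one of its operands. A minimal sketch of that check, assuming full forwarding so that only a load followed directly by a consumer needs a stall; the tuple encoding and `load_use_stalls` are illustrative choices.

```python
def load_use_stalls(program):
    """Count instructions that read a register loaded by the instruction just before."""
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_srcs = cur
        if prev_op == "ld" and prev_dest in cur_srcs:
            stalls += 1
    return stalls


# (opcode, destination register, source registers) for one loop iteration
loop_body = [
    ("ld", "$s0", ("$s3",)),
    ("ld", "$s1", ("$s3",)),
    ("add", "$s2", ("$s0", "$s1")),
    ("addi", "$s3", ("$s3",)),
    ("bnez", None, ("$s2",)),
]

stalls = load_use_stalls(loop_body)
cycles_per_iteration = len(loop_body) + stalls
print(stalls, cycles_per_iteration)
```

With one stall, each iteration of five instructions occupies six issue slots in this model.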

### 4.(a)

```

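add $s3, $s1, $s0
nop                  # two NOPs: the next lw reads $s3, written by add
nop
lw $s2, 4($s3)
lw $s1, 0($s4)
nop                  # one NOP: or reads $s2, written by the first lw
or $s2, $s3, $s2
nop                  # two NOPs: sw reads $s2, written by or
nop
sw $s2, 0($s3)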

```
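The NOP placement can be derived mechanically. The sketch below assumes no forwarding and that a consumer must sit at least three instruction slots after its producer (that is, the register file is written in the first half of a cycle and read in the second). `insert_nops` and the tuple encoding are hypothetical helpers introduced for illustration.

```python
def insert_nops(program, min_gap=3):
    """Pad a program with NOPs so every consumer is min_gap slots after its producer."""
    out = []
    last_write = {}  # register -> position in `out` of its most recent writer
    for op, dest, srcs in program:
        need = 0
        for reg in srcs:
            if reg in last_write:
                gap = len(out) - last_write[reg]
                need = max(need, min_gap - gap)
        out.extend([("nop", None, ())] * need)
        if dest is not None:
            last_write[dest] = len(out)
        out.append((op, dest, srcs))
    return out


# (opcode, destination register, source registers)
program = [
    ("add", "$s3", ("$s1", "$s0")),
    ("lw", "$s2", ("$s3",)),
    ("lw", "$s1", ("$s4",)),
    ("or", "$s2", ("$s3", "$s2")),
    ("sw", None, ("$s2", "$s3")),
]

padded_ops = [op for op, _, _ in insert_nops(program)]
print(padded_ops)
```

Under these assumptions the sketch places two NOPs after `add`, one before `or`, and two before `sw`, for five NOPs in total.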

### 4.(b)

It's impossible to reduce NOPs.

### 4.(c)

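The code executes correctly. With forwarding, a stall is needed only when the instruction right after a load uses the loaded value, and that does not happen here: `lw $s1, 0($s4)` sits between `lw $s2, 4($s3)` and the `or` that reads `$s2`.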

### 4.(d)

| Cycle            | 1  | 2  | 3  | 4   | 5   | 6   | 7   |
|------------------|----|----|----|-----|-----|-----|-----|
| add $s3, $s1, $s0 | IF | ID | EX | MEM | WB  |     |     |
| lw $s2, 4($s3)   |    | IF | ID | EX  | MEM | WB  |     |
| lw $s1, 0($s4)   |    |    | IF | ID  | EX  | MEM | WB  |
| or $s2, $s3, $s2 |    |    |    | IF  | ID  | EX  | MEM |
| sw $s2, 0($s3)   |    |    |    |     | IF  | ID  | EX  |

| Cycle | PCWrite | ALUin1 | ALUin2 |
|-------|---------|--------|--------|
| 1     | 1       | X      | X      |
| 2     | 1       | X      | X      |
| 3     | 1       | 0      | 0      |
| 4     | 1       | r      | 0      |
| 5     | 1       | 0      | 0      |

### 4.(e)

### 4.(f)