

# RISC Design

---

Virendra Singh

Professor

Computer Architecture and Dependable Systems Lab

Department of Electrical Engineering, and  
Dept. of Computer Science & Engineering

Indian Institute of Technology Bombay

<http://www.ee.iitb.ac.in/~viren/>

E-mail: [viren@{ee, cse}.iitb.ac.in](mailto:viren@{ee, cse}.iitb.ac.in)

*EE-739: Processor Design @ IITB*

---



Lecture 14 (12 February 2022)

CADSL

RISC

- Simple instructions
- uniform & fixed encoding
- Limited addressing modes.

$$S \xrightarrow{T} S'$$

One go - Controller  
 ↑  
Combinational logic.

in-efficiency

Time =  $T^2$  longest sequential operations

load.



RISC

h/w

↑  
 m/c control  
 signals



$$S \xrightarrow{P} S'$$

$$S \xrightarrow{} P \xrightarrow{} P' \xrightarrow{} P'' \xrightarrow{} S'$$

P

(Multicycle      Implement)

- }. Read & write - Memory
- }. Read & write - RF
- }. ALU operation

{ Improve efficiency  
{ Better re-use of resources  
↑  
Cost



# Overview of DLX ISA

---

- ❖ simple instructions, all 32 bits wide
- ❖ very structured, no unnecessary baggage
- ❖ only three instruction formats

|   |    |                |     |                     |       |  |
|---|----|----------------|-----|---------------------|-------|--|
| R | op | rs1            | rs2 | rd                  | funct |  |
| I | op | rs1            | rd  | 16 bit address/data |       |  |
| J | op | 26 bit address |     |                     |       |  |

- ❖ rely on compiler to achieve performance



# Example Instruction Set: MIPS Subset

---

## MIPS Instruction – Subset

### ❖ Arithmetic and Logical Instructions

- add, sub, or, and, slt

(R)

### ❖ Memory reference Instructions

- lw, sw

(I)

### ❖ Branch

- beq, j



# How Fast Can the Clock Be?

---

- If every instruction is executed in one clock cycle, then:
  - Clock period must be at least 8ns to perform the longest instruction, i.e., *lw*.
  - This is a single cycle machine.
  - It is slower because many instructions take less than 8ns but are still allowed that much time.
- Method of speeding up: Use multicycle datapath.



# A Single Cycle Example

RCA



# A Multicycle Implementation



Major tasks → {

- Read & Write of Memory } 2 ns
- Read & write of RF } 1 ns
- ALU operation } 2 ns

( do not want to cascade ) ↴

Parallel tasks major tasks are fine





Date path

Steering Logic  $\Rightarrow$  Point-to-point



- $\begin{cases} HKT & OT \\ OT & \underbrace{HKT.} \\ \Downarrow & \end{cases}$
- fetch instruction (PC) IP) ↪
- update the PC
- understand instruction (Decoder)
- Fetch the operand. ( $\begin{cases} RF & \text{Immediate} \end{cases}$ )
- Execute the instruction.
  - Effective add. of memm.
  - Access memory
- Update the system bus (RF)



# R-Type Instructions

---

- Also known as arithmetic-logical instructions
- **add, sub, slt**
- Example: add \$t0, \$s1, \$s2
  - Machine instruction word  
000000 10001 10010 01000 00000 100000  
opcode    \$s1    \$s2    \$t0                function
  - Read two registers
  - Write one register
  - Opcode and function code go to control unit that generates RegWrite and ALU operation code.



*S<sub>1</sub>*



instruction  
Register



*S<sub>3</sub>*



*S<sub>2</sub>*





Steering:

# Load and Store Instructions

$$R_d \leftarrow M(R_s + Imm16)$$

- I-type instructions

• **Iw \$t0, 1200 (\$t1)** # incr. in bytes

100011 01001 01000 0000 0100 1011 0000  
opcode \$t1 \$t0 1200

• **sw \$t0, 1200 (\$t1)** # incr. in bytes

101011 01001 01000 0000 0100 1011 0000  
opcode \$t1 \$t0 1200



$S_5$

$pc \rightarrow$  Mem-add., alu-A  
 Mem-data  $\rightarrow$  IR  
~~pc~~ + 4  $\rightarrow$  alu-B  
 alu-C  $\rightarrow$  pc

$S_7$

$T_1 \rightarrow$  alu-A  
 $IR_{15-0} \rightarrow SEL \rightarrow$  alu-B  
 alu-C  $\rightarrow T_3$

$S_6$

$IR_{25-21} \rightarrow RF\text{-}A_1$   
 $RF\text{-}A_1 \rightarrow T_1$

$S_8$

$T_3 \rightarrow$  Mem add.  
 Mem-data  $\rightarrow T_3$

$T_4$

$S_9$

$T_3 \rightarrow RF\text{-}D_3$   
 $IR_{20-16} \rightarrow RF\text{-}A_3$





# Load and Store Instructions

---

- I-type instructions

- $lw \$t0, 1200 (\$t1)$  # incr. in bytes

100011 01001 01000 0000 0100 1011 0000

opcode \$t1 \$t0 1200

- $sw \$t0, 1200 (\$t1)$  # incr. in bytes

101011 01001 01000 0000 0100 1011 0000

opcode \$t1 \$t0 1200



# Branch Instruction (I-Type)

- beq       $\$s1, \$s2,$  25      # if  $\$s1 = \$s2$ ,  
                  advance PC through  
                  25 instructions

000100 10001 10010 0000 0000 0001 1001  
opcode    \$s1    \$s2                          25

16-bits

**Note:** Can branch within  $\pm 2^{15}$  words from the current instruction address in PC.



PC+4 + Imm16 X4



4 states

$S_{10} - S_{11} - S_{12} - S_{13}$

$S_{10} - S_{11} \rightarrow S_{12} \rightarrow S_{13}$

20

CADSL



$S_{10}$

$pc \rightarrow$  Mem add, alu-A  
 $t_4 \rightarrow$  alu-B  
 $\text{Mem\_data} \rightarrow [R]$   
 $\text{alu\_c} \rightarrow pc$

$S_{12}'$

$T_1 \rightarrow$  alu-A  
 $T_2 \rightarrow$  alu-B  
 $\text{if } (Zen == 1)$   
 then:  $T_3 \rightarrow pc$

$S_{11}'$

$IR_{25-21} \rightarrow RF-A_1$   
 $IR_{20-16} \rightarrow RF-A_2$   
 $RF-D_1 \rightarrow T_1$   
 $RF-D_2 \rightarrow T_2$   
 $pc \rightarrow$  alu-A  
 $IR_{15-0} \rightarrow SEL \rightarrow 2S \rightarrow$  alu-B  
 $\text{alu\_c} \rightarrow T_3$

$S_{10} \rightarrow S_{11}' \rightarrow S_{12}'$





# J-Type Instruction

---

- j 2500 # jump to instruction 2,500



# Multicycle Datapath







OPERATION



State address =



# 3 to 5 Cycles for an Instruction

| Step                                                      | R-type<br>(4 cycles)                                                                                                 | Mem. Ref.<br>(4 or 5 cycles)                              | Branch type<br>(3 cycles)                    | J-type<br>(3 cycles)                                 |
|-----------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------|----------------------------------------------|------------------------------------------------------|
| Instruction fetch                                         | IR $\leftarrow$ Memory[PC]; PC $\leftarrow$ PC+4                                                                     |                                                           |                                              |                                                      |
| Instr. decode/<br>Reg. fetch                              | A $\leftarrow$ Reg(IR[21-25]); B $\leftarrow$ Reg(IR[16-20])<br>ALUOut $\leftarrow$ PC + (sign extend IR[0-15]) << 2 |                                                           |                                              |                                                      |
| Execution, addr.<br>Comp., branch &<br>jump<br>completion | ALUOut $\leftarrow$<br>A op B                                                                                        | ALUOut $\leftarrow$<br>A+sign extend<br>(IR[0-15])        | If (A = B)<br>then<br>PC $\leftarrow$ ALUOut | PC $\leftarrow$ PC[28-<br>31]<br>  <br>(IR[0-25]<<2) |
| Mem. Access or<br>R-type<br>completion                    | Reg(IR[11-<br>15]) $\leftarrow$<br>ALUOut                                                                            | MDR $\leftarrow$ M[ALUout]<br>or M[ALUOut] $\leftarrow$ B |                                              |                                                      |
| Memory read<br>completion                                 |                                                                                                                      | Reg(IR[16-20]) $\leftarrow$<br>MDR                        |                                              |                                                      |



# Thank You

