

MIPS

# Microprocessor with Interlocked Pipeline Stages

ISA - Instruction Set Architecture

(the interface between hardware and software)

→ Components:

- memory organization
- registers
- data types and data structures
- instruction formats
- instruction set
- addressing modes
- flow of control
- input / output
- interrupts



// for 32x32

↳ R0 - R31 general purpose registers

↳ R0 fixed value of 0

↳ R31 holds the return add of a procedure call

→ PC - program counter

↳ the address of the next instruction to be fetched

→ hi, lo - store partial result of multiplication and division

## Memory Organization

- ↳ a large, single-dimensional array
- ↳ memory address = an index into the array

// for 32x32

- ↳ word 32 bits (4 bytes)
- ↳ 32 bit addresses  $\Rightarrow 2^{32}$  bytes  $\Rightarrow 2^{30}$  words

$$\begin{matrix} 0, 1, 2, \dots \\ 0, 4, 8, \dots \end{matrix} \rightarrow \begin{matrix} 2^{31} & 1 \\ 2^{32}-4 & \end{matrix}$$

- ↳ endianness - little/big
- ↳ data alignment
  - ↳ for 16-bit word starts at multiple of 2 bytes

addresses

PC  $\rightarrow$  0040 0000 hex  
0000 0000 hex

- ↳ 3 instruction formats

- R-type : used for arithmetical/logical operations
- J-type : used for arithmetical/logical operations with immediate values, memory data transfers and conditional jumps or branches
- J-type : used for unconditional jumps

| R type : | 3      | 3  | 3  | 3  | 1  | 3        |
|----------|--------|----|----|----|----|----------|
|          | opcode | rs | rt | rd | sa | function |
|          | 3      | 3  | 3  | 3  | +  |          |

| J type : | 3      13 |    |    |             |
|----------|-----------|----|----|-------------|
|          | opcode    | rs | rt | address/imm |
|          | 3         |    |    | 13          |

| J type : | 32                    |
|----------|-----------------------|
|          | opcode target address |

for R type, opcode is always 0 and the function field identifies the full operation for each instruction

↳ also 3 bits

$\Rightarrow$  15 instructions : 8 R type  
7 J type  
1 J type

I chose to implement these 16 instructions :

R type :

ADD

add \$d, \$s, \$t  
000\_rs\_rt\_rd\_0\_000  
 $\$d \leftarrow \$s + \$t$

rs ALWAYS the source register  
rt - for R type : source register  
- for J type : destination register  
rd Always destination register ONLY  
for R type

SUB

sub \$d, \$s, \$t  
000\_rs\_rt\_rd\_0\_001  
 $\$d \leftarrow \$s - \$t$

SLL

sl \$d, \$t, a  
000 - rs - rt - rd - a - 010  
 $\$d \leftarrow \$t \ll a$

SRL

sr \$d, \$t, a  
000 - rs - rt - rd - a - 011  
 $\$d \leftarrow \$t \gg a$

AND

and \$d, \$s, \$t  
000 - rs - rt - rd - 0 - 100  
 $\$d \leftarrow \$s \& \$t$

OR

or \$d, \$s, \$t  
000 - rs - rt - rd - 0 - 101  
 $\$d \leftarrow \$s \mid \$t$

XOR

xor \$d, \$s, \$t  
000 - rs - rt - rd - 0 - 110  
 $\$d \leftarrow \$s \wedge \$t$

NOR

nor \$d, \$s, \$t  
000 - rs - rt - rd - 0 - 111  
 $\$d \leftarrow !(\$s \mid \$t)$

] type

ADD

addi \$t, \$s, imm  
001 - rs - rt - iiiiii  
 $\$t \leftarrow \$s + imm$

LW

lw \$t, offset(\$s)  
010 - rs - rt - tttttt  
 $\$t \leftarrow MEM[\$s + offset]$

ex: lw \$t, 8(\$t)

↳ puts im to what is in  
memory at (t+8)

SW

sw \$t, offset(\$s)  
011 - rs - rt - tttttt  
 $MEM[\$s + offset] \leftarrow \$t$

BEQ

beq \$s, \$t, offset  
100 - rs - rd - tttttt

If  $\$S == \$T$   
 advance\_programcounter( $\text{Offset} \ll 2$ ); // because MIPS  
 works with words  
 else  
 advance\_programcounter(4)

why  $\ll 2$  ???  
 because offset (which is a word)  
 (4 bytes), so the  
 words into bytes if we multiply  
 by 4  $\Rightarrow \ll 2$

ANDJ

andi  $\$T, \$S, imm$   
 101 - rs - rt - iiiiii  
 $\$T \leftarrow \$S \& imm$

ORJ

ori  $\$T, \$S, imm$   
 101 - rs - rt - iiiliii  
 $\$T \leftarrow \$S | imm$

J-type :

$j$   
 $j target$   
 111 - jjjjjjjjjjjj  
 13

$PC \leftarrow mPC; mPC = (PC \& 0x000) | (\text{target} \ll 2)$

why ???  
 target is stored like an offset, no word type  
 $\Rightarrow$  we need to shift it by 2 to get the bytes

$PC \& 0x000 \rightarrow$  the 4 bits of the superior PC  
 value ; do this to make sure you stay in the  
 same memory segment

when " $|(\text{target} \ll 2)|$ ", we jump in the same  
 memory segments but we change the last 12 bits

I chose to do the following algorithm:

→ Fibonacci numbers

|    |                                                                  |                        |
|----|------------------------------------------------------------------|------------------------|
| 0  | 001 - 000 - 001 - 000000 $\rightarrow$ addi \$1, \$0, 0          | store 0 in<br>\$1, \$3 |
| 1  | 001 - 000 - 010 - 000001 $\rightarrow$ addi \$2, \$0, 1          |                        |
| 2  | 001 - 000 - 011 - 000000 $\rightarrow$ addi \$3, \$0, 0          | store 1 in<br>\$2, \$4 |
| 3  | 001 - 000 - 100 - 0000001 $\rightarrow$ addi \$4, \$0, 1         |                        |
| 4  | 011 - 011 - 001 - 000000 $\rightarrow$ sw \$1, 0(\$3)            |                        |
| 5  | 011 - 100 - 010 - 000000 $\rightarrow$ sw \$2, 0(\$4)            |                        |
| 6  | 010 - 011 - 001 - 000000 $\rightarrow$ lw \$1, 0(\$3)            |                        |
| 7  | 010 - 100 - 010 - 000000 $\rightarrow$ lw \$2, 0(\$4)            |                        |
| 8  | 000 - 001 - 010 - 101 - 0 - 000 $\rightarrow$ addi \$5, \$1, \$2 | offset instruction     |
| 9  | 000 - 000 - 010 - 001 - 0 - 000 $\rightarrow$ addi \$1, \$0, \$2 |                        |
| 10 | 000 - 000 - 101 - 010 - 0 - 000 $\rightarrow$ addi \$2, \$0, \$5 |                        |
| 11 | 111 - 000 - 000 - 0001000 $\rightarrow$ j 8                      |                        |

## Components of the MIPS:

→ PC program counter

16-bit edge-triggered flip-flop

→ IM instruction memory (ROM)

↳ one input bus: instruction address

↳ one output bus: instruction data

↳ memory word is 16 bit (selected by instruction address)

↳ NO control signals

→ RF register file

↳ 2 read addresses and 1 write address

↳ 8 16-bit registers ( $r_0, r_1, \dots, r_7$  encoded on 3 bits)

↳ 2 16-bit data outputs (Read data 1, Read data 2)

↳ 1 16-bit data input (write data)

↳ multiple accesses (2 asynchronous reads and 1 synchronous edge triggered write); during the read operation, the register file behaves as a combinational logic block

↳ 1 control signal (RegWrite); when RegWrite is asserted the value on the WriteData line is written in the register indicated by the write address line

→ DM data memory (RAM)

↳ 1 16-bit input address bus: Address

↳ 1 16-bit input data bus: Write Data

↳ 1 16-bit output data bus: Read Data

↳ 1 control signal: MemWrite

→ EU extension unit

↳ ExtOp = 1  $\Rightarrow$  sign extension

= 0  $\rightarrow$  zero extension

→ ALU arithmetic logic unit

↳ performs arithmetical / logical operations

! identify all the operations that the ALU needs to perform after completing the definition of the instructions

! identify how many control bits are necessary to encode the ALU operations (ALU code)



Figure 1: MIPS 32 Single-Cycle Data-Path + Control

→ instruction execution cycle  
if > ID, EX, MEM, WB

+ we will need the SSP  
and the MRS

### INSTRUCTION FETCH

- ↳ program counter
- ↳ instruction memory (ROM)
- ↳ adder
- ↳ 2 MUXs for selecting the next instruction address

- ↳ inputs :
- ↳ clock signal (for PC)
- ↳ branch target address
- ↳ jump address
- ↳ jump control signal
- ↳ PCsrc control signal (for branch)

- ↳ outputs :
- ↳ the instruction to be executed by the MIPS
- ↳ the next sequential instruction address (PC+4)

- ↳ Jump
  - L = 1  $\Rightarrow$  PC  $\leftarrow$  jump address +4 if 32x32
  - L = 0  $\Rightarrow$  PCsrc

$$L = 0 \Rightarrow PC \leftarrow PC + 1$$

$$L = 1 \Rightarrow PC \leftarrow \text{branch address}$$

! the new PC value will be written in the PC register  
Only when a button from the board is pressed  
(an enable signal from the MRS given as input to the JF Unit)  
+ use another MRS signal to reset the PC register



Figure 5: Instruction Fetch Data-Path for MIPS 32

INSTRUCTION DECODE UNIT  $\rightarrow$  read registers, extract fields, extend imm

- $\hookrightarrow$  register file
- $\hookrightarrow$  multiplexer
- $\hookrightarrow$  sign/zero extender

$\hookrightarrow$  inputs

- $\hookrightarrow$  clock (for the Register File Writes)
- $\hookrightarrow$  the 16 bit instruction
- $\hookrightarrow$  the 16 bit Write Data for the Register File
- $\hookrightarrow$  control signals:
  - $\hookrightarrow$  RegWrite (write enable for the Reg file)
  - $\hookrightarrow$  Regst (selects the write address for the Regfile)
  - $\hookrightarrow$  ExtOp (selects between Sign and Zero extension of the immediate field)

$\hookrightarrow$  outputs

- $\hookrightarrow$  register from rs address : 16 bit Read Data 1
- $\hookrightarrow$  register from rt address : 16 bit Read Data 2
- $\hookrightarrow$  16 bit extended immediate
- $\hookrightarrow$  3 bit function field
- $\hookrightarrow$  1 bit shift amount for R type shift instructions

$\hookrightarrow$  RegDst

- $\hookrightarrow$   $L = 1 \Rightarrow$  the write address for the reg file is the rd field of the instruction
- $\hookrightarrow$   $L = 0 \Rightarrow$  the write address for the reg file is the rt field of the instruction

$\hookrightarrow$  RegWrite = 1  $\Rightarrow$  write the value provided by the write data signal into the write address register in reg file

↳ ExtOp

$L = 0 \Rightarrow$  perform zero extension of the 16-bit imm  
 $L = 1 \Rightarrow$  performs sign extension of the 16-bit imm



Figure 5: Instruction Decode Data-Path for MIPS 32

## MAIN CONTROL UNIT

↳ inputs

↳ the 3-bit opcode field of the instruction

↳ outputs

↳ the main data-path control signals (except for ALU/G)

connect the control signals generated by the Main Control Unit to the IF and ID

! for writing in the register file it is necessary to control the writing mechanism from one of the buttons available on the board ; MUST BE VALIDATED (with an AND gate) WITH ONE OF THE MPG OUTPUTS (use the same enable signal as for write validation in the PC register)

! FOR CONNECTING THE SSD :

use a MUX

sw(f:5) = 000  $\Rightarrow$  display the result on the SSD

sw(f:5) = 001  $\Rightarrow$  display the next  $\rightarrow$  C (PC+) on the SSD

sw(f:5) = 010  $\Rightarrow$  display RD1 on the SSD

sw(f:5) = 011  $\Rightarrow$  display RD2 on the SSD

$sw[7:5] = 100 \Rightarrow$  display the WD signal on the SSD  
 $sw[7:5] = 101 \Rightarrow$  display the ALURes signal on the SSD  
 $sw[7:5] = 110 \Rightarrow$  display the MemData signal on the SSD  
 $sw[7:5] = 111 \Rightarrow$  display the WD signal on the SSD

On the LEDs, we display the control signals from the Main Control Unit

LED 5 : -  $sw[0] = 0 \Rightarrow$  display the 1-bit control signals on the LED  
 -  $sw[0] = 1 \Rightarrow$  display the n-bit ALUOp signal on the LED  
 (the other LEDs have'd)



Figure 6: MIPS 32 Single Cycle Main Control Unit

### EXECUTE UNIT

- ↳ ALU
- ↳ ALU Control
- ↳ Multiplier

↳ Shift Left 2 & Adder for branch target address computation

#### ↳ inputs

- ↳ the next sequential TA (PC+1)
- ↳ RD1
- ↳ RD2
- ↳ 16 bit extended immediate
- ↳ 3 bit funct field
- ↳ 1 bit shift amount
- ↳ control signals

↳ ALUSrc → selects between RD2 and Ext-imm  
 as input to the second port of the ALU  
 ↳ ALUOp → provided by the main control

#### ↳ outputs

- ↳ 16 bit ALU result
- ↳ 1 bit zero - signal

↳ this indicates whether the result of the ALU is equal to 0 or not

↳ ALUSrc

- $L = 0 \Rightarrow RD_2$  is the second input for the ALU
- $L = 1 \Rightarrow Ext\_Imm$  is the second input for the ALU

Branch Address  $\leftarrow PC + 1 + S\_Ext(Imm) \ll 2$



Figure 5: Instruction Execute Data-Path for MIPS 32

MEMORY UNIT

↳ data memory

RAM asynchronous read and synchronous write

↳ inputs

- ↳ clock signal (for Data Memory Writes)
- ↳ 16 bit ALURes signal
- ↳ 16 bit RD2 signal
- ↳ MemWrite control signal

↳ outputs

- ↳ 16 bit MemData (used only for load instructions)
- ↳ 16 bit ALURes

The write back unit is just the next term fig 1 and the

other components are : the AND gate for generating the PCSrc, the jump address computation

↳ Memto Reg

$L = 0 \Rightarrow$  ALURes signal is the input to the Write Data port of the Register file  
 $L = 1 \Rightarrow$  the MemData is the input to the Write Data port of the Register file

$PCSrc \leftarrow \text{Branch and Jmp};$



Figure 6: Memory Unit Data-Path for MIPS 32

WhiteBox component selects ALU result or memory data for writing registers

# MIPS PIPELINE

I chose to do the following algorithm: → in the single cycle  
 → Fibonacci numbers

|    |                       |                     |                                                    |
|----|-----------------------|---------------------|----------------------------------------------------|
| 0  | 001_000_001_000000    | → addi \$1, \$0, 0  | ↑ store 0 in<br>\$1, \$3<br>store 1 in<br>\$2, \$4 |
| 1  | 001_000_010_000001    | → addi \$2, \$0, 1  |                                                    |
| 2  | 001_000_011_000000    | → addi \$3, \$0, 0  |                                                    |
| 3  | 001_000_100_000001    | → addi \$4, \$0, 1  |                                                    |
| 4  | 011_011_001_000000    | → sw \$1, 0(\$3)    |                                                    |
| 5  | 011_100_010_000000    | → sw \$2, 0(\$4)    |                                                    |
| 6  | 010_011_001_000000    | → lw \$1, 0(\$3)    |                                                    |
| 7  | 010_100_010_000000    | → lw \$2, 0(\$4)    |                                                    |
| 8  | 000_001_010_101_0_000 | → add \$5, \$1, \$2 |                                                    |
| 9  | 000_000_010_001_0_000 | → add \$1, \$0, \$2 |                                                    |
| 10 | 000_000_101_010_0_000 | → add \$2, \$0, \$5 |                                                    |
| 11 | 111_000_000_0001000   | → j 8               |                                                    |

↑ offset instruction

Now, let's move on to the MIPS pipeline

↳ phases:

} IF  
ID/IOF  
EX  
MEM  
WB

we have to partition the data-path along  
the critical path with rising edge triggered  
registers → flip flops  
↳ inserted between these phases

=> we can simultaneously  
execute at most 5 instructions

each of them executing  
one of the 5 execution stages

=> IF/ID ; ID/EX ; EX/MEM ; MEM/WB

! the MUX used for selecting the WT. for the Register File is placed  
in EX, not in ID as in the single-cycle CPU case

HAZARDS IN MIPS

↳ data hazards → read after write / load data hazards

we'll take the code, write the R-type inst. and then find the  
hazards

|    |                         |                    |
|----|-------------------------|--------------------|
| 0  | 001_000_001_0000000 →   | R[1] ← R[0]+0      |
| 1  | 001_000_010_0000001 →   | R[2] ← R[0]+1      |
| 2  | 001_000_011_0000000 →   | R[3] ← R[0]+0      |
| 3  | 001_000_100_0000001 →   | R[4] ← R[0]+1      |
| 4  | 011_011_001_0000000 →   | MEM[R[3]+0] ← R[1] |
| 5  | 011_100_010_0000000 →   | MEM[R[4]+0] ← R[2] |
| 6  | 010_011_001_0000000 →   | R[1] ← MEM[R[3]+0] |
| 7  | 010_100_010_0000000 →   | R[2] ← MEM[R[4]+0] |
| 8  | 000_001_010_101_0_000 → | R[5] ← R[1]+R[2]   |
| 9  | 000_000_010_001_0_000 → | R[1] ← R[0]+R[2]   |
| 10 | 000_000_101_010_0_000 → | R[2] ← R[0]+R[5]   |
| 11 | 111_000_000_0001000 →   | PC ← PC & 0xF000   |

! instruction 6: lw \$1, 0(\$3)

instruction 8: add \$5, \$1, \$2

↳ reads \$1 and \$2  
 !!! \$1 is written in instr 6 and used  
 in instruction 8

⇒ RAW

If no barriers:

| Cycle | IF              | ID              | EX              | MEM             | WB              |
|-------|-----------------|-----------------|-----------------|-----------------|-----------------|
| 1     | J <sub>0</sub>  |                 |                 |                 |                 |
| 2     | J <sub>1</sub>  | J <sub>0</sub>  |                 |                 |                 |
| 3     | J <sub>2</sub>  | J <sub>1</sub>  | J <sub>0</sub>  |                 |                 |
| 4     | J <sub>3</sub>  | J <sub>2</sub>  | J <sub>1</sub>  |                 |                 |
| 5     | J <sub>4</sub>  | J <sub>3</sub>  | J <sub>2</sub>  | J <sub>0</sub>  | J <sub>0</sub>  |
| 6     | J <sub>5</sub>  | J <sub>4</sub>  | J <sub>3</sub>  | J <sub>1</sub>  | J <sub>1</sub>  |
| 7     | J <sub>6</sub>  | J <sub>5</sub>  | J <sub>4</sub>  | J <sub>2</sub>  | J <sub>2</sub>  |
| 8     | J <sub>7</sub>  | J <sub>6</sub>  | J <sub>5</sub>  | J <sub>3</sub>  | J <sub>3</sub>  |
| 9     | J <sub>8</sub>  | J <sub>7</sub>  | J <sub>6</sub>  | J <sub>4</sub>  | J <sub>4</sub>  |
| 10    | J <sub>9</sub>  | J <sub>8</sub>  | J <sub>7</sub>  | J <sub>5</sub>  | J <sub>5</sub>  |
| 11    | J <sub>10</sub> | J <sub>9</sub>  | J <sub>8</sub>  | J <sub>6</sub>  | J <sub>6</sub>  |
| 12    | J <sub>11</sub> | J <sub>10</sub> | J <sub>9</sub>  | J <sub>7</sub>  | J <sub>7</sub>  |
| 13    |                 | J <sub>11</sub> | J <sub>10</sub> | J <sub>8</sub>  | J <sub>8</sub>  |
| 14    |                 |                 | J <sub>11</sub> | J <sub>9</sub>  | J <sub>9</sub>  |
| 15    |                 |                 |                 | J <sub>10</sub> | J <sub>10</sub> |
| 16    |                 |                 |                 | J <sub>11</sub> | J <sub>11</sub> |

writes \$5 in WB (cycle 13)  
 J10 uses it in EX (cycle 12)

⇒ we need to add NoOp after loading in register 1 and in z  
 and after adding

=>

0

001 - 000 - 001 - 0000000 →

R[1] ← R[0] + 0

1

001 - 000 - 010 - 0000001 →

R[2] ← R[0] + 1

2

001 - 000 - 011 - 0000000 →

R[3] ← R[0] + 0

3

001 - 000 - 100 - 0000001 →

R[4] ← R[0] + 1

4

011 - 011 - 001 - 0000000 →

MEM[R[3]+0] ← R[1]

5

011 - 100 - 010 - 0000000 →

MEM[R[4]+0] ← R[2]

6, 7, 8

010 - 011 - 001 - 0000000 →

R[1] ← MEM[R[3]+0] + Nop × 2

X 9, 10, 11, 12

010 - 100 - 010 - 0000000 →

R[2] ← MEM[R[4]+0] + Nop × 3

X 13, 14, 15

000 - 001 - 010 - 101 - 0 - 000 →

R[5] ← R[1] + R[2] + Nop × 2

X 16

000 - 000 - 010 - 001 - 0 - 000 →

R[1] ← R[0] + R[2]

X 17, 18

000 - 000 - 101 - 000 - 0 - 000 →

R[2] ← R[0] + R[5] + Nop

X 19

111 - 000 - 000 - 0001000 →

PC ← PC & 0xF000 (for jump)