

# ELEC 374: MiniSRC CPU Project

## Final Report

Group 4

Jacob Chisholm (20335775)

Hendrix Gryspeerdt (21hgg3)

Luke Strickland (21laps1)

April 6, 2025

*"We do hereby verify that this written lab report is our own work and contains our own original ideas, concepts, and designs. No portion of this report has been copied in whole or in part from another source, with the possible exception of properly referenced material".*

## Abstract

This report presents the design, implementation, and evaluation of the MiniSRC CPU, a 32-bit RISC-based processor architecture supporting integer arithmetic, logic operations, conditional branching, memory and port-mapped I/O, and a VGA interface. The design comprises a detailed instruction set architecture, data path components, and control logic optimized for efficient decoding and minimal propagation delay. Implementation was done on an FPGA (Cyclone II DE2 development board), achieving a maximum operating frequency of 50 MHz with approximately 7000 logic elements utilized. Although fully operational individually, pipelining was not fully integrated, providing room for future work on pipeline registers, hazard detection, and forwarding logic. Overall, the project delivered a working and scalable foundation suitable for further expansion into a fully pipelined processor.

# Table of Contents

|                                                                       |           |
|-----------------------------------------------------------------------|-----------|
| <b>Abstract</b>                                                       | <b>2</b>  |
| <b>Table of Contents</b>                                              | <b>3</b>  |
| <b>Project Specification</b>                                          | <b>5</b>  |
| Processor State                                                       | 5         |
| I/O State                                                             | 5         |
| Instruction Set Specification                                         | 6         |
| Instruction Formats                                                   | 6         |
| Load and Store instructions                                           | 7         |
| Arithmetic and Logical instructions                                   | 7         |
| Branch instructions: brzr, brnz, brmi, brpl B-Format                  | 8         |
| Jump instructions: jr, jal J-Format                                   | 8         |
| Input/Output and MFHI/MFLO instructions: in, out, mfhi, mflo J-Format | 8         |
| Miscellaneous instructions: nop, halt M-Format                        | 9         |
| Explicit Instruction Encodings                                        | 9         |
| Load and Store Instructions                                           | 9         |
| Arithmetic and Logical Instructions                                   | 9         |
| Conditional Branch Instructions                                       | 11        |
| Jump Instructions                                                     | 11        |
| Input/Output and MFHI/MFLO (Move From HI, Move From LO) Instructions  | 12        |
| Miscellaneous Instructions                                            | 12        |
| <b>Project Design and Implementation</b>                              | <b>12</b> |
| Control Unit                                                          | 12        |
| Instruction Decode                                                    | 13        |
| Data Path Control Signals                                             | 13        |
| Data Path                                                             | 14        |
| Program Counter                                                       | 15        |
| Register File                                                         | 16        |
| ALU                                                                   | 17        |
| Adder                                                                 | 17        |
| Divider                                                               | 18        |
| Multiplier                                                            | 18        |
| Bit Manipulation Units                                                | 18        |
| Interfaces                                                            | 19        |
| VGA Interface                                                         | 19        |
| Port Mapped Interfaces                                                | 20        |
| <b>Evaluation Results</b>                                             | <b>21</b> |
| <b>Discussion, Conclusions, and Future Work</b>                       | <b>21</b> |
| <b>Appendices</b>                                                     | <b>23</b> |
| Appendix 1: Verilog HDL Code                                          | 23        |

|                                                            |    |
|------------------------------------------------------------|----|
| ALU code                                                   | 23 |
| ALU module                                                 | 23 |
| AND module                                                 | 26 |
| CLA module                                                 | 27 |
| DIV32 module                                               | 28 |
| MUL32 module                                               | 29 |
| OR module                                                  | 33 |
| ROL and ROR module                                         | 33 |
| SHIFT module                                               | 34 |
| XOR module                                                 | 34 |
| Control Code                                               | 35 |
| Control Module                                             | 35 |
| Decode Module                                              | 40 |
| PC Module                                                  | 41 |
| Instruction and ALU encoding                               | 42 |
| Datapath module                                            | 43 |
| Processor module                                           | 46 |
| Register module                                            | 48 |
| Register file module                                       | 49 |
| Seven Segment Code                                         | 50 |
| Sev_segs module                                            | 50 |
| Seven_seg module                                           | 51 |
| VGA Code                                                   | 52 |
| VGA controller code                                        | 52 |
| Memory code                                                | 54 |
| Memory module                                              | 54 |
| MMU module                                                 | 55 |
| Appendix 2: Processor Schematic                            | 56 |
| Appendix 3: Functional Simulation results for Phase 4      | 56 |
| Appendix 4: Printout of the contents of memory for Phase 4 | 59 |
| Appendix 5: Image of the FPGA board during operation       | 73 |

# Project Specification

## Processor State

PC<31..0>: 32-bit Program Counter (PC)

IR<31..0>: 32-bit Instruction Register (IR)

R[0..15]<31..0>: 16 32-bit registers, named R[0] through R[15]

R[0]<31..0>: 1 Constant zero register

R[1..7]<31..0>: 7 General-Purpose Registers

R[8]<31..0>: Return Address Register (RA)

R[9]<31..0>: Stack Pointer (SP)

R[10..13]<31..0>: Four Argument Registers

R[14..15]<31..0>: Two Return Value Registers

RASH<31..0>: (Register ALU Storage Hi) 32-bit Register dedicated to keep the high-order word of a Multiplication product, or the Remainder of a Division operation

RASL<31..0>: (Register ALU Storage Low) 32-bit Register dedicated to keep the low-order word of a Multiplication product, or the Quotient of a Division operation

## I/O State

Mem.In<31..0>: 32-bit memory data input port

MemData.Out<31..0>: 32-bit memory data output port

MemAddress.Out<31..0>: 32-bit memory address output port

MemRead.Out: indicate that the processor is requesting data from MemAddress.Out.

MemWrite.Out: indicate that the processor is requesting data to be written at location MemAddress.Out.

MemReady.In: Indicate whether assertion of MemRead.out or MemWrite.out signals have completed their function in the memory unit.

In.Port<31..0>: 32-bit Input Port

Out.Port<31..0>: 32-bit Output Port

Run.Out: Run/halt Indicator

Reset.In: Reset signal

Our processor assumes a byte-addressable address space, hence the memory address output of MemAddress.Out and the logical address space of the processor is  $2^{32}$  8-bit bytes or  $2^{32}$  32-bit words. This has implications for the encoding of the branch instruction and program counter where the branch offset is stored as an offset in words since each instruction is a 32-bit word in memory and the program counter is incremented by multiples of 4.

Additionally, for simplicity, our processor does not output a Stop or Halt signal when in the halted state. The halted state can be tested for by observing that MemAddress.Out remains unchanged between the number of cycles it takes to complete an instruction.

## Instruction Set Specification

The Arithmetic Logic Unit (ALU) performs 13 operations: addition, subtraction, multiplication, division, shift right, shift right arithmetic, shift left, rotate right, rotate left, logical AND, logical OR, Negate (2's complement), and NOT (1's complement). The instructions in Mini SRC are one-word (32-bit) long each. They can be categorized as Load and Store instructions, Arithmetic and Logical instructions, Conditional Branch and Jump instructions, Input/Output instructions, and miscellaneous instructions. The following six addressing modes are supported: Direct, Indexed, Register, Register Indirect, Immediate, and Relative.

## Instruction Formats

There are five instruction formats, as shown in the table below.

| Name       | Fields           |                  |                  |                     |                  | Comments                              |
|------------|------------------|------------------|------------------|---------------------|------------------|---------------------------------------|
| Field size | 31..27<br>5 bits | 26..23<br>4 bits | 22..19<br>4 bits | 18..15<br>4 bits    | 14..0<br>15 bits | All instructions are 32-bit long      |
| R-Format   | OP-code          | Ra               | Rb               | Rc                  | Unused           | 3 Register ALU instruction            |
| I-Format   | OP-code          | Ra               | Rb               | Constant C / Unused |                  | ALU Immediate instruction; Load/Store |
| B-Format   | OP-code          | Ra               | C2               | Constant C          |                  | Branch                                |

|          |         |        |        |       |
|----------|---------|--------|--------|-------|
| J-Format | OP-code | Ra     | Unused | Jump  |
| M-Format | OP-code | Unused |        | Misc. |

Op-code: specifies the type of operation to be performed.

Ra, Rb, Rc: specify addresses of registers in the register file, 0000=>R0, 0001=>R1, ..., 1111=>R15.

C: 19-bit constant value.

C2: 4-bit value representing the branch condition type where only the lower 2 bits are considered.

### Load and Store instructions

operands in memory can be accessed only through load/store instructions.

(a) ld, ldi, st I-Format

|         |       |       |       |   |
|---------|-------|-------|-------|---|
| 31      | 27 26 | 23 22 | 19 18 | 0 |
| Op-code | Ra    | Rb    | C     |   |

### Arithmetic and Logical instructions

(a) add, sub, and, or, shr, shra, shl, ror, rol R-Format

|         |       |       |       |              |   |
|---------|-------|-------|-------|--------------|---|
| 31      | 27 26 | 23 22 | 19 18 | 15 14        | 0 |
| Op-code | Ra    | Rb    | Rc    | -- unused -- |   |

(b) addi, andi, ori I-Format

|         |       |       |       |   |
|---------|-------|-------|-------|---|
| 31      | 27 26 | 23 22 | 19 18 | 0 |
| Op-code | Ra    | Rb    | C     |   |

(c) mul, div, neg, not I-Format

|         |    |    |             |
|---------|----|----|-------------|
| Op-code | Ra | Rb | -- unused-- |
|---------|----|----|-------------|

Branch instructions: brzr, brnz, brmi, brpl B-Format

|         |    |    |  |   |
|---------|----|----|--|---|
| Op-code | Ra | C2 |  | C |
|---------|----|----|--|---|

## Jump instructions: jr, jal J-Format

|         |    |              |
|---------|----|--------------|
| Op-code | Ra | -- unused -- |
|---------|----|--------------|

## Input/Output and MFHI/MFLO instructions: in, out, mfhi, mflo J-Format

|         |    |              |
|---------|----|--------------|
| Op-code | Ra | -- unused -- |
|---------|----|--------------|

### Miscellaneous instructions: nop, halt M-Format

|         |              |
|---------|--------------|
| Op-code | -- unused -- |
|---------|--------------|

## Explicit Instruction Encodings

## Load and Store Instructions

### Id: Load Direct

(00000xxxxx0000xxxxxxxxxxxxxxxxxxxxxx)

$R[Ra] \leftarrow M[C \text{ (sign-extended)}]$

Direct addressing:  $Rb = R0$

Id Ba C

|                                                                            |                                                                                                                                                                                                                                                  |     |           |
|----------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----|-----------|
| <b>Id: Load Indexed/Register Indirect</b><br>(00000xxxxxxxxxxxxxxxxxxxxxx) | $R[Ra] \leftarrow M[R[Rb]] + C \text{ (sign-extended)}$<br>Indexed addressing, $Rb \neq R0$<br>If $C = 0 \rightarrow$ Register Indirect addressing                                                                                               | Id  | Ra, C(Rb) |
| <b>ldi: Load Immediate</b><br>(00001xxxx0000xxxxxxxxxxxxxx)                | $R[Ra] \leftarrow C \text{ (sign-extended)}$<br>Immediate addressing, $Rb = R0$                                                                                                                                                                  | ldi | Ra, C     |
| (00001xxxxxxxxxxxxxxxxxxxxxx)                                              | $R[Ra] \leftarrow R[Rb] + C \text{ (sign-extended)}$<br>Immediate addressing, $Rb \neq R0$<br>If $C = 0 \rightarrow$ instruction acts like a simple register transfer<br>If $C \neq 0$ and $Ra = Rb \rightarrow$ Increment/decrement instruction | ldi | Ra, C(Rb) |

To note, the Load Immediate instruction above in the format “ldi Ra, C” is identical in functionality to the Add Immediate instruction (detailed in the next section) with the first argument register being R0 (addi Ra, R0, C).

|                                                                             |                                                                                                                                                    |    |           |
|-----------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------|----|-----------|
| <b>st: Store Direct</b><br>(00010xxxx0000xxxxxxxxxxxxxx)                    | $M[C \text{ (sign-extended)}] \leftarrow R[Ra]$<br>Direct addressing, $Rb = R0$                                                                    | st | C, Ra     |
| <b>st: Store Indexed/Register Indirect</b><br>(00010xxxxxxxxxxxxxxxxxxxxxx) | $M[R[Rb]] + C \text{ (sign-extended)} \leftarrow R[Ra]$<br>Indexed addressing, $Rb \neq R0$<br>If $C = 0 \rightarrow$ Register Indirect addressing | st | C(Rb), Ra |

## Arithmetic and Logical Instructions

Three-register instructions:

|                                          |                                       |     |            |
|------------------------------------------|---------------------------------------|-----|------------|
| <b>add: Add</b><br>(00011xxxxxxxxx-----) | $R[Ra] \leftarrow R[Rb] + R[Rc]$      | add | Ra, Rb, Rc |
| <b>sub: Sub</b><br>(00100xxxxxxxxx-----) | $R[Ra] \leftarrow R[Rb] - R[Rc]$      | sub | Ra, Rb, Rc |
| <b>and: AND</b><br>(00101xxxxxxxxx-----) | $R[Ra] \leftarrow R[Rb] \wedge R[Rc]$ | and | Ra, Rb, Rc |
| <b>or: OR</b><br>(00110xxxxxxxxx-----)   | $R[Ra] \leftarrow R[Rb] \vee R[Rc]$   | or  | Ra, Rb, Rc |

|                                                                  |                                                           |      |            |
|------------------------------------------------------------------|-----------------------------------------------------------|------|------------|
| <b>ror: Rotate Right</b><br>(00111xxxxxxxxxxxxx-----)            | Rotate right R[Rb] into R[Ra] by count in R[Rc]           | ror  | Ra, Rb, Rc |
| <b>rol: Rotate Left</b><br>(01000xxxxxxxxxxxxx-----)             | Rotate left R[Rb] into R[Ra] by count in R[Rc]            | rol  | Ra, Rb, Rc |
| <b>shr: Shift Right</b><br>(01001xxxxxxxxxxxxx-----)             | Shift right R[Rb] into R[Ra] by count in R[Rc]            | shr  | Ra, Rb, Rc |
| <b>shra: Shift Right Arithmetic</b><br>(01010xxxxxxxxxxxxx-----) | Shift right arithmetic R[Rb] into R[Ra] by count in R[Rc] | shra | Ra, Rb, Rc |
| <b>shl: Shift Left</b><br>(01011xxxxxxxxxxxxx-----)              | Shift left R[Rb] into R[Ra] by count in R[Rc]             | shl  | Ra, Rb, Rc |

Two-register with immediate value instructions:

|                                                                  |                                                                                                                                                                                                                                                                              |      |           |
|------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|-----------|
| <b>addi: Add Immediate</b><br>(01100xxxxxxxxxxxxxxxxxxxxxx-----) | R[Ra] $\leftarrow$ R[Rb] + C (sign-extended)<br>Immediate addressing<br>If C = 0 $\rightarrow$ instruction acts like a simple register transfer<br>If C $\neq$ 0 and Ra = Rb $\rightarrow$ Increment/decrement instruction<br>Similar to ldi, however Rb can be any register | addi | Ra, Rb, C |
| <b>andi: AND Immediate</b><br>(01101xxxxxxxxxxxxxxxxxxxxxx-----) | R[Ra] $\leftarrow$ R[Rb] $\wedge$ C (sign-extended)<br>Immediate addressing                                                                                                                                                                                                  | andi | Ra, Rb, C |
| <b>ori: OR Immediate</b><br>(01110xxxxxxxxxxxxxxxxxxxxxx-----)   | R[Ra] $\leftarrow$ R[Rb] $\vee$ C (sign-extended)<br>Immediate addressing                                                                                                                                                                                                    | ori  | Ra, Rb, C |

Two-register without immediate value instructions:

|                                              |                                          |     |        |
|----------------------------------------------|------------------------------------------|-----|--------|
| <b>div: Divide</b><br>(01111xxxxxxxx-----)   | HI, LO $\leftarrow$ R[Ra] $\div$ R[Rb]   | div | Ra, Rb |
| <b>mul: Multiply</b><br>(10000xxxxxxxx-----) | HI, LO $\leftarrow$ R[Ra] $\times$ R[Rb] | mul | Ra, Rb |
| <b>neg: Negate</b><br>(10001xxxxxxxx-----)   | R[Ra] $\leftarrow$ - R[Rb]               | neg | Ra, Rb |
| <b>not: NOT</b><br>(10010xxxxxxxx-----)      | R[Ra] $\leftarrow$ $\overline{R[Rb]}$    | not | Ra, Rb |

## Conditional Branch Instructions

brzr, brnz, brmi, brpl

Branch  
(10011xxxx--xxxxxxxxxxxxxxxxxxxxxx)

Condition: --00: branch if zero  
--01: branch if nonzero  
--10: branch if positive  
--11: branch if negative

brzr Ra, C  
brnz Ra, C  
brpl Ra, C  
brmi Ra, C

## Jump Instructions

jal: jump and link  
(10100xxxx-----)

R[15]  $\leftarrow$  PC + 1  
PC  $\leftarrow$  R[Ra]

jal Ra

jr: return from procedure  
(10101xxxx-----)

PC  $\leftarrow$  R[Ra]  
If Ra = R15, it is for procedure return

jr Ra

## Input/Output and MFHI/MFLO (Move From HI, Move From LO) Instructions

|                                        |                  |         |
|----------------------------------------|------------------|---------|
| in: Input<br>(10110xxxx-----)          | R[Ra] ← In.Port  | in Ra   |
| out: Output<br>(10111xxxx-----)        | Out.Port ← R[Ra] | out Ra  |
| mflo: Move from LO<br>(11000xxxx-----) | R[Ra] ← LO       | mflo Ra |
| mfhi: Move from HI<br>(11001xxxx-----) | R[Ra] ← HI       | mfhi Ra |

## Miscellaneous Instructions

|                                   |                                   |      |
|-----------------------------------|-----------------------------------|------|
| nop: No-operation<br>(11010-----) | Do nothing                        | nop  |
| halt: Halt<br>(11011-----)        | Halt the control stepping process | halt |

## Project Design and Implementation

The designed processor features integer arithmetic instructions including addition, subtraction, multiplication, and division.

The complete design can be broken up into the processor, consisting of the Control Unit and Data Path, as well as the interfaces.

### Control Unit

The control unit handles instruction decoding and issues control signals to the data path.

Every cycle of every instruction starts with the control unit, therefore, it must be as optimised as possible to reduce delay within the processor.

To minimise cycle times, the control unit is broken up into two stages: Decode and issue.

An illustration of the control unit is shown below.



## Instruction Decode

The processor starts in cycle 1, fetching the next instruction. After fetching the instruction, it is loaded into the instruction register (IR). Then, a series of comparisons is made against the fields in the IR to create one hot signals corresponding to each instruction. The add instruction decode has been shown below for reference.

```
assign OP_ADD = (ID_OpCode == `ISA_ADD);
```

Verilog

For code readability, all instructions were defined inside the `ISA.vh` file. In the above example, `ISA\_ADD` is defined as `5'b00011`. To support a wide variety of instructions with fan-in constraints, instructions are further reduced into their types. This reduction is shown below for R-Type instructions.

```
// Opcode Format Wire (Useful for data path MUX Assignments)
assign OPF_R = (OP_ADD || OP_SUB || OP_AND || OP_OR || OP_ROR || OP_ROL ||
OP_SRL || OP_SRA || OP_SLL);
```

Through reducing all instructions to one hot signals, control signal assignment can be done through a series of gates with minimal fan-in.

## Data Path Control Signals

In addition to decoding instructions, the control unit must also issue control signals to the data path. Data path control signals consist of multiplexer select signals in addition to load enable signals for pipeline registers. Each signal must be asserted depending on the current cycle of

the processor and the type of instruction in the IR. The current cycle of the processor is tracked through a one-hot wraparound counter. As shown below, the counter will only increment if the processor is not halted and an external ready signal has been set to high.

```
// Assign Cycle
always @(posedge iClk or negedge nRst)
begin
    if(!nRst)
        Cycle = 5'b00001;
    else begin
        if(iRdy & ~OP_HLT) Cycle = {Cycle[4:1], Cycle[5]};
    end
end
```

In simulation, this signal was set to high as there are no memory delays. However, when the processor is connected to an external memory chip, the memory management unit (MMU) can assert this signal low while it waits for the memory data to become available.

With the exception of the ALU (Arithmetic and Logic Unit) control signals, all control signals are a combination of one hot signals. This allows for a maximum delay of 8 gates between the instruction register and any control signal output. The control signal assignment for the program counter load enable is shown below.

```
// PC Load Enable
assign oPC_en = Cycle[1] || (Cycle[3] && (BR_TRUE || OP_JAL || OP_JFR));
```

The full control unit code can be found in the appendix.

## Data Path

The data path contains the bulk of the processor. Housed within the data path is the Program Counter (PC), Register File (RF), and ALU. To illustrate the connections between these components, a diagram is shown below.



To shorten critical paths and to support future pipelining, registers are placed in between the functional units. Two input multiplexers are used to select the data supplied to and taken from these registers. Two input multiplexers were used to both reduce gate delay and to simplify the control signals. As seen in the following diagram, every multiplexer used in the data path can select between the previous selection or an external data source. This allows the use of one hot signals to select the data that is written back to the register files, simplifying the control unit.

## Program Counter

In addition to housing the Program Counter (PC) register, the PC module also contains the PC adder and multiplexers used for branching. The PC adder is shown below. The adder is hardwired to always accept the current PC value as one of its inputs. The other input can be selected as either the default increment value or a branch offset.

```
// PC Adder Selection
assign add_in = iOffsetEn ? iOffset : `PC_INCREMENT;

CLA pc_adder(
    .iX(add_in),
    .iY(pc_out),
    .iCarry(1'b0),
    .oS(add_out),
    // Intentionally leave these ports unconnected
    .oCarry(),
    .oOverflow(),
    .oZero(),
    .oNegative()
);
```

To support function calls and jumping, the output of the adder is sent into another multiplexer before being loaded into the PC. The multiplexer, shown below, can be used to load the PC with either the output from the adder or an address from a register.

```

// PC Input Selection
assign pc_in = iLoadEn ? iLoad : add_out;

REG32 #(.RESET(StartAddr)) pc(
    .iClk(iClk),
    .nRst(nRst),
    .iEn(iEn),
    .iD(pc_in),
    .oQ(pc_out)
);

```

## Register File

As per the ISA, the register file has 16 registers, including R0, the zero register. The register file is made up of 15 register units. Each unit has its own data output and write signal.

```

REG32 r1( .iClk(iClk), .nRst(nRst), .iEn(r1_write), .iD(iRegC), .oQ(r1_out) );

```

Two multiplexers are used to select the two source registers. Based on the RA and RB address inputs, the register data is provided onto the RA and RB outputs. The output assignment for RA is shown below.

```

// Output Register A assignment
assign oRegA = (iAddrA == 4'b0001) ? r1_out :
               (iAddrA == 4'b0010) ? r2_out :
               (iAddrA == 4'b0011) ? r3_out :
               (iAddrA == 4'b0100) ? r4_out :
               (iAddrA == 4'b0101) ? r5_out :
               (iAddrA == 4'b0110) ? r6_out :
               (iAddrA == 4'b0111) ? r7_out :
               (iAddrA == 4'b1000) ? r8_out :
               (iAddrA == 4'b1001) ? r9_out :
               (iAddrA == 4'b1010) ? r10_out :
               (iAddrA == 4'b1011) ? r11_out :
               (iAddrA == 4'b1100) ? r12_out :
               (iAddrA == 4'b1101) ? r13_out :
               (iAddrA == 4'b1110) ? r14_out :
               (iAddrA == 4'b1111) ? r15_out :
               32'd0;

```

As shown above, the multiplexer returns zero if no register is selected. This effectively implements the zero register, R0.

The write-back data is supplied to all registers, however, only the destination registers load enable (LE) signal (iWrite) is asserted in cycle 5.

```
// Write Signal Assert
assign r1_write = (iAddrC == 4'b0001) && iWrite;
assign r2_write = (iAddrC == 4'b0010) && iWrite;
assign r3_write = (iAddrC == 4'b0011) && iWrite;
assign r4_write = (iAddrC == 4'b0100) && iWrite;
assign r5_write = (iAddrC == 4'b0101) && iWrite;
...
```

Note that all of the load enable signals depend on the iWrite input. This is to allow the register file to not write to any register, a useful feature for instructions like store that do not return any result.

## ALU

The ALU, or Arithmetic and Logic Unit, houses all of the integer arithmetic units including an adder, divider, multiplier and several bit manipulation units. The ALU produces a 64-bit output dependent on which ALU operation signals are asserted by the control unit. A large multiplexer is used to select this output.

### Adder

The adder consists of four 8-bit Carry Look ahead Adders (CLAs). Each CLA is connected through a ripple carry mechanism. The Adder can also be used for the subtraction and negate instructions.

Subtraction is performed through a twos complement of the second input and by setting the carry bit before performing the addition. The negation instruction was implemented through a subtraction of 0-RA. The multiplexers used to support this behaviour are shown below:

```
assign cla_iA = (iCtrl == `CTRL_ALU_NEG) ? 32'd0 : iA;

// XOR input B for subtraction and set carry to 1
assign cla_iB = (iCtrl == `CTRL_ALU_SUB) ? 32'hFFFFFF ^ iB :
| | | | (iCtrl == `CTRL_ALU_NEG) ? 32'hFFFFFF ^ iA : iB;

assign cla_iCarry = (iCtrl == `CTRL_ALU_SUB || iCtrl == `CTRL_ALU_NEG);
```

## Divider

The divider performs the non-restoring division algorithm using a 32-bit array divider. The divider was constructed using a generate for the first 31 stages. The final stage of the divider was implemented separately to perform the final restore needed in the non-restoring division algorithm. See full divider code in the appendix.

## Multiplier

The Multiplier is capable of performing multiplication of 2, 32-bit signed 2's-complement numbers. The algorithm employs a 2-bit booth encoding to reduce the number of summands from 32 to 16. The summands are then left shifted accordingly and sign extended to align to a width of 64 bits. The 16 summands are then added together using 3 layers of 4-to-2 carry-save adders (<https://www.geoffknagge.com/fyp/carrysave.shtml>). The final 2 summands output from the final 4-to-2 carry-save adder are added using a carry-propagate adder to produce the final product.

The multiplier could be optimised to use fewer gates (but not less propagation delay) by narrowing the width of the 4-to-2 carry-save adders by only sign extending enough to meet the widths of the 4 inputs to each individual 4-to-2 reducer and directing the less significant bits of the partial sums directly to the product. It could be even further optimised at a system-level in combination with the ALU by utilising the same adder to add the final 2 summands.

The multiplier was implemented with MUL32 as the top level module, which was composed of an instance of BoothEncode\_2bit\_Nbit module and multiple Reducer4to2\_Nbit modules through the use of generate blocks. The BoothEncode\_2bit\_Nbit module is simply a composition of  $(N+1)/2$  (floor division) BoothEncode\_2bit modules applied to a bit vector of length N, and similarly for the Reducer4to2\_Nbit module.

Full multiplier code is shown in the appendix.

## Bit Manipulation Units

In addition to the above units, the ALU also contains several bit manipulation units to perform bit-wise operations such as and, or, xor, shifting, and rotate. The and, or, and xor units were created using Verilog primitives. Shifting and rotation were implemented using a series of multiplexers. Each multiplexer can select between the result of the last multiplexer or the result but shifted. This implementation, shown below, allows for simple and fast bit manipulation with minimal gate usage.

```

module SHIFT_RIGHT_ARITH(
    iD, iShamt,
    oD
);

    input wire [31:0] iD;
    input wire [4:0] iShamt;
    output wire [31:0] oD;

    wire [31:0] S1, S2, S3, S4, S5;

    assign S1 = iShamt[0] ? {{1{iD[31]}}, iD[31:1]} : iD;
    assign S2 = iShamt[1] ? {{2{S1[31]}}, S1[31:2]} : S1;
    assign S3 = iShamt[2] ? {{4{S2[31]}}, S2[31:4]} : S2;
    assign S4 = iShamt[3] ? {{8{S3[31]}}, S3[31:8]} : S3;
    assign S5 = iShamt[4] ? {{16{S4[31]}}, S4[31:16]} : S4;

    assign oD = S5;

endmodule

```

The above code snippet showcases the Shift Right Arithmetic operation. All other shifting modules follow the same format but with different bit manipulation logic in the concatenation portion.

## Interfaces

Interfaces are external inputs and outputs that are readable and/or controllable by the processor. The designed processor contains a memory-mapped VGA interface and two port-mapped interfaces. The two port-mapped interfaces are the eight seven-segment displays and the seventeen switches available on the DE2 development board.

### VGA Interface

To reduce the complexity of the VGA interface, it is set to a fixed resolution and only controlled by four registers. Each register can be set by the processor to set the colour of its quadrant on the display. The VGA interface then displays the four colours at a 640 by 480 resolution. The four registers used by VGA can be written to by setting the 30th bit of the memory address. The lowest two bits of the address are then used to select the VGA register to write to. Memory decoding and interface selection are done through the following code:

```

// Memory Select Wires
wire mems_dimem, mems_vga;

assign mems_dimem = proc_mem_addr[31:30] == 2'b00;
assign mems_vga = proc_mem_addr[31:30] == 2'b01;

memory mem(
    .iClk(clk),
    .iRead(proc_mem_read & mems_dimem),
    .iWrite(proc_mem_write & mems_dimem),
    .iData(proc_mem_out),
    .iAddr(proc_mem_addr),
    .oData(proc_mem_in)
);

vga_interface vga(
    .iClk_50(iClk_50),
    .nRst(nRst),
    .iCR(proc_mem_out),
    .iAddr(proc_mem_addr),
    .iWrite(proc_mem_write & mems_vga),
    .oVGA_R(oVGA_R),
    .oVGA_G(oVGA_G),
    .oVGA_B(oVGA_B),
    .oVGA_Clk(oVGA_Clk),
    .oVGA_Blink(oVGA_Blink),
    .oVGA_HSync(oVGA_HSync),
    .oVGA_VSync(oVGA_VSync),
    .oVGA_Sync(oVGA_Sync)
);

```

## Port Mapped Interfaces

Using the “in” and “out” instructions from the MiniSRC ISA, the switch states can be loaded, and the seven-segment displays can be written to. The “in” instruction loads the value from the switches. As shown below, the lower 18 bits contain 1's or 0's depending on the state of each switch. The top 14 bits are always zero.

```

assign proc_port_in = {14'd0, iSW};
assign proc_port_in = {14'd0, iSW};

```

The out instruction writes to the 8 seven-segment displays. Each display can show hex 0 through F. With 8 displays, the full 32-bit number written to the out port can be displayed in hex.

As shown below, each segment displays four bits of information.

```
SevenSeg s1(  
    .iNum(iNum[3:0]),  
    .oSeg(oSeg1)  
) ;
```

Then, the four-bit number is mapped onto the display with a case statement.

## Evaluation Results

After loading the design onto the DE2 FPGA development board, it was tested to find the maximum clock frequency and logic element usage. Maximum clock frequency was determined by running an integer computation program with multiplication and division. If the program returned the correct computational result, then the frequency was increased. The design was tested at 100Hz, 25kHz, 25MHz, and then at 50MHz. With all clock frequencies, the program returned the correct result. Therefore, the maximum operating frequency of the CPU is greater than or equal to 50MHz. Frequencies greater than 50MHz were not tested as 50MHz is the maximum available clock frequency on the DE2 development board. To accomplish a CPI (Clocks Per Instruction) of 5 at 50MHz, the design uses approximately 7000 logic elements, or 20% of the Cyclone II's number of total logic elements.

## Discussion, Conclusions, and Future Work

The initial goal of this project was to design and implement a fully functional five-stage pipelined processor consisting of Instruction Fetch (IF), Instruction Decode (ID), Execute (EX), Memory (MEM), and Write Back (WB) stages. While each of these stages has been successfully implemented independently, the pipeline integration remains incomplete. Currently, the project provides a robust foundation upon which full pipelining can be developed and finalized. Preliminary pipeline stage registers have been established to facilitate future pipelining; however, one critical component still missing is the implementation of pipeline registers for control signals.

To achieve fully pipelined operation, several steps are recommended for future development. First, it will be necessary to incorporate hazard detection, either through compilation techniques or hardware-based detection. Hazard detection logic will introduce stalls or insert NOP (no operation) instructions whenever dependencies are encountered that cannot be resolved through forwarding.

Secondly, the addition of multiple forwarding units will be essential, specifically at the EX and MEM stages, and potentially at the ID stage. Forwarding from the WB stage can be managed

directly through the register file. Forwarding at the ID stage may become important depending on whether branch decisions remain in the EX stage or are relocated earlier to the ID stage. Moving branch decisions to the ID stage would reduce wasted clock cycles caused by pipeline flushes due to incorrect branch predictions.

In the initial implementation of branch instructions, no subsequent instructions will be issued until branch targets have been determined, thereby eliminating the need for pipeline flushing. Once the pipeline is verified to be fully operational and stable, advanced features such as branch prediction and associated flush logic should be introduced to enhance efficiency and reduce pipeline stalls further.

Although the pipelined processor was not fully completed, this project can still be considered a success. The processor design and individual pipeline stages have been fully developed and validated, creating a robust and reliable base upon which future enhancements can be implemented. The groundwork laid in this project greatly simplifies further efforts towards achieving a fully operational 5-stage pipeline. By clearly identifying areas needing further attention, such as hazard detection, forwarding units, and control signal pipeline registers, future development is well-guided. Ultimately, the current state of the processor represents significant progress and provides an excellent starting point for continued improvement.

## Appendices

### Appendix 1: Verilog HDL Code

To browse and download the code for the project including test benches, see the github repository: <https://github.com/Jchisholm204/MiniSRC>. All the code for the processor is included in the sections below.

# ALU code

## ALU module

```
1  `include "../Control/ALU.vh"
2  module ALU(
3      // 32 bit Inputs
4      iA, iB,
5      // Control Signal
6      iCtrl,
7      // 64 bit output
8      oC_hi, oC_lo,
9      // Zero Output / Negative Output
10     oZero, oNeg
11 );
12
13    input wire [31:0] iA, iB;
14    input wire [3:0] iCtrl;
15    output wire [31:0] oC_hi, oC_lo;
16    output wire oZero, oNeg;
17
18    // Module output
19    wire [31:0] out_hi, out_lo;
20
21    // Adder/Subtract/Not IO
22    wire [31:0] cla_out, cla_iB, cla_iA;
23    wire cla_iCarry, cla_oCarry, cla_overflow, cla_zero, cla_neg;
24
25    // OR/XOR/AND IO
26    wire [31:0] or_out, xor_out, and_out;
27
28    // Shifter IO
29    wire [31:0] sft_data, sft_out;
30    wire [4:0] sft_shamt;
31    wire sft_arith, sft_left;
32
33    // Multiplier IO
34    wire [63:0] mul_out;
35    wire mul_neg;
36
37    // Divider IO
38    wire [31:0] div_q, div_r, div_dividend, div_divisor;
39    wire [31:0] div_rmdr, div_qtnt;
40    wire div_iNegA, div_iNegB, div_neg;
41
42    // ROL / ROR IO
43    wire [31:0] ROR_out, ROL_out;
```

```

45 // NOT
46 wire [31:0] NOT_out;
47
48 // Adder/Subtract
49
50 // XOR input B for subtraction and set carry to 1
51 assign cla_iA = (iCtrl == `CTRL_ALU_NEG) ? 32'd0 : iA;
52 // assign cla_iA = (iCtrl == `CTRL_ALU_NEG) ? 32'd0 : iA;
53
54 v assign cla_iB = (iCtrl == `CTRL_ALU_SUB) ? 32'hFFFFFFF ^ iB :
55 | | | | (iCtrl == `CTRL_ALU_NEG) ? 32'hFFFFFFF ^ iA : iB;
56
57 assign cla_iCarry = (iCtrl == `CTRL_ALU_SUB || iCtrl == `CTRL_ALU_NEG);
58
59 v CLA cla(
60   .iX(cla_iA),
61   .iY(cla_iB),
62   .iCarry(cla_iCarry),
63   .oS(cla_out),
64   .oCarry(cla_oCarry),
65   .oOverflow(cla_overflow),
66   .oZero(cla_zero),
67   .oNegative(cla_neg)
68 );
69
70 // OR
71 v ALU_OR bor(
72   .iA(iA),
73   .iB(ib),
74   .oC(or_out)
75 );
76
77 // XOR
78 v ALU_XOR bxor(
79   .iA(iA),
80   .iB(ib),
81   .oC(xor_out)
82 );
83
84 // AND
85 v ALU_AND band(
86   .iA(iA),
87   .iB(ib),
88   .oC(and_out)
89 );

```

```

91 // Bit Shifter
92 assign sft_data = iA;
93 assign sft_shamt = iB[4:0];
94 // Negate Arithmetic shift (logic low)
95 assign sft_arith = ~(iCtrl == `CTRL_ALU_SRA);
96 assign sft_left = ~(iCtrl == `CTRL_ALU_SLL);
97
98 // SHIFT sft(
99 |   .iD(sft_data),
100 |   .iShamt(sft_shamt),
101 |   .nArith(sft_arith),
102 |   .nLeft(sft_left),
103 |   .oD(sft_out)
104 );
105
106 // Multiplier
107
108 // MUL32 mul(
109 |   .iA(iA),
110 |   .iB(iB),
111 |   .oP(mul_out)
112 );
113
114 // assign mul_out = iA * iB;
115 assign mul_neg = mul_out[63];
116
117 // Divider
118 assign div_iNegA = iA[31];
119 assign div_iNegB = iB[31];
120
121 // If either is negative, the quotient must be negative
122 assign div_neg = div_iNegA ^ div_iNegB;
123
124 // If the divisor or dividend is negative, make it positive
125 assign div_dividend = div_iNegA ? 32'hFFFFFFFF ^ (iA - 1) : iA;
126 assign div_divisor = div_iNegB ? 32'hFFFFFFFF ^ (iB - 1) : iB;
127
128 // DIV32 div(
129 |   .iQ(div_divisor),
130 |   .iD(div_dividend),
131 |   .oQ(div_q),
132 |   .oR(div_r)
133 );
134
135 // Quotient will be negative if A or B is negative but not both
136 assign div_qnt = div_neg ? (32'hFFFFFFFF ^ div_q) + 1 : div_q;
137 // Remainder carries the same sign as the dividend
138 assign div_rmdr = div_iNegB ? (32'hFFFFFFFF ^ div_r) + 1 : div_r;
139
140 // ROR
141 // ROR ror(
142 |   .iD(iA),
143 |   .iShamt(iB[4:0]),
144 |   .oD(ROR_out)
145 );
146
147 // ROL
148 // ROL rol(
149 |   .iD(iA),
150 |   .iShamt(iB[4:0]),
151 |   .oD(ROL_out)
152 );
153
154 // NOT
155 // generate
156 |   genvar i;
157 |   for(i = 0; i < 32; i = i + 1) begin : alu_not_gen
158 |     not (NOT_out[i], iA[i]);
159 |   end
160 endgenerate

```

```

162 // Module Outputs
163 // Set low output register
164 `assign out_lo = (iCtrl == `CTRL_ALU_ADD) ? cla_out :
165     (iCtrl == `CTRL_ALU_SUB) ? cla_out :
166     (iCtrl == `CTRL_ALU_OR) ? or_out :
167     (iCtrl == `CTRL_ALU_XOR) ? xor_out :
168     (iCtrl == `CTRL_ALU_AND) ? and_out :
169     (iCtrl == `CTRL_ALU_MUL) ? mul_out[31:0] :
170     (iCtrl == `CTRL_ALU_DIV) ? div_gnt :
171     (iCtrl == `CTRL_ALU_SLL) ? sft_out :
172     (iCtrl == `CTRL_ALU_SRL) ? sft_out :
173     (iCtrl == `CTRL_ALU_SRA) ? sft_out :
174     (iCtrl == `CTRL_ALU_ROR) ? ROR_out :
175     (iCtrl == `CTRL_ALU_ROL) ? ROL_out :
176     (iCtrl == `CTRL_ALU_NOT) ? NOT_out :
177     (iCtrl == `CTRL_ALU_NEG) ? cla_out :
178     32'h00000000;
179
180 // Set high output register (Zero on anything not needing 64 bits)
181 `assign out_hi = (iCtrl == `CTRL_ALU_MUL) ? mul_out[63:32] :
182     (iCtrl == `CTRL_ALU_DIV) ? div_rmdr :
183     32'h00000000;
184
185 // Output register
186 assign oC_lo = out_lo;
187 assign oC_hi = out_hi;
188
189 // Assign the negative outputs based on the control inputs
190 `assign oNeg = (iCtrl == `CTRL_ALU_ADD) ? cla_neg :
191     (iCtrl == `CTRL_ALU_SUB) ? cla_neg :
192     (iCtrl == `CTRL_ALU_MUL) ? mul_neg :
193     (iCtrl == `CTRL_ALU_DIV) ? div_neg :
194     out_lo[31];
195
196 // Assign zero if both the high and low registers are zero
197 assign oZero = ({out_hi, out_lo} == 64'd0);
198
199
200 endmodule

```

## AND module

```

1 module ALU_AND(
2     iA, iB, oC
3 );
4 input wire [31:0] iA, iB;
5 output wire [31:0] oC;
6
7 generate
8     genvar i;
9     for(i = 0; i < 32; i = i + 1) begin : alu_and_gen
10        and (oC[i], iA[i], iB[i]);
11    end
12 endgenerate
13
14 endmodule

```

## CLA module

```
1 // Carry Lookahead Adder
2 module CLA(
3     iX,
4     iY,
5     iCarry,
6     oS,
7     oCarry,
8     oOverflow,
9     oZero,
10    oNegative
11 );
12
13 input wire iCarry;
14 input wire [31:0] iX, iY;
15 output wire [31:0] oS;
16 output wire oCarry, oOverflow, oZero, oNegative;
17
18 wire [3:0] C, O;
19
20 assign C[0] = iCarry;
21
22 CLA8 add1(
23     .iX(iX[7:0]),
24     .iY(iY[7:0]),
25     .iCarry(C[0]),
26     .oCarry(C[1]),
27     .oS(oS[7:0]),
28     .oOverflow()
29 );
30
31 CLA8 add2(
32     .iX(iX[15:8]),
33     .iY(iY[15:8]),
34     .iCarry(C[1]),
35     .oCarry(C[2]),
36     .oS(oS[15:8]),
37     .oOverflow()
38 );
39
40 CLA8 add3(
41     .iX(iX[23:16]),
42     .iY(iY[23:16]),
43     .iCarry(C[2]),
44     .oCarry(C[3]),
45     .oS(oS[23:16]),
46     .oOverflow()
47 );
48
49 CLA8 add4(
50     .iX(iX[31:24]),
51     .iY(iY[31:24]),
52     .iCarry(C[3]),
53     .oCarry(oCarry),
54     .oS(oS[31:24]),
55     .oOverflow(oOverflow)
56 );
57
58 assign oZero = ~oS;
59 assign oNegative = oS[31];
60 assign oOverflow = O[3];
61
62 endmodule
63
64 module CLA8(iX, iY, iCarry, oCarry, oS, oOverflow);
65 input wire [7:0] iX, iY;
66 input wire iCarry;
67 output wire oCarry, oOverflow;
68 output wire [7:0] oS;
69
70 wire [7:0] G, P;
71 wire [8:0] C;
72
73 // Generate the Generate and Propagate Signals
74 assign G = iX & iY;
75 assign P = iX | iY;
76
```

## DIV32 module

```
1 // ONLY Accepts positive numbers
2 module DIV32(
3   iQ, iD,
4   oQ, oR
5 );
6
7   input wire [31:0] iQ, iD;
8   output wire [31:0] oQ, oR;
9
10  wire [31:0] Q;
11  wire [31:0] A[31:0];
12
13  // Initialize the first partial remainder as 0
14  assign A[0] = 32'd0;
15
16  // Create 31 divisor levels
17 generate
18   genvar i;
19   for(i = 0; i < 31; i = i + 1) begin : gen_div_lvl
20     DIV_LEVEL divlvl (
21    .iA(A[i]),
22    .iQ(iQ[31-i]),
23    .iD(iD),
24    .oA(A[i+1]),
25    .oQ(Q[31-i])
26   );
27   end
28 endgenerate
29
30 // Final divisor Level
31 wire [31:0] shift, X;
32
33  assign shift = {A[31][30:0], iQ[0]};
34  assign X = shift[31] ? shift + iD : shift - iD;
35  assign Q[0] = ~X[31];
36
37 // Final divisor Level
38 wire [31:0] shift, X;
39
40  assign shift = {A[31][30:0], iQ[0]};
41  assign X = shift[31] ? shift + iD : shift - iD;
42  assign Q[0] = ~X[31];
43
44 // Remainder and output assignments
45 assign oR = X[31] ? X + iD : X;
46 assign oQ = Q;
47
48 endmodule
49
50 module DIV_LEVEL(
51   iA, iQ, iD,
52   oA, oQ
53 );
54
55   input wire [31:0] iA, iD;
56   input wire iQ;
57   output wire [31:0] oA;
58   output wire oQ;
59
60   wire [31:0] shift;
61
62 // Shift in the next bit of the quotient
63 assign shift = {iA[30:0], iQ};
64 // Add or subtract the divisor
65 assign oA = shift[31] ? shift + iD : shift - iD;
66 // Output 1 to result if the add/sub result was positive
67 assign oQ = ~oA[31];
68
69 endmodule
```

## MUL32 module

```
1 // TODO: Implement the ability to do this with unsigned numbers also.
2 // TODO: Combine all the carry-save adder generate blocks into one generate block.
3 // EXTRA: index the carry-save adder structure so this multiplier works for N being any power of 2.
4
5 // This assumes that the inputs are 32-bit 2's-complement signed integers.
6 // Multiply (multiplicand) A by (multiplier) B to get (product) P.
7 module MUL32 (input signed [31:0] iA, input signed [31:0] iB, output signed [63:0] oP);
8
9 localparam N = 32;
10
11 // The number of booth encoded values.
12 localparam BTH = (N+1)/2; // = 16
13 // For an ideal cascade of 4-to-2 reducers to align with the number of inputs, we need N to be a power of 2.
14 // Number of levels of Carry-Save Addition (CSA) using 4-to-2 reducers.
15 // localparam CSA_levels = $clog2(BTH); // = 4
16 // # of 4-to-2 reducers in each level of CSA.
17 // localparam reducers_in_CSA_level1 = BTH/4; // = 4
18 // localparam reducers_in_CSA_level2 = CSA_level1*2/4; // = 2
19 // localparam reducers_in_CSA_level3 = CSA_level2*2/4; // = 1
20 // Final Carry-Propagate Addition (CPA) is done using the final 2 outputs from the last level of CSA.
21
22 // Extend by 1 bit to handle negation
23 wire signed [N+1:0] A, negA;
24 wire signed [N+1:0] A2, negA2;
25 wire [N+1:0] notA;
26 wire [BTH-1:0] booth_sign;
27 wire [1:0] booth_magnitude [BTH-1:0];
28 wire [2*BTH-1:0] flat_booth_magnitude;
29
30 assign A = {2{iA[N-1]}}, iA;
31 assign negA = notA + 1'b1;
32 assign A2 = {iA[N-1], iA, 1'b0};
33 assign negA2 = {negA[N:0], 1'b0};
34
35 // Compute the Booth Encoding of the multiplier (B).
36 genvar i;
37 generate
38   for (i = 0; i < N+2; i = i + 1) begin : gen_notA
39     assign notA[i] = ~A[i];
40   end
41   for (i = 0; i < BTH; i = i + 1) begin : map_booth_magnitude
42     assign booth_magnitude[i] = {flat_booth_magnitude[i+BTH], flat_booth_magnitude[i]};
43   end
44 endgenerate
```

```

46 BoothEncode_2bit_Nbit #(N) be2bit(iB, booth_sign, flat_booth_magnitude[2*BTH-1:BTH], flat_booth_magnitude[BTH-1:0]);
47
48 wire signed [N+1:0] initialValue[BTH-1:0];
49
50 generate
51   for (i = 0; i < BTH; i = i + 1) begin : gen_initialValue
52     assign initialValue[i] = booth_magnitude[i][1] ? (booth_sign[i] ? negA2 : A2)
53     : (booth_magnitude[i][0] ? (booth_sign[i] ? negA : A)
54     : {(N+1){1'b0}});
55   end
56 endgenerate
57 /*
58 the s's in this diagram represent the sign bit of the initial values.
59 the z's in this diagram represent the padded zero bit of the initial values.
60 | 00000000
61 sssss0000000000
62 sss000000000zzz
63 ss000000000zzzz
64 000000000zzzzz
65 */
66 wire signed [2*N-1:0] shiftedInitialValue[BTH-1:0];
67 generate
68   for (i = 0; i < BTH; i = i + 1) begin : gen_shiftedInitialValue
69     assign shiftedInitialValue[i] = {((N-2*i){initialValue[i][N+1]}), initialValue[i], {(2*i){1'b0}}};
70   end
71 endgenerate
72 // CSA Layer 1: 16 numbers (34-bit each+shifts) -> 4*4-to-2 reducers. -> 8 numbers (64-bit each).
73 // Carry-Save Addition using 4-to-2 reducers.
74 /* same as this but with 32-bit numbers.
75 0 0000 0000 0000 0000
76 -----0000000000000000 i = 0, j = 0
77 -----0000000000000000-- j = 1
78 --0000000000000000---- j = 2
79 0000000000000000----- j = 3
80 ..... i = 1, j = 0
81 */
82 // need to make sure that sign extension is done properly.
83 // The lowest 2 bits are piped directly to the output.
84 // We can ignore the upper bit from each reducer because it is place value 2*N, and the final result (oP) is only 2*N bits (not 2*N+1).
85 wire signed [2*N-1:2] iCSA1[3:0][3:0];
86 wire signed [2*N:2] oCSA1[3:0][1:0];
87 genvar j;

```

```

88  v generate
89      // i is the index of each reducer in the first layer.
90  v     for (i = 0; i < 4; i = i + 1) begin : gen_CSA1
91  v         for (j = 0; j < 4; j = j + 1) begin : gen_CSA1_shift
92             assign iCSA1[i][j] = shiftedInitialValue[4*i+j][2*N-1:2];
93         end
94     Reducer4to2_Nbit #(2*N-2) CSA1_i(
95         .iW(iCSA1[i][3]), .iX(iCSA1[i][2]), .iY(iCSA1[i][1]), .iZ(iCSA1[i][0]), .iCarry(1'b0),
96         .oSum1(oCSA1[i][1][2*N:3]), .oSum0(oCSA1[i][0][2*N:2]));
97     assign oCSA1[i][1][2] = 1'b0;
98   end
99 endgenerate
100
101 // CSA Layer 2: 8 numbers -> 2*4-to-2 reducers. -> 4 numbers
102 wire [2*N-1:2] iCSA2[1:0][3:0];
103 wire [2*N:2] oCSA2[1:0][1:0];
104 v generate
105 v     for (i = 0; i < 2; i = i + 1) begin : gen_CSA2
106 v         for (j = 0; j < 4; j = j + 1) begin : gen_CSA2_shift
107             assign iCSA2[i][j] = oCSA1[2*i+j/2][j%2][2*N-1:2];
108         end
109     Reducer4to2_Nbit #(2*N-2) CSA2_i(
110         .iW(iCSA2[i][3]), .iX(iCSA2[i][2]), .iY(iCSA2[i][1]), .iZ(iCSA2[i][0]), .iCarry(1'b0),
111         .oSum1(oCSA2[i][1][2*N:3]), .oSum0(oCSA2[i][0][2*N:2]));
112     assign oCSA2[i][1][2] = 1'b0;
113   end
114 endgenerate
115
116 // CSA Layer 3: 4 numbers -> 1*4-to-2 reducer -> 2 numbers
117 wire [2*N-1:2] iCSA3[3:0];
118 wire [2*N:2] oCSA3[1:0];
119 v generate
120 v     for (i = 0; i < 4; i = i + 1) begin : gen_CSA3
121         assign iCSA3[i] = oCSA2[i/2][i%2][2*N-1:2];
122     end
123 endgenerate
124 v Reducer4to2_Nbit #(2*N-2) CSA3(
125     .iW(iCSA3[3]), .iX(iCSA3[2]), .iY(iCSA3[1]), .iZ(iCSA3[0]), .iCarry(1'b0),
126     .oSum1(oCSA3[1][2*N:3]), .oSum0(oCSA3[0][2*N:2]));
127 assign oCSA3[1][2] = 1'b0;
128
129 // Carry-Propagate Addition using final 2 outputs from carry-save adders.
130 assign oP = {oCSA3[1][2*N-1:2] + oCSA3[0][2*N-1:2], initialValue[0][1:0]};
131

```

```

134 // module BoothEncode_2bit_Nbit #(parameter N = 32) (input signed [N-1:0] iA, output [(N+1)/2-1:0] oSign,
135 // output [(N+1)/2-1:0] oMagnitude1, output [(N+1)/2-1:0] oMagnitude0);
136
137 wire [N+1:0] A2;
138 assign A2 = {iA[N-1], iA, 1'b0};
139
140
141 genvar i;
142 generate
143 for (i = 1; i+1 <= N+1; i = i + 2) begin : gen
144 | BoothEncode_2bit be2bit(A2[i+1:i-1], oSign[i/2], {oMagnitude1[i/2], oMagnitude0[i/2]} );
145 end
146 endgenerate
147 endmodule
148
149
150 // 0, 1, 2, -2, -1,
151 // 1 bit for sign: 0 = positive/zero, 1 = negative
152 // 2 bits for magnitude: 0 = 00, 1 = 01, 2 = 10
153 // table
154 //    // iA : oSign oMagnitude
155 //    3'b000 : 0      2'b00;
156 //    3'b001 : 0      2'b01;
157 //    3'b010 : 0      2'b01;
158 //    3'b011 : 0      2'b10;
159 //    3'b100 : 1      2'b10;
160 //    3'b101 : 1      2'b01;
161 //    3'b110 : 1      2'b01;
162 //    3'b111 : 0      2'b00; // making sure that there is no sign for zero.
163 // endtable
164 module BoothEncode_2bit(input [2:0] iA, output oSign, output [1:0] oMagnitude);
165
166 assign oSign = iA[2] & (~iA[1] | ~iA[0]);
167 assign oMagnitude[0] = iA[1] ^ iA[0];
168 assign oMagnitude[1] = (iA[2] && !(iA[1] || iA[0])) || (!iA[2] && (iA[1] && iA[0]));
169
170 endmodule
171
172 module Reducer4to2_Nbit #(parameter N = 32) ([input [N-1:0] iW, input [N-1:0] iX, input [N-1:0] iY, input [N-1:0] iZ,
173 input iCarry, output [N-1:0] oSum1, output [N:0] oSum0];
174
175 wire carry [N:0];
176
177 assign carry[0] = iCarry;
178
179 genvar i;
180 generate
181 for (i = 0; i < N; i = i + 1) begin : gen
182 | Reducer4to2 r4to2({iW[i], iX[i], iY[i], iZ[i]}, carry[i], {oSum1[i], oSum0[i]}, carry[i+1]);
183 end
184 endgenerate
185
186 assign oSum0[N] = carry[N];
187
188 endmodule
189
190 // oCarry is of the same place value of the most significant bit of the sum. https://www.geoffknagge.com/fyp/carrysave.shtml
191 module Reducer4to2 (input [3:0] iA, input iCarry, output [1:0] oSum, output oCarry);
192 wire w, x, y, z;
193 assign {w, x, y, z} = iA;
194
195 // used Karnaugh Maps to simplify the equations (https://www.charlie-coleman.com/experiments/kmap/)
196
197 assign oSum[1] = (w & x & y & z) | (iCarry & (w ^ x ^ y ^ z));
198
199 assign oSum[0] = iCarry ^ w ^ x ^ y ^ z;
200
201 assign oCarry = (w | x | y) & (w | x | z) & (w | y | z) & (x | y | z);
202
203 endmodule

```

## OR module

```
1  module ALU_OR(
2      iA, iB, oC
3  );
4  input wire [31:0] iA, iB;
5  output wire [31:0] oC;
6
7  generate
8      genvar i;
9      for(i = 0; i < 32; i = i + 1) begin : alu_or_gen
10         or (oC[i], iA[i], iB[i]);
11     end
12 endgenerate
13
14 endmodule
```

## ROL and ROR module

```
1  module ROL(
2      iD, iShamt,
3      oD
4  );
5
6  input wire [31:0] iD;
7  input wire [4:0] iShamt;
8  output wire [31:0] oD;
9
10 wire [31:0] S1, S2, S3, S4, S5;
11
12 assign S1 = iShamt[0] ? {iD[30:0], iD[31]} : iD;
13 assign S2 = iShamt[1] ? {S1[29:0], S1[31:30]} : S1;
14 assign S3 = iShamt[2] ? {S2[27:0], S2[31:28]} : S2;
15 assign S4 = iShamt[3] ? {S3[23:0], S3[31:24]} : S3;
16 assign S5 = iShamt[4] ? {S4[15:0], S4[31:16]} : S4;
17
18 assign oD = S5;
19
20 endmodule
21
22 module ROR(
23      iD, iShamt,
24      oD
25  );
26
27 input wire [31:0] iD;
28 input wire [4:0] iShamt;
29 output wire [31:0] oD;
30
31 wire [31:0] S1, S2, S3, S4, S5;
32
33 assign S1 = iShamt[0] ? {iD[0], iD[31:1]} : iD;
34 assign S2 = iShamt[1] ? {S1[1:0], S1[31:2]} : S1;
35 assign S3 = iShamt[2] ? {S2[3:0], S2[31:4]} : S2;
36 assign S4 = iShamt[3] ? {S3[7:0], S3[31:8]} : S3;
37 assign S5 = iShamt[4] ? {S4[15:0], S4[31:16]} : S4;
38
39 assign oD = S5;
40
41 endmodule
```

## SHIFT module

```
1 ∵ module SHIFT(
2     iD, iShamt,
3     nArith, nLeft,
4     oD
5 );
6
7     input wire [31:0] iD;
8     input wire [4:0] iShamt;
9     input wire nArith, nLeft;
10    output wire [31:0] oD;
11
12    wire [31:0] left, right_arith, right_logic, right;
13
14 ∵ SHIFT_LEFT leftshift(
15     .iD(iD),
16     .iShamt(iShamt),
17     .oD(left)
18 );
19
20 ∵ SHIFT_RIGHT_ARITH rightarith(
21     .iD(iD),
22     .iShamt(iShamt),
23     .oD(right_arith)
24 );
25
26 ∵ SHIFT_RIGHT_LOGIC rightlogic(
27     .iD(iD),
28     .iShamt(iShamt),
29     .oD(right_logic)
30 );
31
32     assign right = nArith ? right_arith : right_arith;
33     assign oD = nLeft ? right : left;
34
35 endmodule
36
37 ∵ module SHIFT_LEFT(
38     iD, iShamt,
39     oD
40 );
41
42     input wire [31:0] iD;
43     input wire [4:0] iShamt;
44     output wire [31:0] oD;
45
46     wire [31:0] S1, S2, S3, S4, S5;
47
48     assign S1 = iShamt[0] ? {iD[30:0], 1'd0} : iD;
49     assign S2 = iShamt[1] ? {S1[29:0], 2'd0} : S1;
50     assign S3 = iShamt[2] ? {S2[27:0], 4'd0} : S2;
51     assign S4 = iShamt[3] ? {S3[23:0], 8'd0} : S3;
52     assign S5 = iShamt[4] ? {S4[15:0], 16'd0} : S4;
53     assign oD = S5;
54
55 endmodule
56
57 module SHIFT_RIGHT_ARITH(
58     iD, iShamt,
59     oD
60 );
61
62     input wire [31:0] iD;
63     input wire [4:0] iShamt;
64     output wire [31:0] oD;
65
66     wire [31:0] S1, S2, S3, S4, S5;
67     assign S1 = iShamt[0] ? {{1{iD[31]}}, iD[31:1]} : iD;
68     assign S2 = iShamt[1] ? {{2{S1[31]}}, S1[31:2]} : S1;
69     assign S3 = iShamt[2] ? {{4{S2[31]}}, S2[31:4]} : S2;
70     assign S4 = iShamt[3] ? {{8{S3[31]}}, S3[31:8]} : S3;
71     assign S5 = iShamt[4] ? {{16{S4[31]}}, S4[31:16]} : S4;
72     assign oD = S5;
73
74 endmodule
75
76 module SHIFT_RIGHT_LOGIC(
77     iD, iShamt,
78     oD
79 );
80
81     input wire [31:0] iD;
82     input wire [4:0] iShamt;
83     output wire [31:0] oD;
84
85     wire [31:0] S1, S2, S3, S4, S5;
86     assign S1 = iShamt[0] ? {1'd0, iD[31:1]} : iD;
87     assign S2 = iShamt[1] ? {2'd0, S1[31:2]} : S1;
88     assign S3 = iShamt[2] ? {4'd0, S2[31:4]} : S2;
89     assign S4 = iShamt[3] ? {8'd0, S3[31:8]} : S3;
90     assign S5 = iShamt[4] ? {16'd0, S4[31:8]} : S4;
91     assign oD = S5;
92
93 endmodule
```

## XOR module

```
1 ∵ module ALU_XOR(
2     iA, iB, oC
3 );
4     input wire [31:0] iA, iB;
5     output wire [31:0] oC;
6
7 ∵ generate
8     genvar i;
9 ∵     for(i = 0; i < 32; i = i + 1) begin : alu_xor_gen
10        xor (oC[i], iA[i], iB[i]);
11    end
12 endgenerate
13
14 endmodule
```

# Control Code

## Control Module

```
1 `include "ISA.vh"
2 `include "ALU.vh"
3
4
5 module Control (
6     // Clock, reset and ready signals
7     // Ready is an active high that allows the next step to continue
8     iClk, nRst, iRdy,
9     // Memory Signals/Control
10    iMemData, oMemRead, oMemWrite,
11    // Pipe Control
12    oPipe_nRst,
13    // Program Counter Control
14    oPC_nRst, oPC_en, oPC_tmpEn, oPC_load, oPC_offset,
15    // Register File Control
16    oRF_Write,
17    oRF_AddrA, oRF_AddrB, oRF_AddrC,
18    // Write Back Register Control
19    oRWB_en,
20    // ALU Control
21    oALU_Ctrl, oRA_en, oRB_en,
22    oRZH_en, oRZL_en, oRAS_en,
23    // Jump Feedback
24    iJ_zero, iJ_nZero, iJ_pos, iJ_neg,
25    // Port Register enable
26    oREP_en,
27    // Multiplexers
28    oMUX_BIS, oMUX_RZHS, oMUX_WBM, oMUX_MAP, oMUX_ASS, oMUX_WBP, oMUX_WBE,
29    // Imm32 Output
30    oImm32
31 );
32
33 // Clock, reset and ready signals
34 // Ready is an active high that allows the next step to continue
35 input wire iClk, nRst, iRdy;
36 // Memory Signals/Control
37 input wire [31:0] iMemData;
38 output wire oMemRead, oMemWrite;
39 // Pipe Control
40 output wire oPipe_nRst;
41 // Program Counter Control
42 output wire oPC_nRst, oPC_en, oPC_tmpEn, oPC_load, oPC_offset;
43 // Register File Control
44 output wire oRF_Write;
45 output wire [3:0] oRF_AddrA, oRF_AddrB, oRF_AddrC;
```

```

46 // Write Back Register Control
47 output wire oRWB_en;
48 // ALU Control
49 output wire [3:0] oALU_Ctrl;
50 output wire oRA_en, oRB_en;
51 output wire oRZH_en, oRZL_en, oRAS_en;
52 // Jump Feedback
53 input wire iJ_zero, iJ_nZero, iJ_pos, iJ_neg;
54 // External Port Enable
55 output wire oREP_en;
56 // Multiplexers
57 output wire oMUX_BIS, oMUX_RZHS, oMUX_WBM, oMUX_MAP, oMUX_ASS, oMUX_WBP, oMUX_WBE;
58 // Imm32 Output
59 output wire [31:0] oImm32;
60
61 // Step Counter
62 reg [5:1] Cycle;
63
64 // IR
65 wire IR_en;
66 wire [31:0] IR_out;
67
68 // Decoder IO
69 wire [3:0] ID_RA, ID_RB, ID_RC;
70 wire [4:0] ID_OpCode;
71 wire [31:0] ID_imm32, ID_BRD;
72 wire [1:0] ID_BRC;
73
74 // OpCode R-Format Wires
75 wire OP_LD, OP_LI, OP_ST, OP_ADD, OP_SUB, OP_AND,
76 | OP_OR, OP_ROR, OP_ROL, OP_SRL, OP_SRA, OP_SLL;
77 // OpCode I-Format Wires
78 wire OP_ADDI, OP_ANDI, OP_ORI, OP_DIV, OP_MUL, OP_NEG, OP_NOT;
79 // OpCode B-Format Wires
80 wire OP_BRx;
81 // OpCode J-Format Wires
82 wire OP_JAL, OP_JFR, OP_IN, OP_OUT, OP_MFL, OP_MFH;
83 // OpCode M-Format Wires
84 wire OP_NOP, OP_HLT;
85 // OpCode Format Wires
86 // (Useful for data path MUX Assignments)
87 wire OPF_R, OPF_I, OPF_B, OPF_J, OPF_M;
88 // Branch Conditional Wires
89 wire BR_ZERO, BR_NZRO, BR_POS, BR_NEG;
90 wire BR_TRUE;

```

```

92  // Assign Cycle
93  always @(posedge iClk or negedge nRst)
94  begin
95    if(!nRst)
96      Cycle = 5'b00001;
97    else begin
98      if(iRdy & ~OP_HLT) Cycle = {Cycle[4:1], Cycle[5]};
99    end
100 end
101
102 // Instruction Register
103 assign IR_en = Cycle[1];
104 REG32 IR(.iClk(iClk), .nRst(nRst), .iEn(IR_en), .iD(iMemData), .oQ(IR_out));
105
106 // Decoder
107 Decode decoder(
108   .iINS(IR_out),
109   .oImm32(ID_imm32),
110   .oRa(ID_RA),
111   .oRb(ID_RB),
112   .oRc(ID_RC),
113   .oCode(ID_OpCode),
114   // Branch Distance
115   .oBRD(ID_BRD),
116   // Branch Code
117   .oBRC(ID_BRC)
118 );
119
120 // Assign OP-Code Types
121
122 // Assign R-Format Wires
123 assign OP_ADD = (ID_OpCode == `ISA_ADD);
124 assign OP_SUB = (ID_OpCode == `ISA_SUB);
125 assign OP_AND = (ID_OpCode == `ISA_AND);
126 assign OP_OR = (ID_OpCode == `ISA_OR);
127 assign OP_ROR = (ID_OpCode == `ISA_ROR);
128 assign OP_ROL = (ID_OpCode == `ISA_ROL);
129 assign OP_SRL = (ID_OpCode == `ISA_SRL);
130 assign OP_SRA = (ID_OpCode == `ISA_SRA);
131 assign OP_SLL = (ID_OpCode == `ISA_SLL);
132 // Opcode Format Wire (Useful for data path MUX Assignments)
133 assign OPF_R = (OP_ADD || OP_SUB || OP_AND || OP_OR || OP_ROR || OP_ROL || OP_SRL || OP_SRA || OP_SLL);
134 // Assign I-Format Wires
135 assign OP_LD = (ID_OpCode == `ISA_LD);
136 assign OP_LI = (ID_OpCode == `ISA_LI);

```

```

137 assign OP_ST = (ID_OpCode == `ISA_ST);
138 assign OP_ADDI = (ID_OpCode == `ISA_ADDI) || OP_LI;
139 assign OP_ANDI = (ID_OpCode == `ISA_ANDI);
140 assign OP_ORI = (ID_OpCode == `ISA_ORI);
141 assign OP_DIV = (ID_OpCode == `ISA_DIV);
142 assign OP_MUL = (ID_OpCode == `ISA_MUL);
143 assign OP_NEG = (ID_OpCode == `ISA_NEG);
144 assign OP_NOT = (ID_OpCode == `ISA_NOT);
145 // Opcode Format Wire (Useful for data path MUX Assignments)
146 assign OPF_I = (OP_LD || OP_ST || OP_ADDI || OP_ANDI || OP_ORI || OP_DIV || OP_MUL || OP_NEG || OP_NOT);
147 // Assign B-Format Wires
148 assign OP_BRx = (ID_OpCode == `ISA_BRx);
149 // Opcode Format Wire (Useful for data path MUX Assignments)
150 assign OPF_B = OP_BRx;
151 // Assign J-Format Wires
152 assign OP_JAL = (ID_OpCode == `ISA_JAL);
153 assign OP_JFR = (ID_OpCode == `ISA_JFR);
154 assign OP_IN = (ID_OpCode == `ISA_IN);
155 assign OP_OUT = (ID_OpCode == `ISA_OUT);
156 assign OP_MFL = (ID_OpCode == `ISA_MFL);
157 assign OP_MFH = (ID_OpCode == `ISA_MFH);
158 // Opcode Format Wire (Useful for data path MUX Assignments)
159 assign OPF_J = (OP_JAL || OP_JFR || OP_MFL || OP_MFH || OP_IN || OP_OUT);
160 // Assign M-Format Wires
161 assign OP_NOP = (ID_OpCode == `ISA_NOP);
162 assign OP_HLT = (ID_OpCode == `ISA_HLT);
163 // Opcode Format Wire (Useful for data path MUX Assignments)
164 assign OPF_M = (OP_NOP || OP_HLT);
165
166 // Assign Branch Wires
167 // ij_xxx based on RF_RB in data path
168 assign BR_ZERO = (ID_BRC == `ISA_BR_ZERO) && ij_zero;
169 assign BR_NZRO = (ID_BRC == `ISA_BR_NZRO) && ij_nZero;
170 assign BR_POS = (ID_BRC == `ISA_BR_POSI) && ij_pos;
171 assign BR_NEG = (ID_BRC == `ISA_BR_NEGL) && ij_neg;
172 assign BR_TRUE = (BR_ZERO || BR_NZRO || BR_POS || BR_NEG) && OP_BRx;
173
174 // Assign Control outputs based on Codes and Cycle
175
176 // Pipe Reset Signal
177 assign oPipe_nRst = nRst;
178
179 // Program Counter Control Signals
180 // PC Reset (Should only be reset on CPU reset)
181 assign oPC_nRst = nRst;

```

```

182 // PC Load Enable
183 assign oPC_en = Cycle[1] || (Cycle[3] && (BR_TRUE || OP_JAL || OP_JFR));
184 assign oPC_tmpEn = Cycle[1];
185 // PC Jump Enable
186 assign oPC_offset = Cycle[3] && BR_TRUE;
187 assign oPC_load = Cycle[3] && (OP_JFR || OP_JAL);
188
189 // Register File Control Signals
190 assign oRF_Write = Cycle[5] && (OPF_R || (OPF_I && ~OP_DIV && ~OP_MUL && ~OP_ST) || OP_MFH || OP_MFL || OP_JAL || OP_IN);
191 // Note: Most ISA's use RC as the write back address, MiniSRC uses RA
192 // RA is dependent on ISA type, use R0 if RA is not specified
193 // RA is used to load PC on JMP/JAL
194 assign oRF_AddrA = (OPF_R | OPF_I) ? ID_RB :
195 | (OPF_J) ? ID_RA : 4'h0;
196 // RB is dependent on ISA type, use R0 if RB is not specified
197 assign oRF_AddrB = (OPF_I | OPF_B | OPF_J) ? ID_RA :
198 | (OPF_R) ? ID_RC : 4'h0;
199 // Store is always RA
200 // ISA Specification states to store PC in r15 on JAL (Jump and Link)
201 assign oRF_AddrC = (OP_JAL) ? 4'h8 : ID_RA;
202
203 // Register File Write Back Register Load Enable
204 assign oRWB_en = 1'b1;
205
206 // ALU Control Signals Also, this should be renamed to "Ctrl" like the key on the keyboard.
207 assign oALU_Ctrl = (OP_ADD || OP_ADDI) ? `CTRL_ALU_ADD :
208 | (OP_SUB) ? `CTRL_ALU_SUB :
209 | (OP_OR || OP_ORI) ? `CTRL_ALU_OR :
210 | (OP_AND || OP_ANDI) ? `CTRL_ALU_AND :
211 | (OP_MUL) ? `CTRL_ALU_MUL :
212 | (OP_DIV) ? `CTRL_ALU_DIV :
213 | (OP_SLL) ? `CTRL_ALU_SLL :
214 | (OP_SRL) ? `CTRL_ALU_SRL :
215 | (OP_SRA) ? `CTRL_ALU_SRA :
216 | (OP_ROR) ? `CTRL_ALU_ROR :
217 | (OP_ROL) ? `CTRL_ALU_ROL :
218 | (OP_NOT) ? `CTRL_ALU_NOT :
219 | (OP_NEG) ? `CTRL_ALU_NEG :
220 // ALU Add is default for most instructions - so why not remove the (OP_ADD || OP_ADDI) ?
221 | `CTRL_ALU_ADD;
222 // ALU Input A Register Load Enable
223 assign oRA_en = 1'b1;
224 // ALU Input B Register Load Enable
225 assign oRB_en = 1'b1;

```

```

227 // ALU Result High Load EN
228 assign oRZH_en = 1'b1;
229 // ALU Result Low Load EN
230 assign oRZL_en = 1'b1;
231 // ALU Result Save EN
232 assign oRAS_en = (OP_DIV || OP_MUL);
233
234 // External Port Register Enable
235 assign oREP_en = OP_OUT && Cycle[4];
236
237 // ALU B Input Select (Selects Imm)
238 assign oMUX_BIS = OPF_I && ~(OP_DIV || OP_MUL);
239 // ALU Result High Select
240 assign oMUX_RZHS = (OP_MFH);
241 // RF Write Back Select
242 assign oMUX_WBM = (OP_LD);
243 // Memory Address Output Select
244 // assign oMUX_MA = Cycle[1];
245 assign oMUX_MAP = ~((OP_LD || OP_ST) && Cycle[4]);
246 // ALU Storage Select
247 assign oMUX_ASS = (OP_MFL || OP_MFH);
248 // Write Back Program Counter Select
249 assign oMUX_WBP = OP_JAL;
250 // Write Back External Port Select
251 assign oMUX_WBE = OP_IN;
252
253 // Immediate value output
254 // Assign Imm32 branch distance if the branch is true
255 assign oImm32 = OP_BRx ? ID_BRD : ID_imm32;
256
257 // Memory Read/Write Signals
258 assign oMemRead = Cycle[1] || (Cycle[4] && OP_LD);
259 assign oMemWrite = Cycle[4] && OP_ST;
260
261 endmodule

```

## Decode Module

```

1  v module Decode (
2    // Input Instruction
3    iINS,
4    // Immediate Value (Sign Extended)
5    oImm32,
6    // Reg A, Reg B, Reg C addresses
7    oRa, oRb, oRc,
8    // OP Code
9    oCode,
10   // Branch Distance
11   oBRD,
12   // Branch Code
13   oBRC
14 );
15
16 // Taken from instruction formats: section 2.1 of the Processor specifications
17
18 input wire [31:0] iINS;
19 output wire [31:0] oImm32;
20 output wire [3:0] oRa, oRb, oRc;
21 output wire [4:0] oCode;
22 output wire [31:0] oBRD;
23 output wire [1:0] oBRC;
24
25 assign oCode = iINS[31:27];
26 assign oRa = iINS[26:23];
27 assign oRb = iINS[22:19];
28 assign oRc = iINS[18:15];
29 assign oImm32 = {13{iINS[18]}}, iINS[18:0];
30 assign oBRD = {11{iINS[18]}}, iINS[18:0], 2'b00;
31 assign oBRC = iINS[20:19];
32
33 endmodule

```

## PC Module

```
1  module PC #(
2    parameter StartAddr = 32'h00000000
3  ) (
4    iClk,
5    iEn, iTmpEn, nRst,
6    iLoadEn, iOffsetEn,
7    iLoad, iOffset,
8    oPC,
9    oPC_tmp
10 );
11 `include "../constants.vh"
12
13 input wire iClk, iEn, iTmpEn, nRst;
14 input wire iLoadEn, iOffsetEn;
15 input wire [31:0] iLoad, iOffset;
16 output wire [31:0] oPC, oPC_tmp;
17
18 wire [31:0] pc_out, pc_tmp_out, add_in, add_out, pc_in;
19
20 // PC Adder Selection
21 assign add_in = iOffsetEn ? iOffset : `PC_INCREMENT;
22
23 CLA pc_adder(
24   .iX(add_in),
25   .iY(pc_out),
26   .iCarry(1'b0),
27   .oS(add_out),
28   // Intentionally leave these ports unconnected
29   .oCarry(),
30   .oOverflow(),
31   .oZero(),
32   .oNegative()
33 );
34
35 // PC Input Selection
36 assign pc_in = iLoadEn ? iLoad : add_out;
37
38 REG32 #(RESET(StartAddr)) pc(
39   .iClk(iClk),
40   .nRst(nRst),
41   .iEn(iEn),
42   .iD(pc_in),
43   .oQ(pc_out)
44 );
45
46 REG32 pc_tmp(
47   .iClk(iClk),
48   .nRst(nRst),
49   .iEn(iTmpEn),
50   .iD(add_out),
51   .oQ(pc_tmp_out)
52 );
53
54 assign oPC = pc_out;
55 assign oPC_tmp = pc_tmp_out;
56
57 endmodule
```

## Instruction and ALU encoding

```
1 // ALU Verilog Header File
2
3 // Control Parameters
4 `define CTRL_ALU_ADD 4'h0
5 `define CTRL_ALU_SUB 4'h1
6 `define CTRL_ALU_OR 4'h2
7 `define CTRL_ALU_XOR 4'h3
8 `define CTRL_ALU_AND 4'h4
9 `define CTRL_ALU_MUL 4'h5
10 `define CTRL_ALU_DIV 4'h6
11 `define CTRL_ALU_SLL 4'h7
12 `define CTRL_ALU_SRL 4'h8
13 `define CTRL_ALU_SRA 4'h9
14 `define CTRL_ALU_ROR 4'hA
15 `define CTRL_ALU_ROL 4'hB
16 `define CTRL_ALU_NOT 4'hC
17 `define CTRL_ALU_NEG 4'hD
18
19 // Opcode Signatures
20 // R Format Instructions
21 `define ISA_ADD 5'b00011
22 `define ISA_SUB 5'b00100
23 `define ISA_AND 5'b00101
24 `define ISA_OR 5'b00110
25 `define ISA_ROR 5'b00111
26 `define ISA_ROL 5'b01000
27 `define ISA_SRL 5'b01001
28 `define ISA_SRA 5'b01010
29 `define ISA_SLL 5'b01011
30
31 // I Format Instructions
32 `define ISA_LD 5'b00000
33 `define ISA_LI 5'b00001
34 `define ISA_ST 5'b00010
35 `define ISA_ADDI 5'b01100
36 `define ISA_ANDI 5'b01101
37 `define ISA_ORI 5'b01110
38 `define ISA_DIV 5'b01111
39 `define ISA_MUL 5'b10000
40 `define ISA_NEG 5'b10001
41 `define ISA_NOT 5'b10010
42
43 // B Format Instructions
44 `define ISA_BRx 5'b10011
45
46 // J Format Instructions
47 `define ISA_JAL 5'b10100
48 `define ISA_JFR 5'b10101
49 `define ISA_IN 5'b10110
50 `define ISA_OUT 5'b10111
51 `define ISA_MFL 5'b11000
52 `define ISA_MFH 5'b11001
53
54 // M Format Instructions
55 `define ISA_NOP 5'b11010
56 `define ISA_HLT 5'b11011
57
58
59 // Branch Codes
60 // Branch if Zero
61 `define ISA_BR_ZERO 2'b00
62 // Branch if NonZero
63 `define ISA_BR_NZRO 2'b01
64 // Branch if Positive
65 `define ISA_BR_POSI 2'b10
66 // Branch if negative
67 `define ISA_BR_NEGA 2'b11
```

## Datapath module

```
1  module Datapath(
2      // Clock and reset signals (reset is active low)
3      iClk, nRst,
4      // Memory Signals
5      iMemData,
6      oMemData,
7      // Port IO
8      iPORT, oPORT,
9      // Program Counter Control
10     iPC_nRst, iPC_en, iPC_tmPEn, iPC_load, iPC_offset,
11     // Register File Control
12     iRF_Write,
13     iRF_AddrA, iRF_AddrB, iRF_AddrC,
14     // Write Back Register Control
15     iRWB_en,
16     // ALU Control
17     iALU_Ctrl, iRA_en, iRB_en,
18     iRZH_en, iRZL_en, iRAS_en,
19     // Jump Feedback
20     oJ_zero, oJ_nZero, oJ_pos, oJ_neg,
21     // ALU Results
22     oALU_neg, oALU_zero,
23     // Memory Control
24     iRMA_en, iRMD_en,
25     // Multiplexers
26     iMUX_BIS, // ALU B Input/Immediate Select
27     iMUX_RZHS, // ALU Result High Select
28     iMUX_WBM, // Write back in Memory Select
29     iMUX_MAP, // Memory Address out PC Select
30     iMUX_ASS, // ALU Storage Select
31     iMUX_WBP, // Write back Program Counter Select
32     iMUX_WBE, // Write back in External Port Select
33     // Imm32 Output
34     iImm32
35 );
36
37 `include "constants.vh"
38
39 input wire iClk, nRst;
40 // Memory Signals
41 input wire [31:0] iMemData;
42 output wire [31:0] oMemData, oMemAddr;
43 // Port IO
44 input wire [31:0] iPORT;
45 output wire [31:0] oPORT;
```

  

```
46 // Program Counter Control
47 input wire iPC_nRst, iPC_en, iPC_tmPEn, iPC_load, iPC_offset;
48 // Register File Control
49 input wire iRF_Write;
50 input wire [3:0] iRF_AddrA, iRF_AddrB, iRF_AddrC;
51 // Write Back Register Control
52 input wire iRWB_en;
53 // ALU Control
54 input wire [3:0] iALU_Ctrl;
55 input wire iRA_en, iRB_en;
56 input wire iRZH_en, iRZL_en, iRAS_en;
57 output wire oALU_neg, oALU_zero;
58 // Jump Feedback
59 output wire oJ_zero, oJ_nZero, oJ_pos, oJ_neg;
60 // Multiplexers
61 input wire iMUX_BIS, iMUX_RZHS, iMUX_WBM, iMUX_MAP, iMUX_ASS, iMUX_WBP, iMUX_WBE;
62 // Imm32 Output
63 input wire [31:0] iImm32;
64
65 // Internal Clock Signal
66 wire Clk;
67 assign Clk = iClk & nRst;
68
69 // Program Counter Signals
70 wire [31:0] PC_out, PC_tOut;
71
72 // Register File IO
73 wire [31:0] RF_oRegA, RF_oRegB, RF_iRegC;
74 wire [31:0] RWB_in;
75
76 // ALU IO
77 wire [31:0] ALU_iA, ALU_iB, ALU_oC_hi, ALU_oC_lo;
78
79 // ALU Immediate Registers
80 wire [31:0] RA_out, RB_out;
81 wire [31:0] RZH_out, RZL_out, RZ_out;
82 // ALU Storage Registers
83 wire [31:0] RASH_out, RASL_out, RAS_out;
84 // ALU Output
85 wire [31:0] RZX_out;
86
87
88 // Program Counter
89 PC #(StartAddr(`START_PC_ADDRESS)) pc(
90     .iClk(Clk),
```

```

91     .iEn(iPC_en),
92     .iTmPEn(iPC_tmPEn),
93     .nRst(iPC_nRst),
94     .iLoadEn(iPC_load),
95     .iOffsetEn(iPC_offset),
96     .iLoad(RF_oRegA),
97     .iOffset(iImm32),
98     .oPC(PC_out),
99     .oPC_tmP(PC_tmP)
100 );
101
102
103 // Register File
104 RegFile RF(
105     .iClk(Clk),
106     .nRst(nRst),
107     .iWrite(iRF_Write),
108     .iAddrA(iRF_AdrA),
109     .iAddrB(iRF_AdrB),
110     .iAddrC(iRF_AdrC),
111     .oRegA(RF_oRegA),
112     .oRegB(RF_oRegB),
113     .iRegC(RF_iRegC)
114 );
115
116 // Jump Outputs
117 // RB is linked to RA in the ISA
118 assign oJ_zero = (RF_oRegB == 32'd0);
119 assign oJ_nZero = ~RF_oRegB;
120 assign oJ_pos   = ~RF_oRegB[31] && oJ_nZero;
121 assign oJ_neg   = RF_oRegB[31] && oJ_nZero;
122
123 // RF stationary/buffer registers
124 REG32 RA(.iClk(Clk), .nRst(nRst), .iEn(iRA_en), .iD(RF_oRegA), .oQ(RA_out));
125 REG32 RB(.iClk(Clk), .nRst(nRst), .iEn(iRB_en), .iD(RF_oRegB), .oQ(RB_out));
126
127 // ALU Input Multiplexers
128
129 assign ALU_iA = RA_out;
130 assign ALU_iB = iMUX_BIS ? iImm32 : RB_out;
131
132 // ALU
133 ALU alu(
134     .iA(ALU_iA),

```

```

135     .iB(ALU_iB),
136     .iCtrl(iALU_Ctrl),
137     .oC_hi(ALU_oC_hi),
138     .oC_lo(ALU_oC_lo),
139     .oZero(oALU_zero),
140     .oNeg(oALU_neg)
141 );
142
143 // ALU Result Registers
144 REG32 RZH(.iClk(Clk), .nRst(nRst), .iEn(iRZH_en), .iD(ALU_oC_hi), .oQ(RZH_out));
145 REG32 RZL(.iClk(Clk), .nRst(nRst), .iEn(iRZL_en), .iD(ALU_oC_lo), .oQ(RZL_out));
146
147 // ALU Storage Registers - Persist data until reset or next H/L transaction
148 REG32 RASH(.iClk(Clk), .nRst(nRst), .iEn(iRAS_en), .iD(ALU_oC_hi), .oQ(RASH_out));
149 REG32 RASL(.iClk(Clk), .nRst(nRst), .iEn(iRAS_en), .iD(ALU_oC_lo), .oQ(RASL_out));
150
151 // 32 bit ALU result selection
152 assign RAS_out = iMUX_RZHS ? RASH_out : RASL_out;
153 assign RZ_out = iMUX_RZHS ? RZH_out : RZL_out;
154 // Select between storage or current registers
155 assign RZX_out = iMUX_ASS ? RAS_out : RZ_out;
156
157 // Memory
158 assign oMemAddr = iMUX_MAP ? PC_out : RZX_out ;
159 v assign oMemData = RB_out;
160 |   // iRF_AddrA, iRF_AddrB, iRF_AddrC,
161 // assign oPORT = {RF_oRegA[15:0], RF_oRegB[15:0]};
162 assign oPORT = RB_out;
163
164 // Write Back
165 // Select Memory input on WBM, Select PC for JAL, otherwise use ALU result
166 v assign RWB_in = iMUX_WBM ? iMemData :
167 |   |   |   iMUX_WBE ? iPORT    :
168 |   |   |   iMUX_WBP ? PC_tOut : RZX_out;
169
170 // Write back buffer register
171 REG32 RWB(.iClk(iClk), .nRst(nRst), .iEn(iRWB_en), .iD(RWB_in), .oQ(RF_iRegC));
172
173 endmodule

```

## Processor module

```
1  module Processor(
2      iClk, nRst,
3      oMemAddr, oMemData,
4      iMemData, iMemRdy,
5      oMemRead, oMemWrite,
6      iPORT, oPORT
7  );
8
9  `include "constants.vh"
10
11 input wire iClk, nRst, iMemRdy;
12 output wire oMemRead, oMemWrite;
13 input wire [31:0] iMemData;
14 output wire [31:0] oMemData, oMemAddr;
15 input wire [31:0] iPORT;
16 output wire [31:0] oPORT;
17
18 // Program Counter Signals
19 wire PC_nRst, PC_en, PC_tmpEn, PC_load, PC_offset;
20
21 // Register File IO
22 wire RF_iWrite;
23 wire [3:0] RF_iAddrA, RF_iAddrB, RF_iAddrC;
24 wire RWB_en;
25
26 // ALU IO
27 wire [3:0] ALU_iCtrl;
28 wire ALU_oZero, ALU_oNeg;
29
30 // ALU Immediate Registers
31 wire RA_en, RB_en;
32 wire RZH_en, RZL_en;
33 // ALU Storage Registers
34 wire RAS_en;
35
36 // Jump/Branch Signals
37 wire J_zero, J_nZero, J_pos, J_neg;
38
39 // External Port Signals
40 wire REP_en;
41 wire [31:0] REP_in;
42
43 // Multiplexer Signals
44 wire MUX_BIS, MUX_RZHS, MUX_WBM, MUX_MAP, MUX_ASS, MUX_WBP, MUX_WBE;
45
46 // Control Signals
47 wire [31:0] CT_imm32;
48
49 // Control Unit
50 Control Ctrl(
51     // Clock, reset and ready signals
52     // Ready is an active high that allows the next step to continue
53     .iClk(iClk),
54     .nRst(nRst),
55     .iRdy(iMemRdy),
56     // Memory Signals/Control
57     .iMemData(iMemData),
58     .oMemRead(oMemRead),
59     .oMemWrite(oMemWrite),
60     // Pipe Control
61     .oPipe_nRst(pipe_rst),
62     // Program Counter Control
63     .oPC_nRst(PC_nRst),
64     .oPC_en(PC_en),
65     .oPC_tmpEn(PC_tmpEn),
66     .oPC_load(PC_load),
67     .oPC_offset(PC_offset),
68     // Register File Control
69     .oRF_Write(RF_iWrite),
70     .oRF_AddrA(RF_iAddrA),
71     .oRF_AddrB(RF_iAddrB),
72     .oRF_AddrC(RF_iAddrC),
73     .oRWB_en(RWB_en),
74     // ALU Control
75     .oALU_Ctr1(ALU_iCtrl),
76     .oRA_en(RA_en),
77     .oRB_en(RB_en),
78     .oRZH_en(RZH_en),
79     .oRZL_en(RZL_en),
80     .oRAS_en(RAS_en),
81     // Jump Feedback
82     .iJ_zero(J_zero),
83     .iJ_nZero(J_nZero),
84     .iJ_pos(J_pos),
85     .iJ_neg(J_neg),
86     // External Port Register Enable
87     .oREP_en(REP_en),
88     // Multiplexers
89     .oMUX_BIS(MUX_BIS),
90     .oMUX_RZHS(MUX_RZHS),
```

```

91     .oMUX_WBM(MUX_WBM),
92     .oMUX_MAP(MUX_MAP),
93     .oMUX_ASS(MUX_ASS),
94     .oMUX_WBP(MUX_WBP),
95     .oMUX_WBE(MUX_WBE),
96     // Imm32 Output
97     .oImm32(CT_imm32)
98   );
99
100 Datapath pipe(
101   // Clock and reset signals (reset is active low)
102   .iClk(iClk),
103   .nRst(pipe_rst),
104   // Memory Signals
105   .iMemData(iMemData),
106   .oMemAddr(oMemAddr),
107   .oMemData(oMemData),
108   // Port Signals
109   .iPORT(iPORT),
110   .oPORT(REP_in),
111   // Program Counter Control
112   .iPC_nRst(PC_nRst),
113   .iPC_en(PC_en),
114   .iPC_tmpEn(PC_tmpEn),
115   .iPC_load(PC_load),
116   .iPC_offset(PC_offset),
117   // Register File Control
118   .iRF_Write(RF_iWrite),
119   .iRF_AddrA(RF_iAddrA),
120   .iRF_AddrB(RF_iAddrB),
121   .iRF_AddrC(RF_iAddrC),
122   // Write Back Register Control
123   .iRWB_en(RWB_en),
124   // ALU Control
125   .iALU_Ctrl(ALU_iCtrl),
126   .iRA_en(RA_en),
127   .iRB_en(RB_en),
128   .iRZH_en(RZH_en),
129   .iRZL_en(RZL_en),
130   .iRAS_en(RAS_en),
131   // Jump Feedback
132   .oJ_zero(J_zero),
133   .oJ_nZero(J_nZero),
134   .oJ_pos(J_pos),
135   .oJ_neg(J_neg),
136   // ALU Results
137   .oALU_neg(ALU_oNeg),
138   .oALU_zero(ALU_oZero),
139   // Multiplexers
140   .iMUX_BIS(MUX_BIS), // ALU B Input/Immediate Select
141   .iMUX_RZHS(MUX_RZHS), // ALU Result High Select
142   .iMUX_WBM(MUX_WBM), // Write back in Memory Select
143   .iMUX_MAP(MUX_MAP), // Memory Address out PC Select
144   .iMUX_ASS(MUX_ASS), // ALU Storage Select
145   .iMUX_WBP(MUX_WBP),
146   .iMUX_WBE(MUX_WBE),
147   // Imm32 Output
148   .iImm32(CT_imm32)
149 );
150
151 REG32 REP(.iClk(iClk), .nRst(nRst), .iEn(REP_en), .iD(REP_in), .oQ(oPORT));
152 // assign oPORT = REP_in;
153
154 endmodule

```

## Register module

```
1  v module REG32 #(parameter RESET = 32'd0)(
2    iClk,
3    nRst, iEn,
4    iD, oQ
5  );
6
7  input wire iClk, nRst, iEn;
8  input wire [31:0] iD;
9  output reg [31:0] oQ;
10
11  always@(posedge iClk or negedge nRst)
12  v begin
13  v   if(!nRst)
14  |     oQ <= RESET;
15  v   else begin
16  |     if(iEn) oQ <= iD;
17  |   end
18  end
19
20 endmodule
```

## Register file module

```

1  v module RegFile(
2    iClk, nRst, iWrite,
3    iAddrA, iAddrB, iAddrC,
4    oRegA, oRegB, iRegC
5  );
6
7  // Module IO
8  input wire iClk, nRst, iWrite;
9  input wire [3:0] iAddrA, iAddrB, iAddrC;
10 output wire [31:0] oRegA, oRegB;
11 input wire [31:0] iRegC;
12
13
14 // Register IO
15 v wire r1_write, r2_write, r3_write, r5_write, r6_write,
16   | r7_write, r8_write, r9_write, r10_write, r11_write,
17   | r12_write, r13_write, r14_write, r15_write;
18
19 v wire [31:0] r1_out, r2_out, r3_out, r4_out, r5_out,
20   | r6_out, r7_out, r8_out, r9_out, r10_out,
21   | r11_out, r12_out, r13_out, r14_out, r15_out;
22
23 // Write Signal Assert
24 assign r1_write = (iAddrC == 4'b0001) && iWrite;
25 assign r2_write = (iAddrC == 4'b0010) && iWrite;
26 assign r3_write = (iAddrC == 4'b0011) && iWrite;
27 assign r4_write = (iAddrC == 4'b0100) && iWrite;
28 assign r5_write = (iAddrC == 4'b0101) && iWrite;
29 assign r6_write = (iAddrC == 4'b0110) && iWrite;
30 assign r7_write = (iAddrC == 4'b0111) && iWrite;
31 assign r8_write = (iAddrC == 4'b1000) && iWrite;
32 assign r9_write = (iAddrC == 4'b1001) && iWrite;
33 assign r10_write = (iAddrC == 4'b1010) && iWrite;
34 assign r11_write = (iAddrC == 4'b1011) && iWrite;
35 assign r12_write = (iAddrC == 4'b1100) && iWrite;
36 assign r13_write = (iAddrC == 4'b1101) && iWrite;
37 assign r14_write = (iAddrC == 4'b1110) && iWrite;
38 assign r15_write = (iAddrC == 4'b1111) && iWrite;
39
40
41 // Output Register A assignment
42 v assign oRegA = (iAddrA == 4'b0001) ? r1_out :
43   | (iAddrA == 4'b0010) ? r2_out :
44   | (iAddrA == 4'b0011) ? r3_out :
45   | (iAddrA == 4'b0100) ? r4_out :
46   | (iAddrA == 4'b0101) ? r5_out :
47   | (iAddrA == 4'b0110) ? r6_out :
48   | (iAddrA == 4'b0111) ? r7_out :
49   | (iAddrA == 4'b1000) ? r8_out :
50   | (iAddrA == 4'b1001) ? r9_out :
51   | (iAddrA == 4'b1010) ? r10_out :
52   | (iAddrA == 4'b1011) ? r11_out :
53   | (iAddrA == 4'b1100) ? r12_out :
54   | (iAddrA == 4'b1101) ? r13_out :
55   | (iAddrA == 4'b1110) ? r14_out :
56   | (iAddrA == 4'b1111) ? r15_out :
57   | 32'd0;
58
59 // Output Register A assignment
60 v assign oRegB = (iAddrB == 4'b0001) ? r1_out :
61   | (iAddrB == 4'b0010) ? r2_out :
62   | (iAddrB == 4'b0011) ? r3_out :
63   | (iAddrB == 4'b0100) ? r4_out :
64   | (iAddrB == 4'b0101) ? r5_out :
65   | (iAddrB == 4'b0110) ? r6_out :
66   | (iAddrB == 4'b0111) ? r7_out :
67   | (iAddrB == 4'b1000) ? r8_out :
68   | (iAddrB == 4'b1001) ? r9_out :
69   | (iAddrB == 4'b1010) ? r10_out :
70   | (iAddrB == 4'b1011) ? r11_out :
71   | (iAddrB == 4'b1100) ? r12_out :
72   | (iAddrB == 4'b1101) ? r13_out :
73   | (iAddrB == 4'b1110) ? r14_out :
74   | (iAddrB == 4'b1111) ? r15_out :
75   | 32'd0;
76
77 // Registers
78 REG32 r1(.iClk(iClk), .nRst(nRst), .iEn(r1_write), .iD(iRegC), .oQ(r1_out));
79 REG32 r2(.iClk(iClk), .nRst(nRst), .iEn(r2_write), .iD(iRegC), .oQ(r2_out));
80 REG32 r3(.iClk(iClk), .nRst(nRst), .iEn(r3_write), .iD(iRegC), .oQ(r3_out));
81 REG32 r4(.iClk(iClk), .nRst(nRst), .iEn(r4_write), .iD(iRegC), .oQ(r4_out));
82 REG32 r5(.iClk(iClk), .nRst(nRst), .iEn(r5_write), .iD(iRegC), .oQ(r5_out));
83 REG32 r6(.iClk(iClk), .nRst(nRst), .iEn(r6_write), .iD(iRegC), .oQ(r6_out));
84 REG32 r7(.iClk(iClk), .nRst(nRst), .iEn(r7_write), .iD(iRegC), .oQ(r7_out));
85 REG32 r8(.iClk(iClk), .nRst(nRst), .iEn(r8_write), .iD(iRegC), .oQ(r8_out));
86 REG32 r9(.iClk(iClk), .nRst(nRst), .iEn(r9_write), .iD(iRegC), .oQ(r9_out));
87 REG32 r10(.iClk(iClk), .nRst(nRst), .iEn(r10_write), .iD(iRegC), .oQ(r10_out));
88 REG32 r11(.iClk(iClk), .nRst(nRst), .iEn(r11_write), .iD(iRegC), .oQ(r11_out));
89 REG32 r12(.iClk(iClk), .nRst(nRst), .iEn(r12_write), .iD(iRegC), .oQ(r12_out));
90 REG32 r13(.iClk(iClk), .nRst(nRst), .iEn(r13_write), .iD(iRegC), .oQ(r13_out));
91 REG32 r14(.iClk(iClk), .nRst(nRst), .iEn(r14_write), .iD(iRegC), .oQ(r14_out));
92 REG32 r15(.iClk(iClk), .nRst(nRst), .iEn(r15_write), .iD(iRegC), .oQ(r15_out));
93
94
95 endmodule

```

## Seven Segment Code

Sev\_segs module

```
1  module SevSegs(
2      iNum,
3      oSeg1,
4      oSeg2,
5      oSeg3,
6      oSeg4,
7      oSeg5,
8      oSeg6,
9      oSeg7,
10     oSeg8
11 );
12   input wire [31:0] iNum;
13   output wire [6:0] oSeg1, oSeg2, oSeg3, oSeg4, oSeg5, oSeg6, oSeg7, oSeg8;
14   SevenSeg s1(
15       .iNum(iNum[3:0]),
16       .oSeg(oSeg1)
17   );
18   SevenSeg s2(
19       .iNum(iNum[7:4]),
20       .oSeg(oSeg2)
21   );
22   SevenSeg s3(
23       .iNum(iNum[11:8]),
24       .oSeg(oSeg3)
25   );
26   SevenSeg s4(
27       .iNum(iNum[15:12]),
28       .oSeg(oSeg4)
29   );
30   SevenSeg s5(
31       .iNum(iNum[19:16]),
32       .oSeg(oSeg5)
33   );
34   SevenSeg s6(
35       .iNum(iNum[23:20]),
36       .oSeg(oSeg6)
37   );
38   SevenSeg s7(
39       .iNum(iNum[27:24]),
40       .oSeg(oSeg7)
41   );
42   SevenSeg s8(
43       .iNum(iNum[31:28]),
44       .oSeg(oSeg8)
45   );
```

## Seven\_seg module

```
1  module SevenSeg(
2      iNum,
3      oSeg
4  );
5      input wire [3:0] iNum;
6      output reg [6:0] oSeg;
7
8  always @(iNum) begin
9      case(iNum)
10         4'h0: oSeg = 7'b1000000;
11         4'h1: oSeg = 7'b1111001;
12         4'h2: oSeg = 7'b0100100;
13         4'h3: oSeg = 7'b0110000;
14         4'h4: oSeg = 7'b0011001;
15         4'h5: oSeg = 7'b0010010;
16         4'h6: oSeg = 7'b0000010;
17         4'h7: oSeg = 7'b1111000;
18         4'h8: oSeg = 7'b0000000;
19         4'h9: oSeg = 7'b0010000;
20         4'hA: oSeg = 7'b0001000;
21         4'hB: oSeg = 7'b0000011;
22         4'hC: oSeg = 7'b1000110;
23         4'hD: oSeg = 7'b0100001;
24         4'hE: oSeg = 7'b0000110;
25         4'hF: oSeg = 7'b0001110;
26     endcase
27 end
28
29 endmodule
```

# VGA Code

## VGA controller code

```
1 // Simple VGA Controller
2 // 640x480 Resolution
3
4 module vga_controller(
5     iClk_50,
6     nRst,
7     iVGA_colorData,
8     oVGA_colorAddress,
9     oVGA_R,
10    oVGA_G,
11    oVGA_B,
12    oVGA_Clk,
13    oVGA_Blink,
14    oVGA_HSync,
15    oVGA_VSync,
16    oVGA_Sync,
17 );
18
19 input wire iClk_50, nRst;
20 input wire [29:0] iVGA_colorData;
21 output wire [31:0] oVGA_colorAddress;
22 output wire [9:0] oVGA_R, oVGA_G, oVGA_B;
23 output wire oVGA_Clk, oVGA_Blink, oVGA_HSync, oVGA_VSync, oVGA_Sync;
24
25 parameter WIDTH = 16'd639;
26 parameter HEIGHT = 16'd479;
27
28 // Signal States
29 // Active Video
30 parameter SS_AV = 3'd0;
31 // Front Porch
32 parameter SS_FP = 3'd1;
33 // Sync Pulse
34 parameter SS_SP = 3'd2;
35 // Back Porch
36 parameter SS_BP = 3'd3;
37 // Blank
38 parameter SS_BL = 3'd4;
39
40 // Video States
41 // Active Video
42 parameter VS_AV = 2'd0;
43 // Horizontal Sync
44 parameter VS_HS = 2'd1;
45 // Vertical Sync
46 C:\Users\lukeal\Documents\github\MiniSRC\VGA
47
48 // Video State
49 reg [1:0] videoState, videoState_next;
50 // Horizontal and Vertical Signal States
51 reg [2:0] hState, vState;
52 reg [2:0] hState_next, vState_next;
53
54 // Signal Clock Counters
55 reg [15:0] s_width, s_height;
56
57 // In/Out Color Data
58 assign oVGA_colorAddress = {s_height, s_width};
59 wire [9:0] cR, cG, cB;
60 assign cR = iVGA_colorData[9:0];
61 assign cG = iVGA_colorData[19:10];
62 assign cB = iVGA_colorData[29:20];
63
64 // Clock Divider
65 reg Clk_25;
66 always @(posedge iClk_50, negedge nRst) begin
67     if(~nRst) Clk_25 = 1'b0;
68     else Clk_25 = ~Clk_25;
69 end
70
71 always @(posedge Clk_25, negedge nRst) begin
72     if(~nRst) begin
73         videoState = VS_AV;
74         hState = SS_AV;
75         vState = SS_AV;
76     end
77     else begin
78         videoState = videoState_next;
79         vState = vState_next;
80         hState = hState_next;
81         case(videoState)
82             // Active Video
83             VS_AV: begin
84                 oVGA_R = cR;
85                 oVGA_G = cG;
86                 oVGA_B = cB;
87                 if (s_width == WIDTH) begin
88                     s_width = 16'd0;
89                     if(s_height == HEIGHT) begin
90                         s_height = 16'd0;
```

```

91          videoState_next = VS_VS;
92          vState_next = SS_FP;
93          hState_next = SS_BL;
94      end else begin
95          s_height = s_height + 16'd1;
96          videoState_next = VS_HS;
97          vState_next = SS_AV;
98          hState_next = SS_FP;
99      end
100 end else begin
101     s_width = s_width + 16'd1;
102     videoState_next = VS_AV;
103     vState_next = SS_AV;
104     hState_next = SS_AV;
105 end
106 end
107 // Vertical Sync
108 VS_VS: begin
109     oVGA_R = 10'd0;
110     oVGA_G = 10'd0;
111     oVGA_B = 10'd0;
112     case(vState)
113         SS_FP:;
114         SS_SP:;
115         SS_BP:;
116         default: begin
117             videoState_next = VS_AV;
118             vState_next = SS_AV;
119             hState_next = SS_AV;
120         end
121     endcase
122 end
123 // Horizontal Sync
124 VS_HS: begin
125 end
126 default: begin
127     videoState_next = VS_AV;
128     vState_next = SS_AV;
129     hState_next = SS_AV;
130 end
131 endcase
132 end
133 end
134 endmodule
135
136

```

## Memory code

### Memory module

```
1  `include "../Processor/Control/ISA.vh"
2  `include "../Processor/tb/sim_ISA.vh"
3  module memory(
4      iClk, iRead, iWrite,
5      iData, iAddr,
6      oData
7  );
8
9  input wire iClk, iRead, iWrite;
10 input wire [31:0] iData, iAddr;
11 output reg [31:0] oData;
12
13 reg [31:0] mem[0:1023];
14
15 wire [31:2] Addri;
16 assign Addri = iAddr[31:2];
17
18 // assign oData = iRead ? mem[Addri] : 32'd0;
19
20 always @(posedge iClk) begin
21     if(iWrite)
22         mem[Addri] <= iData;
23     oData <= mem[Addri];
24 end
25
26 initial $readmemh("program.hex", mem);
27
28 // initial begin
29 //     // Set the output port to the value in R1
30 //     mem[0] = `INS_J(`ISA_OUT, 4'h1);
31 //     // Increase R1 by 1
32 //     mem[1] = `INS_I(`ISA_ADDI, 4'h1, 4'h1, 19'd1);
33 //     // Load the counter value from the in port into R2
34 //     mem[2] = `INS_J(`ISA_IN, 4'h2);
35 //     // Loop for the duration of R2
36 //     mem[3] = `INS_I(`ISA_ADDI, 4'h3, 4'h3, 19'd1);
37 //     mem[4] = `INS_R(`ISA_SUB, 4'h4, 4'h2, 4'h3);
38 //     mem[5] = `INS_B(`ISA_BRx, 4'h4, `ISA_BR_POSI, -19'd3);
39 //     mem[6] = `INS_I(`ISA_ADDI, 4'h3, 4'h0, 19'd0);
40 //     // Go back to the beginning of the code
41 //     mem[7] = `INS_J(`ISA_JFR, 4'h0);
42 // end
43
44 endmodule
```

## MMU module

```
1  module MMU(
2      iClk, iRead, iWrite,
3      iData, iAddr,
4      oData
5  );
6
7  input wire iClk, iRead, iWrite;
8  input wire [31:0] iData, iAddr;
9  output wire [31:0] oData;
10
11
12 memory mem(
13     .iClk(iClk),
14     .iRead(iRead),
15     .iWrite(iWrite),
16     .iData(iData),
17     .iAddr(iAddr),
18     .oData(oData)
19 );
20
21 endmodule
```

## Appendix 2: Processor Schematic



## Appendix 3: Functional Simulation results for Phase 4

## Simulation test bench code:

```
1 `timescale 1ns/1ps
2 `include "../Control/ISA.vh"
3 `include "../constants.vh"
4 `include "../sim_ISA.vh"
5
6 // Complete simulation in 153100ns with iPort = 0x00000000.
7
8 module sim_LAB4_prog();
9
10 parameter SA = `START_PC_ADDRESS;
11
12 wire Clk;
13 reg nRst = 1'b0;
14
15 ClockGenerator cg(
16     .nRst(nRst),
17     .oClk(Clk)
18 );
19
20 wire mem_read, mem_write;
21 wire [31:0] proc_mem_out, proc_byte_mem_addr;
22 reg [31:0] proc_mem_in;
23 wire [31:0] oPort;
24 wire [31:0] iPort = 32'h00000000;
25
26 wire [31:2] word_addr;
27 assign word_addr = proc_byte_mem_addr[31:2];
28
29 `define MEM_MAX (511)
30 reg [31:0] word_mem[0:`MEM_MAX];
```

```

32  integer i;
33  initial begin
34      for (i = 0; i <= `MEM_MAX; i = i + 1) begin
35          word_mem[i] = 32'h00000000; // Initialize memory to zero
36      end
37      // initialize memory with program
38      $readmemh("Processor/tb/lab4demo/program.hex", word_mem);
39      #1;
40      nRst = 1'b1; // Release reset after 1ns
41  end
42
43  always @(*(proc_mem_out, proc_mem_in, word_addr)) begin
44      if(mem_write)
45          word_mem[word_addr] <= proc_mem_out;
46      proc_mem_in <= word_mem[word_addr];
47  end
48
49  Processor proc(
50      .iClk(Clk),
51      .nRst(nRst),
52      .oMemAddr(proc_byte_mem_addr),
53      .oMemData(proc_mem_out),
54      .iMemData(proc_mem_in),
55      .iMemRdy(1'b1),
56      .oMemRead(mem_read),
57      .oMemWrite(mem_write)
58      ,.iPORT(iPort)
59      ,.oPORT(oPort)
60  );
61
62  endmodule

```

Processor State before execution:





For memory state before and after execution, see the following Appendix 4.

## Appendix 4: Printout of the contents of memory for Phase 4

## Before Execution:



@bc 00100111111111100000000000000000  
@bd 10101100000000000000000000000000  
@be 00000000000000000000000000000000  
@bf 00000000000000000000000000000000  
@c0 00000000000000000000000000000000  
@c1 00000000000000000000000000000000  
@c2 00000000000000000000000000000000  
@c3 00000000000000000000000000000000  
@c4 00000000000000000000000000000000  
@c5 00000000000000000000000000000000  
@c6 00000000000000000000000000000000  
@c7 00000000000000000000000000000000  
@c8 00000000000000000000000000000000  
@c9 00000000000000000000000000000000  
@ca 00000000000000000000000000000000  
@cb 00000000000000000000000000000000  
@cc 00000000000000000000000000000000  
@cd 00000000000000000000000000000000  
@ce 00000000000000000000000000000000  
@cf 00000000000000000000000000000000  
@d0 00000000000000000000000000000000  
@d1 00000000000000000000000000000000  
@d2 00000000000000000000000000000000  
@d3 00000000000000000000000000000000  
@d4 00000000000000000000000000000000  
@d5 00000000000000000000000000000000  
@d6 00000000000000000000000000000000  
@d7 00000000000000000000000000000000  
@d8 00000000000000000000000000000000  
@d9 00000000000000000000000000000000  
@da 00000000000000000000000000000000  
@db 00000000000000000000000000000000  
@dc 00000000000000000000000000000000  
@dd 00000000000000000000000000000000  
@de 00000000000000000000000000000000  
@df 00000000000000000000000000000000  
@e0 00000000000000000000000000000000  
@e1 00000000000000000000000000000000





@17a 00000000000000000000000000000000  
@17b 00000000000000000000000000000000  
@17c 00000000000000000000000000000000  
@17d 00000000000000000000000000000000  
@17e 00000000000000000000000000000000  
@17f 00000000000000000000000000000000  
@180 00000000000000000000000000000000  
@181 00000000000000000000000000000000  
@182 00000000000000000000000000000000  
@183 00000000000000000000000000000000  
@184 00000000000000000000000000000000  
@185 00000000000000000000000000000000  
@186 00000000000000000000000000000000  
@187 00000000000000000000000000000000  
@188 00000000000000000000000000000000  
@189 00000000000000000000000000000000  
@18a 00000000000000000000000000000000  
@18b 00000000000000000000000000000000  
@18c 00000000000000000000000000000000  
@18d 00000000000000000000000000000000  
@18e 00000000000000000000000000000000  
@18f 00000000000000000000000000000000  
@190 00000000000000000000000000000000  
@191 00000000000000000000000000000000  
@192 00000000000000000000000000000000  
@193 00000000000000000000000000000000  
@194 00000000000000000000000000000000  
@195 00000000000000000000000000000000  
@196 00000000000000000000000000000000  
@197 00000000000000000000000000000000  
@198 00000000000000000000000000000000  
@199 00000000000000000000000000000000  
@19a 00000000000000000000000000000000  
@19b 00000000000000000000000000000000  
@19c 00000000000000000000000000000000  
@19d 00000000000000000000000000000000  
@19e 00000000000000000000000000000000  
@19f 00000000000000000000000000000000

After Execution:



@4c 00000000000000000000000000000000  
@4d 00000000000000000000000000000000  
@4e 00000000000000000000000000000000  
@4f 00000000000000000000000000000000  
@50 00000000000000000000000000000000  
@51 00000000000000000000000000000000  
@52 00000000000000000000000000000000  
@53 00000000000000000000000000000000  
@54 0000000000000000000000000000000010010111  
@55 00000000000000000000000000000000  
@56 00000000000000000000000000000000  
@57 00000000000000000000000000000000  
@58 00000000000000000000000000000000  
@59 00000000000000000000000000000000  
@5a 00000000000000000000000000000000  
@5b 00000000000000000000000000000000  
@5c 00000000000000000000000000000000  
@5d 00000000000000000000000000000000  
@5e 00000000000000000000000000000000  
@5f 00000000000000000000000000000000  
@60 00000000000000000000000000000000  
@61 00000000000000000000000000000000  
@62 00000000000000000000000000000000  
@63 00000000000000000000000000000000  
@64 00000000000000000000000000000000  
@65 00000000000000000000000000000000  
@66 00000000000000000000000000000000  
@67 00000000000000000000000000000000  
@68 00000000000000000000000000000000  
@69 00000000000000000000000000000000  
@6a 00000000000000000000000000000000  
@6b 00000000000000000000000000000000  
@6c 00000000000000000000000000000000  
@6d 00000000000000000000000000000000  
@6e 00000000000000000000000000000000  
@6f 00000000000000000000000000000000  
@70 00000000000000000000000000000000  
@71 00000000000000000000000000000000

@be 00000000000000000000000000000000  
@bf 00000000000000000000000000000000  
@c0 00000000000000000000000000000000  
@c1 00000000000000000000000000000000  
@c2 00000000000000000000000000000000  
@c3 00000000000000000000000000000000  
@c4 00000000000000000000000000000000  
@c5 00000000000000000000000000000000  
@c6 00000000000000000000000000000000  
@c7 00000000000000000000000000000000  
@c8 00000000000000000000000000000000  
@c9 00000000000000000000000000000000  
@ca 00000000000000000000000000000000  
@cb 00000000000000000000000000000000  
@cc 00000000000000000000000000000000  
@cd 00000000000000000000000000000000  
@ce 00000000000000000000000000000000  
@cf 00000000000000000000000000000000  
@d0 00000000000000000000000000000000  
@d1 00000000000000000000000000000000  
@d2 00000000000000000000000000000000  
@d3 00000000000000000000000000000000  
@d4 00000000000000000000000000000000  
@d5 00000000000000000000000000000000  
@d6 00000000000000000000000000000000  
@d7 00000000000000000000000000000000  
@d8 00000000000000000000000000000000  
@d9 00000000000000000000000000000000  
@da 00000000000000000000000000000000  
@db 00000000000000000000000000000000  
@dc 00000000000000000000000000000000  
@dd 00000000000000000000000000000000  
@de 00000000000000000000000000000000  
@df 00000000000000000000000000000000  
@e0 00000000000000000000000000000000  
@e1 00000000000000000000000000000000  
@e2 00000000000000000000000000000000  
@e3 00000000000000000000000000000000

@10a 00000000000000000000000000000000  
@10b 00000000000000000000000000000000  
@10c 00000000000000000000000000000000  
@10d 00000000000000000000000000000000  
@10e 00000000000000000000000000000000  
@10f 00000000000000000000000000000000  
@110 00000000000000000000000000000000  
@111 00000000000000000000000000000000  
@112 00000000000000000000000000000000  
@113 00000000000000000000000000000000  
@114 00000000000000000000000000000000  
@115 00000000000000000000000000000000  
@116 00000000000000000000000000000000  
@117 00000000000000000000000000000000  
@118 00000000000000000000000000000000  
@119 00000000000000000000000000000000  
@11a 00000000000000000000000000000000  
@11b 00000000000000000000000000000000  
@11c 00000000000000000000000000000000  
@11d 00000000000000000000000000000000  
@11e 00000000000000000000000000000000  
@11f 00000000000000000000000000000000  
@120 00000000000000000000000000000000  
@121 00000000000000000000000000000000  
@122 00000000000000000000000000000000  
@123 00000000000000000000000000000000  
@124 00000000000000000000000000000000  
@125 00000000000000000000000000000000  
@126 00000000000000000000000000000000  
@127 00000000000000000000000000000000  
@128 00000000000000000000000000000000  
@129 00000000000000000000000000000000  
@12a 00000000000000000000000000000000  
@12b 00000000000000000000000000000000  
@12c 00000000000000000000000000000000  
@12d 00000000000000000000000000000000  
@12e 00000000000000000000000000000000  
@12f 00000000000000000000000000000000  
@130 00000000000000000000000000000000





```
@1f3 00000000000000000000000000000000  
@1f4 00000000000000000000000000000000  
@1f5 00000000000000000000000000000000  
@1f6 00000000000000000000000000000000  
@1f7 00000000000000000000000000000000  
@1f8 00000000000000000000000000000000  
@1f9 00000000000000000000000000000000  
@1fa 00000000000000000000000000000000  
@1fb 00000000000000000000000000000000  
@1fc 00000000000000000000000000000000  
@1fd 00000000000000000000000000000000  
@1fe 00000000000000000000000000000000  
@1ff 00000000000000000000000000000000
```

## Appendix 5: Image of the FPGA board during operation



Image of the Cyclone II DE2 FPGA Board upon completion of the phase 4 program.



Image of the Cyclone II DE2 FPGA Board mid execution of the phase 4 program.