

# CprE 381: Computer Organization and Assembly-Level Programming

## Project Part 1 Report

Team Members: \_\_\_\_\_ John Brose \_\_\_\_\_

\_\_\_\_\_ Andrew Deick \_\_\_\_\_

\_\_\_\_\_ Rolf Anderson \_\_\_\_\_

Project Teams Group #: \_\_\_\_\_ 2-4 \_\_\_\_\_

[Part 1 (d)] Include your final MIPS processor schematic in your lab report.



**[Part 2 (a.i)]** Create a spreadsheet detailing the list of  $M$  instructions to be supported in your project alongside their binary opcodes and funct fields, if applicable. Create a separate column for each binary bit. Inside this spreadsheet, create a new column for the  $N$  control signals needed by your datapath implementation. The end result should be an  $N \times M$  table where each row corresponds to the output of the control logic module for a given instruction.

| Instruction | Opcode (Binary) | Funct (Binary) | jr [jr mux] | jal [jal mux] | MemtoReg | MemWrite [we mem]                  | RegWrite [we reg]                 | Control Signals                    | Branch (PCSrc)                                          | SignExt [determines sign extension on 16 to 32 bit extender] | j [jump]                                                        |             |
|-------------|-----------------|----------------|-------------|---------------|----------|------------------------------------|-----------------------------------|------------------------------------|---------------------------------------------------------|--------------------------------------------------------------|-----------------------------------------------------------------|-------------|
| 3           | <b>R-TYPE</b>   |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 4           | add             | "000100"       | ..."        | 0             | 0        | 0 [add does NOT read from memory]  | 0 [add does NOT write to memory]  | 1 [add writes back to a register]  | 0 [add uses rt as destination register rather than rd]  | 0 [use PC+4 no branch]                                       | 1 [add sign extended]                                           | 0 [no jump] |
| 5           | addu            | "000101"       | ..."        | 0             | 0        | 0 [addu does NOT read from memory] | 0 [addu does NOT write to memory] | 0 [addu writes back to a register] | 0 [addu uses rt as destination register rather than rd] | 0 [use PC+4 no branch]                                       | 1 [add sign extended]                                           | 0 [no jump] |
| 6           | andi            | "000110"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 0 [uses rt as destination register rather than rd]      | 0 [use PC+4 no branch]                                       | 0 [zero extended]                                               | 0 [no jump] |
| 7           | or              | "000111"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 0 [writes back to a register]      | 0 [uses rt as destination register rather than rd]      | 0 [use PC+4 no branch]                                       | 0 [zero extended]                                               | 0 [no jump] |
| 8           | ori             | "001101"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 0 [uses rt as destination register rather than rd]      | 0 [use PC+4 no branch]                                       | 0 [zero extended]                                               | 0 [no jump] |
| 9           | slti            | "001010"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 0 [uses rt as destination register rather than rd]      | 0 [use PC+4 no branch]                                       | 1 [sign extended]                                               | 0 [no jump] |
| 10          | lui             | "001111"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 0 [writes back to a register]      | 0 [uses rt as destination register rather than rd]      | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 11          |                 |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 12          | beq             | "000100"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 0 [does NOT write to a register]   | 0 [does not matter]                                     | 1 [use branch if needed]                                     | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 13          | bne             | "000101"       | ..."        | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 0 [does not matter]                | 0 [does not matter]                                     | 1 [use branch if needed]                                     | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 14          |                 |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 15          | lw              | "100011"       | ..."        | 0             | 0        | 1 [reads from memory]              | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 0 [uses rt as destination register rather than rd]      | 0 [use PC+4 no branch]                                       | 1 [sign extended]                                               | 0 [no jump] |
| 16          | sw              | "100101"       | ..."        | 0             | 0        | 0 [does not matter]                | 1 [writes to memory]              | 0 [does NOT write to a register]   | 0 [does not matter]                                     | 0 [use PC+4 no branch]                                       | 1 [sign extended]                                               | 0 [no jump] |
| 17          |                 |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 18          | <b>R-TYPE</b>   |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 19          | add             | "000000"       | "100000"    | 0             | 0        | 0 [add does NOT read from memory]  | 0 [add does NOT write to memory]  | 1 [add writes back to a register]  | 1 [add uses rd as destination]                          | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 20          | addu            | "000000"       | "100001"    | 0             | 0        | 0 [addu does NOT read from memory] | 0 [addu does NOT write to memory] | 1 [addu writes back to a register] | 1 [addu uses rd as destination]                         | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 21          | andi            | "000000"       | "100100"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 22          | nor             | "000000"       | "100111"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 23          | xor             | "000000"       | "100110"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 24          | or              | "000000"       | "100101"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 25          | sh              | "000000"       | "101010"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 26          | sll             | "000000"       | "000000"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 27          | srl             | "000000"       | "000010"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 0 [jumps? zero extended]                                        | 0 [no jump] |
| 28          | sub             | "000000"       | "100000"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 29          | subu            | "000000"       | "100010"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 30          | sub             | "000000"       | "100011"    | 0             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1 [writes back to a register]      | 1 [uses rd as destination]                              | 0 [use PC+4 no branch]                                       | 1 [does not matter but choose one because more sign extensions] | 0 [no jump] |
| 31          | jr              | "000000"       | "001000"    | 1             | 0        | 0 [does not matter]                | 0 [does NOT write to memory]      | 0 [does not matter]                | 0 [does not matter]                                     | 0 [does not matter]                                          | 1 [does not matter but choose one because more sign extensions] | 1 [jump]    |
| 32          |                 |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 33          | <b>J-TYPE</b>   |                |             |               |          |                                    |                                   |                                    |                                                         |                                                              |                                                                 |             |
| 34          | j               | "000010"       | ..."        | 0             | 0        | 0 [does not matter]                | 0 [does not matter]               | 0 [does not matter]                | 0 [does not matter]                                     | 0 [does not matter]                                          | 1 [does not matter but choose one because more sign extensions] | 1 [jump]    |
| 35          | jal             | "000011"       | ..."        | 1             | 0        | 0 [does NOT read from memory]      | 0 [does NOT write to memory]      | 1                                  | 0 [does not matter]                                     | 0 [does not matter]                                          | 1 [does not matter but choose one because more sign extensions] | 1 [jump]    |

**[Part 2 (a.ii)]** Implement the control logic module using whatever method and coding style you prefer. Create a testbench to test this module individually, and show that your output matches the expected control signals from problem 1(a).

Below is the format for each instruction of the control unit, the coding mainly consisted of with select statements (shown below). The tb output waveform compares the control\_unit.vhd outputs with the expected outputs derived from our spreadsheet from part 2a.

```

30      -- 14      13      12      11-8      7      6      5      4      3      2      1      0
31      -- "0      0      0      0000      0      0      0      1      1      0      1      0"      0
32      -- jr      jal      ALUSrc      ALUControl      MemtoReg      we_mem      we_reg      RegDst      PCSrc      SignExt      j"      Halt
33      --
34
35
36      with i_funct select s_RTYPE <=
37          "0001110001110100" when "100000", -- add
38          "0000000001110100" when "100001", -- addu
39          "00000010001110100" when "100100", -- and
40          "0000101001110100" when "100111", -- nor
41          "0000100001110100" when "100110", -- xor
42          "00000110001110100" when "100101", -- or
43          "000001110001110100" when "101010", -- slt
44          "000001110001110100" when "101011", -- sltu
45          "0000010001001110100" when "000000", -- sll
46          "0000100000110000" when "000010", -- srl
47          "0000101000110100" when "000011", -- sra
48          "000011110001110100" when "100010", -- sub
49          "0000000001001110100" when "100011", -- subu
50          "1000000000000000" when "001000", -- jr
51          "0000000000000000" when others;
52
53
54      with i_opcode select o_Ctrl_Unt <=
55          s_RTYPE      when "000000", -- RTYPE
56          "001111000100100" when "001000", -- addi
57
58          "0000000000000001" when "010100", -- halt
59
60          "001000000100100" when "001001", -- addiu

```

|              |                                                                                           |
|--------------|-------------------------------------------------------------------------------------------|
| control_unit | _____                                                                                     |
| input        | _____                                                                                     |
| opcode       | 000000                                                                                    |
| func         | 100000 100001 100100 100111 100110 100101 101010 000000 000010 000011                     |
| output       | _____                                                                                     |
| o_Ctrl_Unt   | 15'h0E34 15'h0034 15'h0234 15'h0534 15'h0434 15'h0334 15'h0734 15'h0934 15'h0830 15'h0A34 |
| expected_out | 15'h0E34 15'h0034 15'h0234 15'h0534 15'h0434 15'h0334 15'h0734 15'h0934 15'h0830 15'h0A34 |

|              |                                                                                           |
|--------------|-------------------------------------------------------------------------------------------|
| control_unit | _____                                                                                     |
| input        | _____                                                                                     |
| opcode       | 000000 001000 001001 001100 001110 001101 001010 001111 000100                            |
| func         | 100010 100011 000000                                                                      |
| output       | _____                                                                                     |
| o_Ctrl_Unt   | 15'h0F34 15'h0134 15'h1E24 15'h1024 15'h1220 15'h1420 15'h1320 15'h1724 15'h1624 15'h0B0C |
| expected_out | 15'h0F34 15'h0134 15'h1E24 15'h1024 15'h1220 15'h1420 15'h1320 15'h1724 15'h1624 15'h0B0C |

|              |                                                                |
|--------------|----------------------------------------------------------------|
| control_unit | _____                                                          |
| input        | _____                                                          |
| opcode       | 000101 100011 101011 000010 000011 000000 010100               |
| func         | 000000 001000 000000                                           |
| output       | _____                                                          |
| o_Ctrl_Unt   | 15'h0C0C 15'h10A4 15'h1044 15'h0006 15'h2026 15'h4006 15'h0001 |
| expected_out | 15'h0C0C 15'h10A4 15'h1044 15'h0006 15'h2026 15'h4006 15'h0001 |

**[Part 2 (b.i)]** What are the control flow possibilities that your instruction fetch logic must support? Describe these possibilities as a function of the different control flow-related instructions you are required to implement.

There are 27 unique instructions: addi, addiu, andi, xori, ori, slti, lui, beq, bne, lw, sw, add, addu, and, nor xor, or, slt, sll, srl, sra, sub, subu, jr, j, jal & halt.

All of these instructions need to have separate control states. Many of the I-type instructions are exactly the same as J-type counterparts except that ALUSrc is set to receive an immediate value instead of one from the register.

There are two control signals that control the addressing: j and Branch. Branch requires a specific ALU output, while j (short for jump) is unconditional. The jump register can receive an address from a register or from a 26 bit immediate. (This mux is controlled by the jr signal). Perhaps misleadingly, the jal control signal writes the current address (+4) to register 31, and the signal itself is not at all related to addressing. (The jal instruction enables the j signal).

The remaining signals include two Write Enables (MemWrite and RegWrite) and two pathing signals (RegDst and MemtoReg), which, respectively, determine if rs or rd should decide the final register write destination and whether the memory or the ALU should write to the register.

Last but not least, we have SignExtend that is the only bit control signal that does not correspond with a Mux. It is fed into the Extended component to determine if the immediate should be sign-extended or zero-extended.

**[Part 2 (b.ii)]** Draw a schematic for the instruction fetch logic and any other datapath modifications needed for control flow instructions. What additional control signals are needed?



Instruction Fetch (left) requires a 32 bit signal from the PC. As seen in the picture, the processor splits the instruction into several more manageable pieces, which it then handles individually. The instructions are always connected to their respective components, and the output of the component is ignored when necessary.

For example, the jump address is calculated every instruction and then ignored via control signals.

Speaking of Control Signals, we have 15 bits worth. Four of the bits correspond to ALU Opcode, which is determined from the Opcode or Function Code. We have jr, jal, ALUSrc, MemtoReg, MemWrite, RegWrite, RegDst, Branch, SignExt, j, & Halt.

**[Part 2 (b.iii)]** Implement your new instruction fetch logic using VHDL. Use Modelsim to test your design thoroughly to make sure it is working as expected. Describe how the execution of the control flow possibilities corresponds to the Modelsim waveforms in your writeup.

|                         |                                     |
|-------------------------|-------------------------------------|
| tb/MyMips/s_Inst        | 32b00110100000100000001001000110100 |
| Fetched Parts           |                                     |
| tb/MyMips/s_opcode      | 6b001101                            |
| tb/MyMips/s_RegInRea... | 5b00000                             |
| tb/MyMips/s_RegInRea... | 5b00000                             |
| tb/MyMips/s_RegD        | 5b00010                             |
| tb/MyMips/s_shamt       | 5b01000                             |
| tb/MyMips/s_funcCode    | 6b110100                            |
| tb/MyMips/s_imm16       | 16b0000100100011000                 |
| tb/MyMips/s_jumpAddr... | 32b00000000010000000100100011010000 |

|                         |                                     |
|-------------------------|-------------------------------------|
| tb/MyMips/s_Inst        | 32b00000100000010000000000000000011 |
| Fetched Parts           |                                     |
| tb/MyMips/s_opcode      | 6b000010                            |
| tb/MyMips/s_RegInRea... | 5b00000                             |
| tb/MyMips/s_RegInRea... | 5b00000                             |
| tb/MyMips/s_RegD        | 5b00000                             |
| tb/MyMips/s_shamt       | 5b00000                             |
| tb/MyMips/s_funcCode    | 6b000011                            |
| tb/MyMips/s_imm16       | 16b00000000000000000000000000000011 |
| tb/MyMips/s_jumpAddr... | 32b00000000010000000100100011010000 |

|                         |                                      |
|-------------------------|--------------------------------------|
| tb/MyMips/s_Inst        | 32b001101000000100110001001000110100 |
| Fetched Parts           |                                      |
| tb/MyMips/s_opcode      | 6b001101                             |
| tb/MyMips/s_RegInRea... | 5b00000                              |
| tb/MyMips/s_RegInRea... | 5b00000                              |
| tb/MyMips/s_RegD        | 5b00000                              |
| tb/MyMips/s_shamt       | 5b00000                              |
| tb/MyMips/s_funcCode    | 6b110100                             |
| tb/MyMips/s_imm16       | 16b0000100100011000                  |
| tb/MyMips/s_jumpAddr... | 32b00000000010000000100100011010000  |

|  |
|--|
|  |
|  |
|  |

In addition to the instructions tested above, every instruction was tested individually to ensure that the fetch worked as expected. These cover every control flow possibility, because literally every instruction is accounted for.

**[Part 2 (c.i.1)]** *Describe the difference between logical (srl) and arithmetic (sra) shifts. Why does MIPS not have a sla instruction?*

Srl and sra are both shift right operations, which move bits from left to right. Srl shifts logical, meaning no matter what bit 31 holds, the added bits will be zeros. Sra is arithmetic, meaning if bit 31 is a one, then ones will be used to shift in, otherwise zeros will. Arithmetic allows the shifting of negative numbers by keeping the number negative but still shifting bits. Sla is not needed since bit 0 does not affect the sign of the number and thus shifting in ones would hold no distinct advantage.

**[Part 2 (c.i.2)]** *In your writeup, briefly describe how your VHDL code implements both the arithmetic and logical shifting operations.*

Our VHDL code implements arithmetic and logical operations but having a control bit to a 2 to 1 mux that selects whether the added bits to the input should be 0 or input(31). This allows for shifting in strictly zeros or shifting between 0's and 1's based on whether the number is positive or negative.

**[Part 2 (c.i.3)]** *In your writeup, explain how the right barrel shifter above can be enhanced to also support left shifting operations.*

In order to add left shifting operations the only thing that needs to be added is two simple 2 to 1 muxs. The first mux selects between the normal input and the left shift input where bit 31 becomes bit 0, bit 0 becomes 31, bit 30 becomes bit 1, etc until the input is completely flipped around. Then at the output, another mux is used to either keep the normal right shift output, or flip the left shift output back around to be left shifted. Lastly, the control bit for logical versus arithmetic needs to be always set to zero for srl.

**[Part 2 (c.i.4)]** Describe how the execution of the different shifting operations corresponds to the Modelsim waveforms in your writeup.

Below shows the output from the barrel shifter test bench. `I_shift_type` dictates logical(0) or arithmetic(1). `I_shift_dir` dictates shift right(0) versus shift left(1). Finally, `i_shamt` controls shift amount. As can be seen in the waveforms when `srl` is done no 1's are shifted in, but when `sra` is used for the same number 1's are. Shifting left is also shown to work once `i_shift_dir` changes to one.



**[Part 2 (c.ii.1)]** In your writeup, briefly describe your design approach, including any resources you used to choose or implement the design. Include at least one design decision you had to make.

The design approach of the ALU mainly involved breaking down the functionality into smaller parts based on each opcode needed and then combining those parts into one big file. The first hurdle was deciding on whether to implement all functionality as separate blocks or directly implement them inside the ALU. After discussing with one another it was decided that we would have a mix. The xor, or, and, nor, and lui were all implemented inside the ALU since their implementation was quick and easy using modelsims logic operations within processes and setting portions of inputs to outputs (for lui). The harder to implement units like the adder\_subtractor, barrel\_shifter, and beq\_bne were implemented separately since they were more complicated and in order for ease of readily inside the ALU they were created and tested separately.

**[Part 2 (c.ii.2)]** Describe how the execution of the different operations corresponds to the Modelsim waveforms in your writeup.

The only other components that were tested separately were the adder\_subtractor unit and the beq\_bne unit. The adder\_subtractor unit was extensively tested in previous labs and only an overflow needed to be added to it so the unit was not tested again. The beq\_bne unit was tested by taking a single 32 bit input (assuming that it was already xored by the alu) and output a 1 based on the value of is\_equal\_type where ( $iF == 0$ ) when 1 and ( $iF != 0$ ) when 0 as can be seen below.



**[Part 2 (c.iii)]** Draw a simplified, high-level schematic for the 32-bit ALU. Consider the following questions: how is Overflow calculated? How is Zero calculated? How is  $s_{lt}$  implemented?



Sl<sub>t</sub> was calculated using the subtraction of i\_B from i\_A and then taking the signed bit of the output. In order to account for overflow bit 31 of the add\_sub unit output was anded with the overflow bit as shown. Zero was calculated by taking the xor at the output of the alu\_out\_control and then using an or tree to see if it was equal to zero. Finally, the overflow was calculated inside the add\_sub unit by xoring Cout(31) with Cout(30).

**[Part 2 (c.v)]** *Describe how the execution of the different operations corresponds to the Modelsim waveforms in your writeup.*

Each opcode was tested with a variety of cases including common and edge. Below shows what each i\_aluOp corresponds to in terms of alu function and from there the expected output can be calculated and compared to the designated output.

|      |      |     |      |
|------|------|-----|------|
| addu | 0000 | sll | 1001 |
| subu | 0001 | srl | 1000 |
| and  | 0010 | sra | 1010 |
| or   | 0011 | beq | 1011 |
| xor  | 0100 | bne | 1100 |
| nor  | 0101 |     | 1101 |
| lui  | 0110 | add | 1110 |
| slt  | 0111 | sub | 1111 |







**[Part 2 (c.viii)]** justify why your test plan is comprehensive. Include waveforms that demonstrate your test programs functioning.

The test bench run below using the datapath from lab2 is comprehensive and tests all aspects of the alu. The expected output is 16383 which matches up with the value that i\_rtData holds on the last clock cycle. A buffer step that lasted a full clock cycle in between each MIPS instruction was added in which i\_rtData was set to the output of the register that was just updated for easy error checking. The test plan is comprehensive because it tests a variety of ordinary cases for all functions as well as possible edge cases, including overflow, shifting right logical when the 31 bit is a 1, and shifting right arithmetic when the 31 bit is a 1.

The following MIPS code represents all the functions run through the tb.

```

addi $1, $0, 1
addi $2, $0, 2
addi $3, $0, 3
addi $4, $0, 4
addi $6, $0, 6
addi $7, $0, 7
addi $8, $0, 8
addi $9, $0, 9
addi $10, $0, 10
xor $11, $1, $2
or $12, $11, $3
nor $13, $12, $4
slt $14, $13, $5
add $15, $14, $6
sub $16, $15, $7
add $17, $16, $8
sub $18, $17, $9
add $19, $18, $10
addi $20, $0, -35

```

```

add $21, $19, $20
lui $22, 0x8000
add $22, $21, $22
sll $22, 2
sra $22, 2
srl $22, 18
and $22, $22, $21

```





**[Part 3]** In your writeup, show the Modelsim output for each of the following tests, and provide a discussion of result correctness. It may be helpful to also annotate the waveforms directly.



The screenshot shows the ModelSim interface with two windows. The main window displays the command-line output of the simulation process. The right-hand window is titled 'Testing Framework' and lists several assembly files: DS\_Store, addiseq.s, fibonacci.s, grendel.s, lab3Seq.s, proj1\_base\_test.s (which is checked), proj1\_bubblesort.s (which is checked), proj1\_cf\_test.s (which is checked), and simplebranch.s. The 'Run' button is highlighted in green at the bottom of the framework window.

```
U:\cpre381\CPRE381Project\cpre381-toolflow\TestingFramework.exe
Skipping compilation
Starting VHDL Simulation...
Successfully simulated program!
Victory!! Your processes matches MARS expected output with no mismatches!!
U
Adding Asm Files to Queue
Starting Simulation for : mips\proj1_base_test.s
Skipping compilation
Starting VHDL Simulation...
Successfully simulated program!
Victory!! Your processes matches MARS expected output with no mismatches!!
Adding Asm Files to Queue
Starting Simulation for : mips\proj1_bubblesort.s
Skipping compilation
Starting VHDL Simulation...
Successfully simulated program!
Victory!! Your processes matches MARS expected output with no mismatches!!
Adding Asm Files to Queue
Starting Simulation for : mips\proj1_cf_test.s
Skipping compilation
Starting VHDL Simulation...
Successfully simulated program!
Victory!! Your processes matches MARS expected output with no mismatches!!
```

**[Part 3 (a)]** Create a test application that makes use of every required arithmetic/logical instruction at least once. The application need not perform any particularly useful task, but it should demonstrate the full functionality of the processor (e.g., sequences of many instructions executed sequentially, 1 per cycle while data written into registers can be effectively retrieved and used by later instructions). Name this file Proj1\_base\_test.s.

This test is a basic tour in showing off MIPS assembly instructions, it tests all of the required logic and arithmetic instructions implemented in our processor.



The base tests ran without issue. The testing framework threw no errors and examination of the waveform logs shows everything where it needs to be. The correct instructions run, it interprets pseudoinstructions correctly, it reads the correct registers and the ALU comes up with the correct results, which are stored in the correct registers.



Every relevant register and signal were as expected.

**[Part 3 (b)]** Create and test an application which uses each of the required control-flow instructions and has a call depth of at least 5 (i.e., the number of activation records on the stack is at least 4). Name this file Proj1\_cf\_test.s.

This test uses nested functions with individual activation records. The first activation record is inputted with \$a0-2. The stack is pushed back 16 bits, the return address is located at 0(\$sp), followed by the three values. The function adds up the sum of each value and its subordinates (that is, \$a2 = \$a2 + \$a1 + \$a0; \$a1 = \$a1 + \$a0; \$a0 = \$a0 + 1), which is then stored in memory at their respective original addresses, the \$a register values are updated to their new sums and the function calls itself again, pushing the stack forward and repeating the process until \$a2 is greater than 0xf00. At this point the function begins to shrink the stack and add each \$a0-2 into \$v1, when \$a0 is 1 the function returns to the original \$ra from the first jal and finishes the test.



Once again the test ran without issue.



Every relevant register and signal were as expected.

When jal is run, the jal signal goes high, ALUSrc goes low, the PC is updated to the specified address loaded form the immediate mux, and the value of PC before the update is written to \$ra + 4. No issues here.

When beq is run, the processor puts the desired imm value into the \$at register before running beq and reading the values of rd and \$at to determine equality. The branch signal goes high.

Following that are 4 instances of the sw instruction being run, once again no issues. These tests show that our processor follows its requirements.

**[Part 3 (c)]** Create and test an application that sorts an array with  $N$  elements using the BubbleSort algorithm ([link](#)). Name this file Proj1\_bubblesort.s.

This assembly function uses the simple and inefficient bubblesort sorting algorithm. It makes room on the stack to store the array, then iterates over 0 to  $N-1$  elements. At each element, it checks to see if  $N + 1$  is less than  $N$ , if so it switches the values, otherwise it skips to the beginning of the loop again. Since elements may be more than 1 position away from their ideal position, the bubblesort function must iterate over the array until it goes through the entire array without switching any elements.







No issues running the bubble sort test on our CPU in the testing framework. Examining the waveform logs shows correct behavior.





Every relevant register and signal were as expected.

In these specific instructions, the processor begins by checking the value of \$t7 which is abstracted as a boolean swapped? value. If it is 0, it means that the processor did not switch any elements during the prior traverse of the array. This means the array is sorted, it would jump to the end of the function if this was the case. In this specific section, swapped is 1 and the process will continue. The processor uses \$t8 as the moving pointer to the current element,  $a[i]$ . It loads  $a[i]$  and  $a[i+1]$  into \$t0 and \$t1 respectively. It uses slt to determine if  $a[i+1] < a[i]$ . If it is not the case, it jumps past the swapping logic since the elements are sorted already. If it is the case, the processor stores \$t1 in  $0(\$t8)$  and \$t0 in  $4(\$t8)$ . It then sets the value of \$t7, the swapped value, to 1, as this array will require at least another traverse in order to sort it with this algorithm.

Further analysis shows expected behavior across all of the relevant elements of the processor. At the end, it successfully sorts the array  $\{5, 4, 3, 2, 1\}$  into  $\{1, 2, 3, 4, 5\}$  requiring 5 traverses.

|                         |              |          |          |          |          |          |
|-------------------------|--------------|----------|----------|----------|----------|----------|
| .../b/MyMips/s_DMemAddr | 32'h00000000 | 10010FF8 | 10010FEC | 10010FF0 | 10010FF4 | 10010FF8 |
| .../b/MyMips/s_DMemData | 32'h00000000 | 00000004 | 00000005 | 00000000 | 00000000 | 00000000 |
| .../b/MyMips/s_DMemOut  | 32'h00000005 | 00000001 | 00000002 | 00000003 | 00000004 | 00000005 |

All in all, our processor successfully passed testing.

**[Part 4]** Report the maximum frequency your processor can run at and determine what your critical path is. Draw this critical path on top of your top-level schematics. What components would you focus on to improve the frequency?



The critical path of the processor is highlighted in pen, and numerically comes out to 42.834 nanoseconds.

Interestingly enough, the critical path is not reading from the data memory (which is what I had assumed), but actually the branching instructions. After reviewing our code, I found why: when we branch, we are essentially running it through the ALU a second time by comparing the ALUresult to zero.

In order to improve efficiency, we would definitely focus on improving the branch component inside the ALU, as it would be fairly easy to make it run much faster.