

## Procedure

This lab had two tasks. Task 1 integrates the register file and ALU into the datapath of a single-cycle ARM processor and verifies the updated design in simulation. Task 2 adds the CMP/SUBS instruction and conditional branching, which lets a program make decisions based on the result of a comparison.

### Task 1:

The first step is to integrate the register file and ALU modules into the datapath of the ARM processor. Before doing this, we made sure we understood the function of each signal and how each instruction is encoded (Table 1). We then instantiated reg\_file and alu in arm following the given schematic (Figure 1); the resulting code is shown in Figure 2.

| Instruction | Format            | Description             | Bits                                     |
|-------------|-------------------|-------------------------|------------------------------------------|
| ADD         | ADD R1, R2, R3    | $R[D] = R[A] + R[B]$    | 1110 000 0100 0 AAAA DDDD 0000 0000 BBBB |
|             | ADD R1, R2, #10   | $R[D] = R[A] + I$       | 1110 001 0100 0 AAAA DDDD 0000 IIII IIII |
| SUB         | SUB R1, R2, R3    | $R[D] = R[A] - R[B]$    | 1110 000 0010 0 AAAA DDDD 0000 0000 BBBB |
|             | SUB R1, R2, #10   | $R[D] = R[A] - I$       | 1110 001 0010 0 AAAA DDDD 0000 IIII IIII |
| AND         | AND R1, R2, R3    | $R[D] = R[A] \& R[B]$   | 1110 000 0000 0 AAAA DDDD 0000 0000 BBBB |
| ORR         | ORR R1, R2, R3    | $R[D] = R[A] \mid R[B]$ | 1110 000 1100 0 AAAA DDDD 0000 0000 BBBB |
| LDR         | LDR R1, [R2, #10] | $R[D] = MEM[R[A] + I]$  | 1110 010 1100 1 AAAA DDDD IIII IIII IIII |
| STR         | STR R1, [R2, #10] | $MEM[R[A] + I] = R[D]$  | 1110 010 1100 0 AAAA DDDD IIII IIII IIII |
| B           | B TAG             | $PC = PC + (I \ll 2)$   | 1110 1010 IIII IIII IIII IIII IIII IIII  |

Table 1: Processor Base Instruction Set
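The bit layout in Table 1 can be checked with a short script. The following is a minimal Python sketch, not part of the lab code: the function name `encode_dp` and its parameters are our own, and the field positions (cond in bits 31:28, op in 27:25, cmd in 24:21, S in bit 20, A in 19:16, D in 15:12) are read off the table.

```python
# Command field (bits 24:21) for each data-processing instruction in Table 1.
CMD = {"ADD": 0b0100, "SUB": 0b0010, "AND": 0b0000, "ORR": 0b1100}


def encode_dp(mnemonic, rd, rn, operand, immediate=False, set_flags=False, cond=0b1110):
    """Encode a data-processing instruction as a 32-bit word.

    operand is a register number (0-15) or, if immediate is True, an
    8-bit immediate. Hypothetical helper, for illustration only.
    """
    limit = 0xFF if immediate else 0xF
    if not 0 <= operand <= limit:
        raise ValueError("operand out of range")
    op = 0b001 if immediate else 0b000  # bits 27:25
    return (
        (cond << 28)
        | (op << 25)
        | (CMD[mnemonic] << 21)
        | (int(set_flags) << 20)
        | (rn << 16)
        | (rd << 12)
        | operand
    )


# ADD R0, R15, #0 is the first instruction of the test program.
print(hex(encode_dp("ADD", rd=0, rn=15, operand=0, immediate=True)))
```

Setting `set_flags=True` on a SUB gives the CMP/SUBS encoding used in Task 2, since the two differ only in bit 20.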



Figure 1: Single-cycle ARM processor

```

reg_file u_reg_file (
    .clk        (clk),
    .wr_en      (RegWrite),
    .write_data (Result),
    .write_addr (Instr[15:12]),
    .read_addr1 (RA1),
    .read_addr2 (RA2),
    .read_data1 (RD1),
    .read_data2 (RD2)
);

alu u_alu (
    .a          (SrcA),
    .b          (SrcB),
    .ALUControl (ALUControl),
    .Result     (ALUResult),
    .ALUFlags   (ALUFlags)
);

```

**Figure 2: The code for inserting reg\_file and alu into arm**

### Task 2:

The second task adds a new instruction, SUBS/CMP, which performs a subtraction and also saves the resulting ALU flags in a dedicated register called FlagsReg (Table 2). The saved flags are then used to evaluate the condition of a conditional branch.

| Instruction | Format          | Description              | Bits                                     |
|-------------|-----------------|--------------------------|------------------------------------------|
| CMP         | CMP R1, R2, R3  | R[D] = R[A] - R[B], Flag | 1110 000 0010 1 AAAA DDDD 0000 0000 BBBB |
|             | CMP R1, R2, #10 | R[D] = R[A] - I, Flag    | 1110 001 0010 1 AAAA DDDD 0000 IIII IIII |
| BXX         | BXX TAG         | PC = PC+(I<<2) if COND   | COND 1010 IIII IIII IIII IIII IIII IIII  |
|             | B               | Unconditional            | 1110                                     |
|             | BEQ             | Equal                    | 0000                                     |
|             | BNE             | Not Equal                | 0001                                     |
|             | BGE             | Greater or Equal         | 1010                                     |
|             | BGT             | Greater                  | 1100                                     |
|             | BLE             | Less or Equal            | 1101                                     |
|             | BLT             | Less                     | 1011                                     |

**Table 2: Added command and branching conditions**

To implement this, we added a new module, FlagsReg, and instantiated it in arm. The register stores the flags produced by the most recent CMP instruction: its always\_ff block updates the stored value on a rising clock edge only when the write enable is asserted. In arm, that enable is the new control signal FlagWrite ANDed with CondEx (see Figure 3).

```
FlagsReg u_flags_reg (.clk, .FlagWrite(FlagWrite & CondEx), .write_data(ALUFlags), .read_data(StatusFlag));
```

```

// Junchao Zhou, Chenhan Dai
// 04/19/2023
// EE469
// Lab #2, Task 2

// FlagsReg is a flag register for updating flags
// Update flags only when the FlagWrite signal is true
// Output asynchronously

// clk - system clock, same as the processor
// FlagWrite - write enable, allows write_data to overwrite the stored 4-bit flags
// write_data - the 4-bit flag value to be stored
// read_data - the flags currently stored
module FlagsReg(input  logic       clk,
                input  logic       FlagWrite,
                input  logic [3:0] write_data,
                output logic [3:0] read_data);

    // flag storage
    logic [3:0] memory;

    // write port
    always_ff @(posedge clk) begin
        if (FlagWrite)
            memory <= write_data;
    end

    // asynchronous read
    assign read_data = memory;

endmodule

```

**Figure 3: The code for FlagsReg**

We also updated the control decoder so that the SUB entry handles CMP as well: bit 20 of the instruction drives FlagWrite, so the flags are saved only for CMP/SUBS (see Figure 4).

```

// SUB/CMP (Imm or Reg)
8'b00?_0010_? : begin // note that we use wildcard "?"
    PCSrc      = 0;
    MemtoReg   = 0;
    MemWrite   = 0;
    ALUSrc     = Instr[25]; // may use immediate
    FlagWrite  = Instr[20]; // may write flag
    RegWrite   = 1;
    RegSrc     = 'b00;
    ImmSrc     = 'b00;
    ALUControl = 'b01;
end

```

**Figure 4: the updated SUB/CMP instruction**

We also added logic that evaluates the conditions EQ, NE, GE, GT, LE, and LT from the saved flag bits and produces the signal CondEx. StatusFlag uses the same layout as ALUFlags: bit 3 is negative, bit 2 is zero, bit 1 is carry, and bit 0 is overflow. The branch entry of the decoder sets PCSrc to CondEx, so a branch is taken only when its condition holds and otherwise falls through to the next instruction (see Figures 5 and 6).

```

always_comb begin
    // StatusFlag = {negative, zero, carry, overflow}
    case (Instr[31:28])
        // EQ: Equal
        4'b0000: CondEx = StatusFlag[2];
        // NE: Not equal
        4'b0001: CondEx = ~StatusFlag[2];
        // GE: Greater or Equal
        4'b1010: CondEx = StatusFlag[3] ~^ StatusFlag[0];
        // LT: Less
        4'b1011: CondEx = StatusFlag[3] ^ StatusFlag[0];
        // GT: Greater
        4'b1100: CondEx = ~StatusFlag[2] & (StatusFlag[3] ~^ StatusFlag[0]);
        // LE: Less or Equal
        4'b1101: CondEx = StatusFlag[2] | (StatusFlag[3] ^ StatusFlag[0]);
        // Unconditional: always execute
        4'b1110: CondEx = 1;
        default: CondEx = 0;
    endcase
end

```

**Figure 5:** the logic to compute the conditions defined by EQ, NE, GE, GT, LE and LT
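The condition logic can also be written as a small truth-table model for checking individual cases by hand. This is a minimal Python sketch that mirrors Figure 5; the function name `cond_ex` is our own, and the flag layout (negative in bit 3, zero in bit 2, carry in bit 1, overflow in bit 0) is taken from the ALU in the appendix.

```python
def cond_ex(cond, flags):
    """Return True if condition code `cond` holds for the 4-bit `flags`.

    flags is packed as {N, Z, C, V} with N in bit 3, as in StatusFlag.
    Unlisted condition codes evaluate to False, like the default case.
    """
    n = (flags >> 3) & 1
    z = (flags >> 2) & 1
    v = flags & 1
    ge = n == v  # signed "greater or equal" after a subtraction
    table = {
        0b0000: z == 1,            # EQ
        0b0001: z == 0,            # NE
        0b1010: ge,                # GE
        0b1011: not ge,            # LT
        0b1100: z == 0 and ge,     # GT
        0b1101: z == 1 or not ge,  # LE
        0b1110: True,              # unconditional
    }
    return table.get(cond, False)


# Flags after CMP R9, R6, #15 with R6 = 15: the result is zero, so Z is set.
print(cond_ex(0b0000, 0b0110), cond_ex(0b0001, 0b0110))
```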

```

// B/BXX
8'b1010_???? : begin
    PCSrc      = CondEx; // branch only if the condition holds
    MemtoReg   = 0;
    MemWrite   = 0;
    ALUSrc     = 1;
    FlagWrite  = 0;
    RegWrite   = 0;
    RegSrc     = 'b01;
    ImmSrc     = 'b10;
    ALUControl = 'b00; // do an add
end

```

**Figure 6:** the updated B/BXX instruction
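For a taken branch, the ALU adds the shifted, sign-extended immediate to PC + 8 (the value read from R15). The following minimal Python sketch reproduces that arithmetic; the helper names `branch_target` and `branch_imm24` are our own and exist only for illustration.

```python
MASK32 = 0xFFFFFFFF


def branch_target(pc, imm24):
    """Target address of a B/BXX at byte address pc with a 24-bit immediate."""
    imm24 &= 0xFFFFFF
    if imm24 & 0x800000:  # sign-extend the 24-bit field
        imm24 -= 1 << 24
    return (pc + 8 + (imm24 << 2)) & MASK32


def branch_imm24(pc, target):
    """Immediate field that makes a branch at pc land on target."""
    return ((target - (pc + 8)) >> 2) & 0xFFFFFF


# B SKIP sits at PC 36 and SKIP is at PC 48, so the immediate is 1.
print(branch_imm24(36, 48), branch_target(36, 1))
```

A branch to itself, such as `LOOP B LOOP`, needs an immediate of -2 (0xFFFFFE), which is why SrcB reads FFFFFFF8 for that instruction in simulation.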

## Result

### Task 1:

The register file and ALU were added to arm, and the processor was simulated in ModelSim. The simulated instructions are read from memfile.dat (Figure 7).

```
MAIN          ADD R0, R15, #0
              SUB R1, R0, R0
              ADD R2, R1, #10
              ADD R3, R0, R2
              SUB R4, R2, #3
              SUB R5, R3, R4
              ORR R6, R4, R5
              AND R7, R6, R5
              STR R7, [R1, #0]
              B SKIP
              STR R1, [R1, #0]
              B LOOP
SKIP          LDR R8, [R1, #0]
LOOP          B LOOP
```

Figure 7: Base Processor Testing Program “memfile.dat”

The simulation produced a waveform showing the following signals, from top to bottom: clock (clk), reset (rst), program counter (PC), instruction (Instr), ALU result (ALUResult), data to be written (WriteData), memory write enable (MemWrite), and data read from memory (ReadData). To make the waveform easier to compare against the program in memfile.dat, PC is displayed in decimal, MemWrite and RegWrite in binary, and all other signals in hexadecimal. A screenshot of the waveform is shown in Figure 8.



Figure 8: The waveform generated by the Processor.

The completed Table 3 follows. PC is given in decimal; all other values are hexadecimal.

| Cycle | PC | Instr            | SrcA | SrcB     | ALUResult | WriteData  | Read Data | Mem Write | RegWrite | Result |
|-------|----|------------------|------|----------|-----------|------------|-----------|-----------|----------|--------|
| 1     | 00 | ADD R0, R15, #0  | 8    | 0        | 8         | Don't Care | X         | 0         | 1        | 8      |
| 2     | 04 | SUB R1, R0, R0   | 8    | 8        | 0         | 8          | X         | 0         | 1        | 0      |
| 3     | 08 | ADD R2, R1, #10  | 0    | A        | A         | Don't Care | X         | 0         | 1        | A      |
| 4     | 12 | ADD R3, R0, R2   | 8    | A        | 12        | A          | X         | 0         | 1        | 12     |
| 5     | 16 | SUB R4, R2, #3   | A    | 3        | 7         | 12         | X         | 0         | 1        | 7      |
| 6     | 20 | SUB R5, R3, R4   | 12   | 7        | B         | 7          | X         | 0         | 1        | B      |
| 7     | 24 | ORR R6, R4, R5   | 7    | B        | F         | B          | X         | 0         | 1        | F      |
| 8     | 28 | AND R7, R6, R5   | F    | B        | B         | B          | X         | 0         | 1        | B      |
| 9     | 32 | STR R7, [R1, #0] | 0    | 0        | 0         | B          | X         | 1         | 0        | 0      |
| 10    | 36 | B SKIP (B #1)    | 2C   | 4        | 30        | 0          | X         | 0         | 0        | 30     |
| 11    | 48 | LDR R8, [R1, #0] | 0    | 0        | 0         | Don't Care | B         | 0         | 1        | B      |
| 12    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 13    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 14    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 15    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 16    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 17    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 18    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |
| 19    | 52 | B LOOP (B #-2)   | 3C   | FFFFFFF8 | 34        | Don't Care | X         | 0         | 0        | 34     |

**Table 3. First nineteen cycles of executing memfile.dat**
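As a cross-check on the ALUResult column for the first eight cycles, the arithmetic can be replayed in software. The following is a minimal Python sketch, not the lab's SystemVerilog: the tuple format and the `run` function are our own, registers are assumed to start at 0 (the simulated register file starts undefined), and R15 reads as PC + 8 as in the datapath.

```python
MASK32 = 0xFFFFFFFF

# (operation, Rd, Rn, operand2, operand2_is_immediate) for the first
# eight instructions of memfile.dat (Figure 7).
PROGRAM = [
    ("ADD", 0, 15, 0, True),
    ("SUB", 1, 0, 0, False),
    ("ADD", 2, 1, 10, True),
    ("ADD", 3, 0, 2, False),
    ("SUB", 4, 2, 3, True),
    ("SUB", 5, 3, 4, False),
    ("ORR", 6, 4, 5, False),
    ("AND", 7, 6, 5, False),
]

ALU_OPS = {
    "ADD": lambda a, b: (a + b) & MASK32,
    "SUB": lambda a, b: (a - b) & MASK32,
    "AND": lambda a, b: a & b,
    "ORR": lambda a, b: a | b,
}


def run(program):
    """Execute straight-line code and return (registers, ALU results)."""
    regs = [0] * 16
    results = []
    for index, (op, rd, rn, operand, is_imm) in enumerate(program):
        pc = 4 * index

        def read(reg):
            # Reading R15 yields PC + 8, as in the datapath.
            return (pc + 8) & MASK32 if reg == 15 else regs[reg]

        src_a = read(rn)
        src_b = operand if is_imm else read(operand)
        result = ALU_OPS[op](src_a, src_b)
        regs[rd] = result
        results.append(result)
    return regs, results


regs, results = run(PROGRAM)
print([format(value, "X") for value in results])
```

R7 ends up holding 0xB, which is the value the STR writes to memory and the LDR later reads back into R8.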

### Task 2:

CMP/SUBS and conditional branching were added, and the processor was simulated in ModelSim. The simulated instructions are read from memfile2.dat (Figure 9). We wrote the expected PC sequence in the left-hand column of the figure.

| Expected PC sequence (PC, instruction)      | Label | memfile2.dat  |
|---------------------------------------------|------|------------------|
| 0      ADD R0, R15, #0                      | MAIN | ADD R0, R15, #0  |
| 4      SUB R1, R0, R0                       |      | SUB R1, R0, R0   |
| 8      ADD R2, R1, #10                      |      | ADD R2, R1, #10  |
| 12     ADD R3, R0, R2                       |      | ADD R3, R0, R2   |
| 16     SUB R4, R2, #3                       |      | SUB R4, R2, #3   |
| 20     SUB R5, R3, R4                       |      | SUB R5, R3, R4   |
| 24     ORR R6, R4, R5                       |      | ORR R6, R4, R5   |
| 28     AND R7, R6, R5                       |      | AND R7, R6, R5   |
| 32     STR R7, [R1, #0]                     |      | STR R7, [R1, #0] |
| <b>36</b> B SKIP                            |      | B SKIP           |
| <b>48</b> <b>SKIP</b> LDR R8, [R1, #0]      |      | STR R1, [R1, #0] |
| <b>52</b> <b>B_START</b> CMP R9, R6, #15    |      | B LOOP           |
| 56     BNE B_START                          |      | LDR R8, [R1, #0] |
| 60     CMP R9, R5, R4                       |      | CMP R9, R6, #15  |
| <b>64</b> BNE BNE_TESTED                    |      | BNE B_START      |
| <b>72</b> <b>BNE_TESTED</b> CMP R9, R2, R3  |      | CMP R9, R5, R4   |
| 76     BGE B_START                          |      | BNE BNE_TESTED   |
| 80     CMP R9, R3, R2                       |      | B B_START        |
| <b>84</b> BGE BGE_TESTED                    |      | CMP R9, R2, R3   |
| <b>92</b> <b>BGE_TESTED</b> CMP R9, R3, R2  |      | BGE BGE_TESTED   |
| 96     BLE B_START                          |      | B B_START        |
| 100    CMP R9, R2, R3                       |      | CMP R9, R3, R2   |
| 104    BLE BLE_TESTED                       |      | BGE B_START      |
| <b>112</b> <b>BLE_TESTED</b> ADD R8, R1, #1 |      | CMP R9, R2, R3   |
| <b>116</b> <b>LOOP</b> B LOOP               |      | BLE BLE_TESTED   |
|                                             |      | B B_START        |
|                                             |      | ADD R8, R1, #1   |
|                                             |      | B LOOP           |

Figure 9: Updated processor testing program “memfile2.dat” (right) and the expected PC sequence (left)

The simulation produced a waveform showing the following signals, from top to bottom: clock (clk), reset (rst), program counter (PC), instruction (Instr), ALU result (ALUResult), data to be written (WriteData), memory write enable (MemWrite), and data read from memory (ReadData). A screenshot of the waveform is shown in Figure 10. Because the screenshot is large, it is split into two sections: the upper section covers PC 0 to PC 92, and the lower section covers PC 48 to PC 116.



**Figure 10: The waveform generated by the Processor**
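The jumps in the expected PC sequence follow from the register values left by the first eight instructions. The following minimal Python sketch re-derives which conditional branches are taken; the data layout and the function `taken_branches` are our own, the register values are those from Table 3 written in decimal, and plain integer comparison stands in for the flag logic, which is valid here only because all operands are small positive numbers.

```python
# Register values after the first eight instructions (Table 3), in decimal.
REGS = {"R2": 10, "R3": 18, "R4": 7, "R5": 11, "R6": 15}

# Outcome of each condition when evaluated after CMP a, b.
HOLDS = {
    "NE": lambda a, b: a != b,
    "GE": lambda a, b: a >= b,
    "LE": lambda a, b: a <= b,
}

# (PC of the branch, condition, first CMP operand, second CMP operand),
# following the expected PC sequence in Figure 9.
CHECKS = [
    (56, "NE", REGS["R6"], 15),
    (64, "NE", REGS["R5"], REGS["R4"]),
    (76, "GE", REGS["R2"], REGS["R3"]),
    (84, "GE", REGS["R3"], REGS["R2"]),
    (96, "LE", REGS["R3"], REGS["R2"]),
    (104, "LE", REGS["R2"], REGS["R3"]),
]


def taken_branches(checks):
    """Return the PCs of the conditional branches whose condition holds."""
    return [pc for pc, cond, a, b in checks if HOLDS[cond](a, b)]


print(taken_branches(CHECKS))
```

The branches at PC 64, 84, and 104 are taken, which matches the sequence skipping from 64 to 72, from 84 to 92, and from 104 to 112.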

## Appendix: SystemVerilog code

### 1) arm.sv

```

// Junchao Zhou, Chenhan Dai
// 04/19/2023
// EE469
// Lab #2, Task 1, 2

/* arm is the spotlight of the show and contains the bulk of the datapath and control logic. This module is split into two parts, the datapath and control.
*/

// clk - system clock
// rst - system reset
// Instr - incoming 32 bit instruction from imem, contains opcode, condition, addresses and or immediates
// ReadData - data read out of the dmem
// WriteData - data to be written to the dmem
// MemWrite - write enable to allow WriteData to overwrite an existing dmem word
// PC - the current program count value, goes to imem to fetch instruction
// ALUResult - result of the ALU operation, sent as address to the dmem

module arm (
    input  logic        clk, rst,
    input  logic [31:0] Instr,
    input  logic [31:0] ReadData,
    output logic [31:0] WriteData,
    output logic [31:0] PC, ALUResult,
    output logic        MemWrite
);

    // datapath buses and signals
    logic [31:0] PCPrime, PCPlus4, PCPlus8; // pc signals
    logic [3:0]  RA1, RA2;                  // regfile input addresses
    logic [31:0] RD1, RD2;                  // raw regfile outputs
    logic [3:0]  ALUFlags;                  // alu combinational flag outputs
    logic [3:0]  StatusFlag;                // flags saved by FlagsReg
    logic [31:0] ExtImm, SrcA, SrcB;        // immediate and alu inputs
    logic [31:0] Result;                    // computed or fetched value to be written into regfile or pc

    // control signals
    logic PCSrc, MemtoReg, ALUSrc, RegWrite, CondEx, FlagWrite;
    logic [1:0] RegSrc, ImmSrc, ALUControl;


    /* The datapath consists of a PC as well as a series of muxes to make decisions about which data words to pass forward and operate on. It is
    ** noticeably missing the register file and alu, which you will fill in using the modules made in lab 1. To correctly match up signals to the
    ** ports of the register file and alu take some time to study and understand the logic and flow of the datapath.
    */

    //-------------------------------------------------------------------------------
    //                                   DATAPATH
    //-------------------------------------------------------------------------------

    assign PCPrime = PCSrc ? Result : PCPlus4; // mux, use either default or newly computed value
    assign PCPlus4 = PC + 'd4;                 // default value to access next instruction
    assign PCPlus8 = PCPlus4 + 'd4;            // value read when reading from reg[15]

    // update the PC, at rst initialize to 0
    always_ff @(posedge clk) begin
        if (rst) PC <= '0;
        else     PC <= PCPrime;
    end

    // determine the register addresses based on control signals
    // RegSrc[0] is set if doing a branch instruction
    // RegSrc[1] is set when doing memory instructions
    assign RA1 = RegSrc[0] ? 4'd15        : Instr[19:16];
    assign RA2 = RegSrc[1] ? Instr[15:12] : Instr[3:0];

    // Register file with 16 registers,
    // one write port and two read ports selected by the corresponding addresses
    reg_file u_reg_file (
        .clk        (clk),
        .wr_en      (RegWrite),
        .write_data (Result),
        .write_addr (Instr[15:12]),
        .read_addr1 (RA1),
        .read_addr2 (RA2),
        .read_data1 (RD1),
        .read_data2 (RD2)
    );

    // Flag register
    // Saves the ALU flags into StatusFlag when both FlagWrite and CondEx are true
    FlagsReg u_flags_reg (
        .clk        (clk),
        .FlagWrite  (FlagWrite & CondEx),
        .write_data (ALUFlags),
        .read_data  (StatusFlag)
    );


    // two muxes, put together into an always_comb for clarity
    // determines which set of instruction bits are used for the immediate
    always_comb begin
        if      (ImmSrc == 'b00) ExtImm = {{24{Instr[7]}}, Instr[7:0]};        // 8 bit immediate - reg operations
        else if (ImmSrc == 'b01) ExtImm = {20'b0, Instr[11:0]};                // 12 bit immediate - mem operations
        else                     ExtImm = {{6{Instr[23]}}, Instr[23:0], 2'b00}; // 24 bit immediate - branch operation
    end


    // WriteData and SrcA are direct outputs of the register file, whereas SrcB is chosen between reg file output and the immediate
    assign WriteData = (RA2 == 'd15) ? PCPlus8 : RD2; // substitute the 15th regfile register for PC
    assign SrcA      = (RA1 == 'd15) ? PCPlus8 : RD1; // substitute the 15th regfile register for PC
    assign SrcB      = ALUSrc ? ExtImm : WriteData;   // determine alu operand to be either from reg file or from immediate


    // ALU
    // with two input sources A and B
    // controlled by the [1:0] ALUControl signal
    // 00 for ADD, 01 for SUB, 10 for AND, 11 for OR
    // Returns the computed result and flags
    alu u_alu (
        .a          (SrcA),
        .b          (SrcB),
        .ALUControl (ALUControl),
        .Result     (ALUResult),
        .ALUFlags   (ALUFlags)
    );

    // determine the result to run back to PC or the register file based on whether we used a memory instruction
    assign Result = MemtoReg ? ReadData : ALUResult; // determine whether final writeback result is from dmemory or alu

```

```

    /* The control consists of a large decoder, which evaluates the top bits of the instruction and produces the control bits
    ** which become the select bits and write enables of the system. The write enables (RegWrite, MemWrite and PCSrc) are
    ** especially important because they are representative of your processor's current state.
    */
    //-------------------------------------------------------------------------------
    //                                    CONTROL
    //-------------------------------------------------------------------------------

    always_comb begin

        // Decoder for CondEx
        // Result is based on the condition field of the instruction
        // StatusFlag = {negative, zero, carry, overflow}
        case (Instr[31:28])

            // EQ: Equal
            4'b0000: CondEx = StatusFlag[2];

            // NE: Not equal
            4'b0001: CondEx = ~StatusFlag[2];

            // GE: Greater or Equal
            4'b1010: CondEx = StatusFlag[3] ~^ StatusFlag[0];

            // LT: Less
            4'b1011: CondEx = StatusFlag[3] ^ StatusFlag[0];

            // GT: Greater
            4'b1100: CondEx = ~StatusFlag[2] & (StatusFlag[3] ~^ StatusFlag[0]);

            // LE: Less or Equal
            4'b1101: CondEx = StatusFlag[2] | (StatusFlag[3] ^ StatusFlag[0]);

            // Unconditional: always execute
            4'b1110: CondEx = 1;

            default: CondEx = 0;

        endcase

        casez (Instr[27:20])

            // ADD (Imm or Reg)
            8'b00?_0100_0 : begin // note that we use wildcard "?" in bit 25. That bit decides whether we use immediate or reg, but regardless we add
                PCSrc      = 0;
                MemtoReg   = 0;
                MemWrite   = 0;
                ALUSrc     = Instr[25]; // may use immediate
                FlagWrite  = 0;
                RegWrite   = 1;
                RegSrc     = 'b00;
                ImmSrc     = 'b00;
                ALUControl = 'b00;
            end

            // SUB/CMP (Imm or Reg)
            8'b00?_0010_? : begin // note that we use wildcard "?" in bit 25. That bit decides whether we use immediate or reg, but regardless we sub
                PCSrc      = 0;
                MemtoReg   = 0;
                MemWrite   = 0;
                ALUSrc     = Instr[25]; // may use immediate
                FlagWrite  = Instr[20]; // may write flag
                RegWrite   = 1;
                RegSrc     = 'b00;
                ImmSrc     = 'b00;
                ALUControl = 'b01;
            end

            // AND
            8'b000_0000_0 : begin
                PCSrc      = 0;
                MemtoReg   = 0;
                MemWrite   = 0;
                ALUSrc     = 0;
                FlagWrite  = 0;
                RegWrite   = 1;
                RegSrc     = 'b00;
                ImmSrc     = 'b00; // doesn't matter
                ALUControl = 'b10;
            end

```

```

            // ORR
            8'b000_1100_0 : begin
                PCSrc      = 0;
                MemtoReg   = 0;
                MemWrite   = 0;
                ALUSrc     = 0;
                FlagWrite  = 0;
                RegWrite   = 1;
                RegSrc     = 'b00;
                ImmSrc     = 'b00; // doesn't matter
                ALUControl = 'b11;
            end

            // LDR
            8'b010_1100_1 : begin
                PCSrc      = 0;
                MemtoReg   = 1;
                MemWrite   = 0;
                ALUSrc     = 1;
                FlagWrite  = 0;
                RegWrite   = 1;
                RegSrc     = 'b10; // msb doesn't matter
                ImmSrc     = 'b01;
                ALUControl = 'b00; // do an add
            end

            // STR
            8'b010_1100_0 : begin
                PCSrc      = 0;
                MemtoReg   = 0; // doesn't matter
                MemWrite   = 1;
                ALUSrc     = 1; // address is base register plus immediate
                FlagWrite  = 0;
                RegWrite   = 0;
                RegSrc     = 'b10; // msb doesn't matter
                ImmSrc     = 'b01;
                ALUControl = 'b00; // do an add
            end

            // B/BXX
            8'b1010_???? : begin
                PCSrc      = CondEx; // depends on CondEx
                MemtoReg   = 0;
                MemWrite   = 0;
                ALUSrc     = 1;
                FlagWrite  = 0;
                RegWrite   = 0;
                RegSrc     = 'b01;
                ImmSrc     = 'b10;
                ALUControl = 'b00; // do an add
            end

            default: begin
                PCSrc      = 0;
                MemtoReg   = 0; // doesn't matter
                MemWrite   = 0;
                ALUSrc     = 0;
                FlagWrite  = 0;
                RegWrite   = 0;
                RegSrc     = 'b00;
                ImmSrc     = 'b00;
                ALUControl = 'b00; // do an add
            end
        endcase
    end

endmodule

```

### 2) reg\_file.sv

```

// Junchao Zhou, Chenhan Dai
// 04/05/2023
// EE469
// Lab #1, Task 2

// reg_file takes clk, wr_en, 32-bit write_data, 4-bit write_addr, read_addr1, read_addr2 as inputs,
// and 32-bit read_data1, read_data2 as outputs. It defines a 16 x 32 bit memory.
// If wr_en is asserted, write_data is written into memory[write_addr] on the rising clock edge,
// and the data at read_addr1 and read_addr2 are read asynchronously.

module reg_file(input  logic        clk, wr_en,
                input  logic [31:0] write_data,
                input  logic [3:0]  write_addr,
                input  logic [3:0]  read_addr1, read_addr2,
                output logic [31:0] read_data1, read_data2);

    logic [31:0] memory [0:15];

    // Write port
    always_ff @(posedge clk) begin
        if (wr_en)
            memory[write_addr] <= write_data;
    end

    // Read Port 1
    assign read_data1 = memory[read_addr1];

    // Read Port 2
    assign read_data2 = memory[read_addr2];

endmodule

```

```

// reg_file_testbench tests three cases
// 1. the write data is written into the register file the clock cycle after wr_en is asserted
// 2. Read data is updated to the register at an address the same cycle the address was
//    provided
// 3. Read data is updated to write data at an address the cycle after the address was provided
//    if the write address is the same and wr_en was asserted

module reg_file_testbench();
    // Inputs
    logic clk, wr_en;
    logic [31:0] write_data;
    logic [3:0] write_addr, read_addr1, read_addr2;

    // Outputs
    logic [31:0] read_data1, read_data2;

    reg_file dut(.clk, .wr_en, .write_data, .write_addr,
                 .read_addr1, .read_addr2, .read_data1, .read_data2);

    always #10 clk = ~clk;

    // Initialize inputs
    initial begin
        clk = 0;
        wr_en = 0;
        write_data = 0;
        write_addr = 0;
        read_addr1 = 0;
        read_addr2 = 1;
        #10;

        // Write data is written the cycle after wr_en is asserted
        wr_en = 0;
        write_addr = 0;
        write_data = 32'h00abcdef;
        #10;

        wr_en = 1; #20;

        // Read data is updated the same cycle the address is provided
        wr_en = 0;
        write_data = 32'h11111111;
        write_addr = 1;
        #10;

        wr_en = 1;
        #10;

        read_addr1 = 0;
        read_addr2 = 1;
        #10;

        // Read data is updated the cycle after the address is provided and
        // wr_en is asserted
        write_data = 32'h22222222;
        write_addr = 0;
        wr_en = 0;
        #10;

        wr_en = 1;
        #10;

        read_addr1 = 0;
        read_addr2 = 1;
        #10;
        $stop;
    end

endmodule

```

### 3) alu.sv

```

// Junchao Zhou, Chenhan Dai
// 04/05/2023
// EE469
// Lab #1, Task 2

// alu takes 32-bit a, b and 2-bit ALUControl as inputs and returns 32-bit Result and 4-bit ALUFlags as
// outputs. We define a 32-bit n_b as the inverse of b and a 33-bit temp to find if the ALU has
// a carry out.
module alu(input  logic [31:0] a, b,
           input  logic [1:0]  ALUControl,
           output logic [31:0] Result,
           output logic [3:0]  ALUFlags);

    logic [31:0] n_b;
    logic [32:0] temp;
    assign n_b = ~b;
    always_comb begin
        case (ALUControl)
            2'b00: begin
                Result = a + b;
                temp = a + b;
            end

            2'b01: begin
                Result = a - b;
                temp = a + n_b + 1;
            end

            2'b10: begin
                Result = a & b;
                temp = 0;
            end

            2'b11: begin
                Result = a | b;
                temp = 0;
            end
        endcase

        // ALUFlags[0] = 1 when the adder results in overflow
        ALUFlags[0] = ~(a[31] ^ b[31] ^ ALUControl[0]) & (a[31] ^ Result[31]) & ~ALUControl[1];
        // ALUFlags[0] = (~Result[31] & a[31] & ~b[31]) | (Result[31] & ~a[31] & ~b[31]);

        // ALUFlags[1] = 1 when the adder produces a carry out
        ALUFlags[1] = temp[32];

        // ALUFlags[2] = 1 when the result is 0
        ALUFlags[2] = Result == 0;

        // ALUFlags[3] = 1 when the result is negative
        ALUFlags[3] = Result[31];
    end
endmodule

// alu_testbench reads the vector file alu.tv and tests all the cases in the file
module alu_testbench();
    logic [31:0] a, b;
    logic [1:0] ALUControl;
    logic [31:0] Result;
    logic [3:0] ALUFlags;
    logic clk;
    logic [103:0] testvectors [1000:0];

    alu dut (.a(a), .b(b), .ALUControl(ALUControl), .Result(Result), .ALUFlags(ALUFlags));

    parameter CLOCK_PERIOD = 100;

    initial clk = 1;
    always begin
        #(CLOCK_PERIOD/2);
        clk = ~clk;
    end

    initial begin
        $readmemh("alu.tv", testvectors);

        for (int i = 0; i < 20; i = i + 1) begin
            {ALUControl, a, b, Result, ALUFlags} = testvectors[i]; @(posedge clk);
        end
    end
endmodule

```
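The flag equations above can be mirrored in a software model to predict the flags for a given pair of operands. The following is a minimal Python sketch under the same conventions (ALUControl 00 add, 01 subtract, 10 AND, 11 OR; flags packed as negative, zero, carry, overflow from bit 3 down to bit 0). The function name `alu` and the packed return format are our own choices for illustration.

```python
MASK32 = 0xFFFFFFFF


def alu(a, b, alu_control):
    """Return (result, flags) for 32-bit operands, flags packed as {N, Z, C, V}."""
    a &= MASK32
    b &= MASK32
    if alu_control == 0b00:
        wide = a + b                    # 33-bit sum
    elif alu_control == 0b01:
        wide = a + (~b & MASK32) + 1    # a - b as a + ~b + 1
    elif alu_control == 0b10:
        wide = a & b
    else:
        wide = a | b
    result = wide & MASK32

    arithmetic = alu_control in (0b00, 0b01)
    carry = (wide >> 32) & 1 if arithmetic else 0

    sign_a, sign_b, sign_r = a >> 31, b >> 31, result >> 31
    if alu_control == 0b00:
        # Adding two values of the same sign gives a different sign.
        overflow = int(sign_a == sign_b and sign_r != sign_a)
    elif alu_control == 0b01:
        # Subtracting values of different signs flips the sign of a.
        overflow = int(sign_a != sign_b and sign_r != sign_a)
    else:
        overflow = 0

    zero = int(result == 0)
    flags = (sign_r << 3) | (zero << 2) | (carry << 1) | overflow
    return result, flags


# CMP R9, R2, R3 with R2 = 0xA and R3 = 0x12 gives a negative result.
result, flags = alu(0xA, 0x12, 0b01)
print(format(result, "08X"), format(flags, "04b"))
```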

### 4) top.sv

```
/* top is a structurally made toplevel module. It consists of 3 instantiations, as well as the signals that link them.
** It is almost totally self-contained, with no outputs and two system inputs: clk and rst. clk represents the clock
** the system runs on, with one instruction being read and executed every cycle. rst is the system reset and should
** be run for at least a cycle when simulating the system.
*/

// clk - system clock
// rst - system reset. Technically unnecessary
module top(
    input logic clk, rst
);

    // processor io signals
    logic [31:0] Instr;
    logic [31:0] ReadData;
    logic [31:0] WriteData;
    logic [31:0] PC, ALUResult;
    logic MemWrite;

    // our single cycle arm processor
    arm processor (
        .clk       (clk),
        .rst       (rst),
        .Instr     (Instr),
        .ReadData  (ReadData),
        .WriteData (WriteData),
        .PC        (PC),
        .ALUResult (ALUResult),
        .MemWrite  (MemWrite)
    );

    // instruction memory
    // contains machine code instructions which instruct the processor on which operations to make
    // effectively a rom because our processor cannot write to it
    imem imemory (
        .addr  (PC),
        .instr (Instr)
    );

    // data memory
    // contains data accessible by the processor through ldr and str commands
    dmem dmemory (
        .clk     (clk),
        .wr_en   (MemWrite),
        .addr    (ALUResult),
        .wr_data (WriteData),
        .rd_data (ReadData)
    );

endmodule

/* testbench is a simulation module which simply instantiates the processor system and runs 50 cycles
** of instructions before terminating. At termination, specific register file values are checked to
** verify the processors' ability to execute the implemented instructions.
*/
module testbench();

    // system signals
    logic clk, rst;

    // generate clock with 100ps clk period
    initial begin
        clk = 1;
        forever #50 clk = ~clk;
    end

    // processor instantiation. Within is the processor as well as imem and dmem
    top cpu (.clk(clk), .rst(rst));

    initial begin
        // start with a basic reset
        rst = 1; @(posedge clk);
        rst <= 0; @(posedge clk);

        // repeat for 50 cycles. Not all 50 are necessary, however a loop at the end of the program will keep anything weird from happening
        repeat(50) @(posedge clk);

        // basic checking to ensure the right final answer is achieved. These DO NOT prove your system works. A more careful look at
        // simulation and code will be made.

        // task 1:
        //assert(cpu.processor.u_reg_file.memory[8] == 32'd11) $display("Task 1 Passed");
        //else
        //  $display("Task 1 Failed");

        // task 2:
        assert(cpu.processor.u_reg_file.memory[8] == 32'd1) $display("Task 2 Passed");
        else
            $display("Task 2 Failed");

    end
endmodule
```

## 5) dmem.sv

```

/* dmem is a more traditional, albeit very uninteresting, random access 64 word x 32 bit per word memory.
** This module is also written in RTL, and likely strongly resembles your own register file except for a
** few minor differences. The first is that there is only a single read port, compared to the register
** file's two read ports. The other difference is that the dmem is byte addressed but word aligned, and
** therefore discards the bottom two bits of the address when doing a read or write.
*/

// clk - system clock, same as the processor
// wr_en - write enable, allows the wr_data to overwrite the 32 bit word stored in memory[addr]
// addr - the location you intend to read from or write to
// wr_data - the 32 bit data word which you intend to write into memory
// rd_data - the data currently stored at memory[addr]
module dmem(
    input  logic        clk,
    input  logic        wr_en,
    input  logic [31:0] addr,
    input  logic [31:0] wr_data,
    output logic [31:0] rd_data
);

    logic [31:0] memory [63:0];

    // asynchronous read
    assign rd_data = memory[addr[31:2]]; // word aligned, drop bottom 2 bits

    // synchronous gated write
    always_ff @(posedge clk) begin
        if (wr_en) memory[addr[31:2]] <= wr_data; // word aligned, drop bottom 2 bits
    end

endmodule

```
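The word-aligned indexing in `dmem` can be illustrated outside the simulator. The Python sketch below is not part of the lab code: `word_index` is a hypothetical helper that mirrors the `addr[31:2]` slice, and `WORDS` simply restates the 64-entry depth of the `memory` array.

```python
WORDS = 64  # depth of the memory array declared in dmem


def word_index(byte_addr: int) -> int:
    """Drop the bottom two address bits, as memory[addr[31:2]] does."""
    return (byte_addr & 0xFFFFFFFF) >> 2


# Byte addresses 0..3 share word 0, 4..7 share word 1, and so on.
for addr in (0, 1, 3, 4, 7, 252):
    idx = word_index(addr)
    assert idx < WORDS  # stay inside the 64-word array
    print(f"byte address {addr:3d} -> word {idx}")
```

Under this mapping, an `STR` followed by an `LDR` with the same base register and offset always touches the same word, which is what the test programs rely on when they store R7 and load it back into R8.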

## 6) imem.sv

```

/* imem is the read only, 64 word x 32 bit per word instruction memory for our processor.
** Its module is written in RTL, and it strongly resembles a ROM (read only memory) or LUT
** (look up table). This memory has no clock, and cannot be written to, but rather it
** asynchronously reads out the word stored in its memory as soon as an address is given.
** The address is byte addressed and the memory is word aligned, meaning that the bottom two
** bits are discarded when looking for the word. One important line to note is the
**     initial $readmemb("memfile.dat", memory);
** which determines the contents of the memory when the system is initialized. You will alter
** this line to use programs given to you as a part of this lab.
*/

// addr - 32 bit address to determine the instruction to return. Note not all 32 bits are used since this
// memory only has 64 words
// instr - 32 bit instruction to be sent to the processor
module imem(
    input  logic [31:0] addr,
    output logic [31:0] instr
);
    logic [31:0] memory [63:0];

    // modify the name and potentially directory prefix of the file within to load the correct program and preprocessing
    initial $readmemb("memfile2.dat", memory);

    assign instr = memory[addr[31:2]]; // word aligned, drops bottom 2 bits

endmodule

```

## 7) FlagsReg.sv

```

// Junchao Zhou, Chenhan Dai
// 04/19/2023
// EE469
// Lab #2, Task 2

// FlagsReg is a 4 bit register that holds the ALU flags
// the stored flags are updated only when Flagwrite is asserted
// the stored flags are read out asynchronously

// clk - system clock, same as the processor
// Flagwrite - write enable, allows write_data to overwrite the 4 bit flag value stored in memory
// write_data - the 4 bit flag value which you intend to write into memory
// read_data - the flag value currently stored in memory
module FlagsReg(input  logic       clk,
                input  logic       Flagwrite,
                input  logic [3:0] write_data,
                output logic [3:0] read_data);

    // flag storage
    logic [3:0] memory;

    // write port
    always_ff @(posedge clk) begin
        if (Flagwrite)
            memory <= write_data;
    end

    // asynchronous read
    assign read_data = memory;

endmodule

```

## 8) memfile.dat

```

// ADD R - 1110_000_0100_0_AAAA_DDDD_0000_0000_BBBB
// ADD I - 1110_001_0100_0_AAAA_DDDD_0000_IIII_IIII
// SUB R - 1110_000_0010_0_AAAA_DDDD_0000_0000_BBBB
// SUB I - 1110_001_0010_0_AAAA_DDDD_0000_IIII_IIII
// AND - 1110_000_0000_0_AAAA_DDDD_0000_0000_BBBB
// ORR - 1110_000_1100_0_AAAA_DDDD_0000_0000_BBBB
// LDR - 1110_010_1100_1_AAAA_DDDD_IIII_IIII_IIII
// STR - 1110_010_1100_0_AAAA_DDDD_IIII_IIII_IIII
// B - 1110_1010_IIII_IIII_IIII_IIII_IIII_IIII

```

| Machine code // label                    | Assembly         | Byte address |
|------------------------------------------|------------------|--------------|
| 11100010100011110000000000000000 // MAIN | ADD R0, R15, #0  | 0            |
| 11100000010000000001000000000000 //      | SUB R1, R0, R0   | 4            |
| 11100010100000010010000000001010 //      | ADD R2, R1, #10  | 8            |
| 11100000100000000011000000000010 //      | ADD R3, R0, R2   | 12           |
| 11100010010000100100000000000011 //      | SUB R4, R2, #3   | 16           |
| 11100000010000110101000000000100 //      | SUB R5, R3, R4   | 20           |
| 11100001100001000110000000000101 //      | ORR R6, R4, R5   | 24           |
| 11100000000001100111000000000101 //      | AND R7, R6, R5   | 28           |
| 11100101100000010111000000000000 //      | STR R7, [R1, #0] | 32           |
| 11101010000000000000000000000001 //      | B SKIP           | 36           |
| 11100101100000010001000000000000 //      | STR R1, [R1, #0] | 40           |
| 11101010000000000000000000000000 //      | B LOOP           | 44           |
| 11100101100100011000000000000000 // SKIP | LDR R8, [R1, #0] | 48           |
| 11101010111111111111111111111110 // LOOP | B LOOP           | 52           |
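The bit patterns in this file follow mechanically from the format comments at the top of the listing, so individual rows can be cross-checked with a few lines of Python. This is a minimal sketch and not part of the lab deliverables: `dp_reg`, `dp_imm` and `branch` are hypothetical helper names. The branch offset is taken relative to PC + 8, which is what the `B LOOP` row at byte address 52 implies, since it branches to itself with an offset of -2.

```python
OPCODES = {"ADD": "0100", "SUB": "0010", "AND": "0000", "ORR": "1100"}


def dp_reg(op: str, rd: int, ra: int, rb: int) -> str:
    """Register form: 1110_000_OOOO_0_AAAA_DDDD_0000_0000_BBBB."""
    return f"1110000{OPCODES[op]}0{ra:04b}{rd:04b}00000000{rb:04b}"


def dp_imm(op: str, rd: int, ra: int, imm: int) -> str:
    """Immediate form: 1110_001_OOOO_0_AAAA_DDDD_0000_IIII_IIII."""
    return f"1110001{OPCODES[op]}0{ra:04b}{rd:04b}0000{imm:08b}"


def branch(pc: int, target: int) -> str:
    """B: 1110_1010 followed by a 24-bit two's complement word offset from PC + 8."""
    offset = (target - (pc + 8)) // 4
    return f"11101010{offset & 0xFFFFFF:024b}"


print(dp_imm("ADD", rd=2, ra=1, imm=10))  # ADD R2, R1, #10
print(dp_reg("SUB", rd=5, ra=3, rb=4))    # SUB R5, R3, R4
print(dp_reg("ORR", rd=6, ra=4, rb=5))    # ORR R6, R4, R5
print(branch(pc=52, target=52))           # B LOOP at byte address 52
```

Every encoded word must be exactly 32 characters long for `$readmemb` to load the instruction the comment describes, so a length check on each line is a cheap way to catch a dropped or duplicated bit.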

## 9) memfile2.dat

```

// ADD R - 1110_000_0100_0_AAAA_DDDD_0000_0000_BBBB
// ADD I - 1110_001_0100_0_AAAA_DDDD_0000_IIII_IIII
// SUB R - 1110_000_0010_0_AAAA_DDDD_0000_0000_BBBB
// SUB I - 1110_001_0010_0_AAAA_DDDD_0000_IIII_IIII
// CMP R - 1110_000_0010_1_AAAA_DDDD_0000_0000_BBBB
// CMP I - 1110_001_0010_1_AAAA_DDDD_0000_IIII_IIII
// AND - 1110_000_0000_0_AAAA_DDDD_0000_0000_BBBB
// ORR - 1110_000_1100_0_AAAA_DDDD_0000_0000_BBBB
// LDR - 1110_010_1100_1_AAAA_DDDD_IIII_IIII_IIII
// STR - 1110_010_1100_0_AAAA_DDDD_IIII_IIII_IIII
// COND_1010_IIII_IIII_IIII_IIII_IIII_IIII

// Equal            - COND = 0000
// Not Equal        - COND = 0001
// Greater or Equal - COND = 1010
// Greater          - COND = 1100
// Less or Equal    - COND = 1101
// Less             - COND = 1011

```

| Machine code // label                          | Assembly         | Byte address |
|------------------------------------------------|------------------|--------------|
| 11100010100011110000000000000000 // MAIN       | ADD R0, R15, #0  | 0            |
| 11100000010000000001000000000000 //            | SUB R1, R0, R0   | 4            |
| 11100010100000010010000000001010 //            | ADD R2, R1, #10  | 8            |
| 11100000100000000011000000000010 //            | ADD R3, R0, R2   | 12           |
| 11100010010000100100000000000011 //            | SUB R4, R2, #3   | 16           |
| 11100000010000110101000000000100 //            | SUB R5, R3, R4   | 20           |
| 11100001100001000110000000000101 //            | ORR R6, R4, R5   | 24           |
| 11100000000001100111000000000101 //            | AND R7, R6, R5   | 28           |
| 11100101100000010111000000000000 //            | STR R7, [R1, #0] | 32           |
| 11101010000000000000000000000001 //            | B SKIP           | 36           |
| 11100101100000010001000000000000 //            | STR R1, [R1, #0] | 40           |
| 11101010000000000000000000000000 //            | B LOOP           | 44           |
| 11100101100100011000000000000000 // SKIP       | LDR R8, [R1, #0] | 48           |
| 11100010010101101001000000001111 // B_START    | CMP R9, R6, #15  | 52           |
| 00011010111111111111111111111101 //            | BNE B_START      | 56           |
| 11100000010101011001000000000100 //            | CMP R9, R5, R4   | 60           |
| 00011010000000000000000000000000 //            | BNE BNE_TESTED   | 64           |
| 11101010111111111111111111111010 //            | B B_START        | 68           |
| 11100000010100101001000000000011 // BNE_TESTED | CMP R9, R2, R3   | 72           |
| 10101010111111111111111111111000 //            | BGE B_START      | 76           |
| 11100000010100111001000000000010 //            | CMP R9, R3, R2   | 80           |
| 10101010000000000000000000000000 //            | BGE BGE_TESTED   | 84           |
| 11101010111111111111111111110101 //            | B B_START        | 88           |
| 11100000010100111001000000000010 // BGE_TESTED | CMP R9, R3, R2   | 92           |
| 11011010111111111111111111110011 //            | BLE B_START      | 96           |
| 11100000010100101001000000000011 //            | CMP R9, R2, R3   | 100          |
| 11011010000000000000000000000000 //            | BLE BLE_TESTED   | 104          |
| 11101010111111111111111111110000 //            | B B_START        | 108          |
| 11100010100000011000000000000001 // BLE_TESTED | ADD R8, R1, #1   | 112          |
| 11101010111111111111111111111110 // LOOP       | B LOOP           | 116          |
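The branch decisions this program is meant to exercise can be checked by hand or with a short software model. The Python sketch below is an illustration only and makes several assumptions: the flags follow the standard ARM N, Z, C, V definitions for a subtraction, the condition codes are the ones listed in the header comments of this file, and the register values (R2 = 10, R3 = 18, R4 = 7, R5 = 11, R6 = 15) come from tracing the first part of the program with R15 read as PC + 8. The names `cmp_flags`, `CONDITIONS` and `taken` are hypothetical and do not correspond to signals in the design.

```python
MASK = 0xFFFFFFFF


def cmp_flags(a: int, b: int):
    """Return (N, Z, C, V) for the 32-bit subtraction a - b."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    n = result >> 31
    z = int(result == 0)
    c = int(a >= b)                      # carry set means no borrow occurred
    v = ((a ^ b) & (a ^ result)) >> 31   # signed overflow of the subtraction
    return n, z, c, v


CONDITIONS = {
    "0000": lambda n, z, c, v: z == 1,             # Equal
    "0001": lambda n, z, c, v: z == 0,             # Not Equal
    "1010": lambda n, z, c, v: n == v,             # Greater or Equal
    "1100": lambda n, z, c, v: z == 0 and n == v,  # Greater
    "1101": lambda n, z, c, v: z == 1 or n != v,   # Less or Equal
    "1011": lambda n, z, c, v: n != v,             # Less
    "1110": lambda n, z, c, v: True,               # Always
}


def taken(cond: str, a: int, b: int) -> bool:
    """Would a branch with this condition be taken after CMP a, b?"""
    return CONDITIONS[cond](*cmp_flags(a, b))


# (description, condition code, first operand, second operand)
checks = [
    ("CMP R6, #15 ; BNE B_START",    "0001", 15, 15),
    ("CMP R5, R4  ; BNE BNE_TESTED", "0001", 11, 7),
    ("CMP R2, R3  ; BGE B_START",    "1010", 10, 18),
    ("CMP R3, R2  ; BGE BGE_TESTED", "1010", 18, 10),
    ("CMP R3, R2  ; BLE B_START",    "1101", 18, 10),
    ("CMP R2, R3  ; BLE BLE_TESTED", "1101", 10, 18),
]
for text, cond, a, b in checks:
    print(f"{text:30s} taken = {taken(cond, a, b)}")
```

In this model, each branch back to `B_START` is not taken and each branch to a `*_TESTED` label is taken, so execution reaches `ADD R8, R1, #1` and leaves 1 in R8, the value the Task 2 assertion in the testbench checks for.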