

## **HW3\_Pipelined THUMB CPU**

Handout: 2025/11/03

Due: 2025/11/17

The course website provides the RTL code of a pipelined microprocessor that executes 16-bit THUMB instructions. The table below lists the THUMB instruction encodings.

| Instruction classes (indexed by <i>op</i>) | Encoding, bit 15 down to bit 0 (bit range in brackets) |
|--------------------------------------------|--------------------------------------------------------|
| LSL, LSR | 0000 [15:12], <i>op</i> [11], <i>immed5</i> [10:6], <i>Lm</i> [5:3], <i>Ld</i> [2:0] |
| ASR | 00010 [15:11], <i>immed5</i> [10:6], <i>Lm</i> [5:3], <i>Ld</i> [2:0] |
| ADD, SUB | 000110 [15:10], <i>op</i> [9], <i>Lm</i> [8:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
| ADD, SUB | 000111 [15:10], <i>op</i> [9], <i>immed3</i> [8:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
| MOV, CMP | 0010 [15:12], <i>op</i> [11], <i>Ld/Ln</i> [10:8], <i>immed8</i> [7:0] |
| ADD, SUB | 0011 [15:12], <i>op</i> [11], <i>Ld</i> [10:8], <i>immed8</i> [7:0] |
| AND, EOR, LSL, LSR | 01000000 [15:8], <i>op</i> [7:6], <i>Lm/Ls</i> [5:3], <i>Ld</i> [2:0] |
| ASR, ADC, SBC, ROR | 01000001 [15:8], <i>op</i> [7:6], <i>Lm/Ls</i> [5:3], <i>Ld</i> [2:0] |
| TST, NEG, CMP, CMN | 01000010 [15:8], <i>op</i> [7:6], <i>Lm</i> [5:3], <i>Ld/Ln</i> [2:0] |
| ORR, MUL, BIC, MVN | 01000011 [15:8], <i>op</i> [7:6], <i>Lm</i> [5:3], <i>Ld</i> [2:0] |
| CPY Ld, Lm | 0100011000 [15:6], <i>Lm</i> [5:3], <i>Ld</i> [2:0] |
| ADD, MOV Ld, Hm | 010001 [15:10], <i>op</i> [9], 001 [8:6], <i>Hm &amp; 7</i> [5:3], <i>Ld</i> [2:0] |
| ADD, MOV Hd, Lm | 010001 [15:10], <i>op</i> [9], 010 [8:6], <i>Lm</i> [5:3], <i>Hd &amp; 7</i> [2:0] |
| ADD, MOV Hd, Hm | 010001 [15:10], <i>op</i> [9], 011 [8:6], <i>Hm &amp; 7</i> [5:3], <i>Hd &amp; 7</i> [2:0] |
| CMP | 0100010101 [15:6], <i>Hm &amp; 7</i> [5:3], <i>Ln</i> [2:0] |
| CMP | 0100010110 [15:6], <i>Lm</i> [5:3], <i>Hn &amp; 7</i> [2:0] |
| CMP | 0100010111 [15:6], <i>Hm &amp; 7</i> [5:3], <i>Hn &amp; 7</i> [2:0] |
| BX, BLX | 01000111 [15:8], <i>op</i> [7], <i>Rm</i> [6:3], 000 [2:0] |
| LDR Ld, [pc, #immed*4] | 01001 [15:11], <i>Ld</i> [10:8], <i>immed8</i> [7:0] |
| STR, STRH, STRB, LDRSB pre | 01010 [15:11], <i>op</i> [10:9], <i>Lm</i> [8:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
| LDR, LDRH, LDRB, LDRSH pre | 01011 [15:11], <i>op</i> [10:9], <i>Lm</i> [8:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
| STR, LDR Ld, [Ln, #immed*4] | 0110 [15:12], <i>op</i> [11], <i>immed5</i> [10:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
| STRB, LDRB Ld, [Ln, #immed] | 0111 [15:12], <i>op</i> [11], <i>immed5</i> [10:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
| STRH, LDRH Ld, [Ln, #immed*2] | 1000 [15:12], <i>op</i> [11], <i>immed5</i> [10:6], <i>Ln</i> [5:3], <i>Ld</i> [2:0] |
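As a cross-check when tracing the RTL decoder, the encodings above can be modelled in a few lines of software. The following Python sketch is not part of the course materials: the helper names `bits` and `decode` are hypothetical, and it covers only three instruction classes from the table (shift by immediate, ADD/SUB register, MOV/CMP immediate). It assumes the usual field widths, so *immed5* is 5 bits, *immed8* is 8 bits, and each low register field is 3 bits.

```python
def bits(word, hi, lo):
    """Return word[hi:lo] (inclusive bit range) as an unsigned integer."""
    return (word >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode(word):
    """Decode a 16-bit THUMB halfword for three classes from the table.

    Hypothetical helper for illustration; returns None for every
    instruction class it does not cover.
    """
    word &= 0xFFFF
    # LSL / LSR / ASR by immediate: bits 15:13 = 000, bits 12:11 select the shift.
    if bits(word, 15, 13) == 0b000 and bits(word, 12, 11) != 0b11:
        return {
            "mnemonic": ("LSL", "LSR", "ASR")[bits(word, 12, 11)],
            "immed5": bits(word, 10, 6),
            "Lm": bits(word, 5, 3),
            "Ld": bits(word, 2, 0),
        }
    # ADD / SUB with register operand: bits 15:10 = 000110, op is bit 9.
    if bits(word, 15, 10) == 0b000110:
        return {
            "mnemonic": ("ADD", "SUB")[bits(word, 9, 9)],
            "Lm": bits(word, 8, 6),
            "Ln": bits(word, 5, 3),
            "Ld": bits(word, 2, 0),
        }
    # MOV / CMP with 8-bit immediate: bits 15:12 = 0010, op is bit 11.
    if bits(word, 15, 12) == 0b0010:
        return {
            "mnemonic": ("MOV", "CMP")[bits(word, 11, 11)],
            "Ld_Ln": bits(word, 10, 8),
            "immed8": bits(word, 7, 0),
        }
    return None


if __name__ == "__main__":
    for halfword in (0x00D1, 0x1888, 0x232A):
        print(f"{halfword:#06x}", decode(halfword))
```

Comparing such a software model against the decoded fields observed in the testbench waveform can help localize a decoding bug to a specific instruction class.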

1. Trace the given Verilog RTL code of the 16-bit pipelined THUMB processor, and use the given testbench tb\_thumb.v to verify it. Modify the code as needed so that it works in both RTL and gate-level simulation.
2. In the original Verilog, all four pipeline stages (IF, ID, EX, WB) are in a single module. Restructure the code so that each pipeline stage is a separate module; the critical path delay of each stage can then be read directly from the synthesis results of the Synopsys Design Compiler (DC).
3. Generate three synthesis results with different design constraints in Synopsys DC: area-optimized, delay-optimized, and in-between. Compare the critical path delay, area, and power of the three results.
4. Measure the critical path delay of each pipeline stage in every synthesis result, that is, four stage delays for each of the three results (area-optimized, delay-optimized, and in-between).
5. Use PrimeTime (PT) to measure the critical path delay and power by providing a sequence of inputs. Compare the delay and power with those obtained from DC. Fill in the following comparison table.

| Constraint | Area | 1<sup>st</sup> stage delay (DC) | 2<sup>nd</sup> stage delay (DC) | 3<sup>rd</sup> stage delay (DC) | 4<sup>th</sup> stage delay (DC) | Critical path delay: DC (PT) | Power (DC) | Power (PT) |
|------------|------|---------------------------------|---------------------------------|---------------------------------|---------------------------------|------------------------------|------------|------------|
| delay-opt  |      |                                 |                                 |                                 |                                 |                              |            |            |
| area-opt   |      |                                 |                                 |                                 |                                 |                              |            |            |
| in-between |      |                                 |                                 |                                 |                                 |                              |            |            |
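The per-stage delays and the critical path column in the table are related: assuming every timing path starts and ends at a pipeline register, the slowest stage sets the minimum clock period. The Python sketch below illustrates that bookkeeping and the DC-versus-PT comparison. The function names and all delay values are hypothetical placeholders, not measured results, and the model ignores setup time and clock uncertainty.

```python
def summarize(stage_delays_ns):
    """Return (slowest stage, its delay in ns, maximum clock frequency in MHz).

    Assumes the clock period is bounded only by the slowest stage.
    """
    stage, delay = max(stage_delays_ns.items(), key=lambda item: item[1])
    return stage, delay, 1000.0 / delay


def percent_diff(dc_value, pt_value):
    """Relative difference of the PT value with respect to the DC value, in percent."""
    return 100.0 * (pt_value - dc_value) / dc_value


if __name__ == "__main__":
    # Placeholder numbers for illustration only; replace with values from your reports.
    example = {"IF": 1.8, "ID": 2.4, "EX": 3.1, "WB": 1.2}
    stage, delay, fmax = summarize(example)
    print(f"critical stage: {stage}, delay: {delay} ns, fmax: {fmax:.1f} MHz")
    print(f"PT vs DC delay difference: {percent_diff(3.1, 3.3):.1f} %")
```

The same `percent_diff` helper can be applied to the two power columns when discussing why the DC and PT numbers differ.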

6. Perform automatic placement and routing for the THUMB CPU. Mark the four pipeline stages in the layout view.

## References:

1. S. Lee, *Advanced Digital Logic Design Using Verilog, State Machines, and Synthesis for FPGAs*, Nelson, 2006. (Chap. 9: Verilog code of the pipelined 16-bit THUMB CPU; Appendix A: THUMB instructions)

## Report Requirement

Submit your files in the following format (100%):

HDL\_HW3\_MXXXXXXXXXX.zip / HDL\_HW3\_BXXXXXXXXXX.zip

-rtl folder

--thumb\_pipe.v (corrected and split into 4 pipeline stages: IF, ID, EX, WB) (10%)

-gate folder (synthesis results of the thumb\_pipe.v above) (20%)

--area folder

---thumb\_pipe\_area.v

---thumb\_pipe\_area.ddc

---thumb\_pipe\_area.sdf

---thumb\_pipe\_area.sdc

--delay folder

---thumb\_pipe\_delay.v

---thumb\_pipe\_delay.ddc

---thumb\_pipe\_delay.sdf

---thumb\_pipe\_delay.sdc

--between folder

---thumb\_pipe\_between.v

---thumb\_pipe\_between.ddc

---thumb\_pipe\_between.sdf

---thumb\_pipe\_between.sdc

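-Word report

--Part 1:

---one testbench waveform screenshot of the RTL-level design split into 4 pipeline stages (10%)

---one testbench waveform screenshot of the gate-level design split into 4 pipeline stages (10%)

---explanation of the waveforms above (10%)

--Part 2:

---complete the comparison table with the synthesis area, the critical path delay of each pipeline stage, and the power reported by both Design Compiler and PrimeTime (20%)

---for the mid (in-between) synthesis result in the table, include 7 report screenshots in total: area (x1), delay of each stage (x4), and power from DC and PT (x2) (15%)

--Part 3:

---reflections (5%)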
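Before zipping, it may help to confirm that the folder layout listed above is complete. The Python sketch below is one way to do that. The function names `expected_files` and `missing_files` are hypothetical, and the check assumes the rtl and gate folders sit directly under the submission root. The Word report is not checked because its file name is not specified.

```python
from pathlib import Path

VARIANTS = ("area", "delay", "between")
EXTENSIONS = (".v", ".ddc", ".sdf", ".sdc")


def expected_files():
    """List the relative paths required by the submission format."""
    files = [Path("rtl") / "thumb_pipe.v"]
    for variant in VARIANTS:
        for ext in EXTENSIONS:
            files.append(Path("gate") / variant / f"thumb_pipe_{variant}{ext}")
    return files


def missing_files(root):
    """Return the required relative paths that do not exist under root."""
    root = Path(root)
    return [p.as_posix() for p in expected_files() if not (root / p).is_file()]


if __name__ == "__main__":
    import sys

    # Check the directory given on the command line, or the current one.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_files(target)
    if missing:
        print("Missing files:")
        for name in missing:
            print("  " + name)
    else:
        print("All required files are present.")
```

Running the script with the unzipped submission directory as its argument prints any file that is still missing.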