

# PROJECT: RISC-V CORE PHYSICAL DESIGN

MAIN TITLE: Full-Flow Physical Implementation of the RV32E201X: A 5-Stage Pipelined RISC-V Core.

SUMMARY: This project details the successful **Very Large Scale Integration (VLSI)** implementation of the **RV32E201X**, a custom 32-bit RISC-V processor core. The design employs a classic **5-stage pipeline architecture** (Fetch, Decode, Execute, Memory Access, Write-Back) to maximize instruction throughput and overall performance.

The work covers the entire **ASIC physical design flow**, progressing from Register-Transfer Level (RTL) synthesis through floorplanning, placement, Clock Tree Synthesis (CTS), routing, and physical verification. The final implementation achieved timing closure at a clock frequency of **100 MHz** (10 ns period) with a total cell area of **0.0278 mm<sup>2</sup>** in a **28 nm** process.

## KEY SPECIFICATIONS:

| <u>Specification</u> |   | <u>Detail</u>                                                 |
|----------------------|---|---------------------------------------------------------------|
| ➤ Architecture       | - | Custom RISC-V ISA                                             |
| ➤ Core name          | - | RV32E201X                                                     |
| ➤ Pipeline stages    | - | 5 stages (Fetch, Decode, Execute, Memory Access, Write-Back). |
| ➤ Bus width          | - | 32 bit                                                        |
| ➤ Design Tool        | - | Synopsys DC shell, ICC2 shell                                 |
| ➤ Technology node    | - | 28nm                                                          |

## DESIGN GOALS AND CONSTRAINTS:

- Converting technology-independent RTL into the GDSII manufacturing format.
- Optimizing power, performance, and area (PPA).
- Clearing DRC, setup, and hold violations.
- Reducing congestion by adding the required blockages and keepout margins.

# 1. SYNTHESIS AND NETLIST GENERATION:

- Synthesis Tool used - Synopsys Design Compiler
- Design Flow Steps -
  - Read RTL code
  - SDC constraints
  - Technology Mapping
  - Optimization
  - Meeting Timing Closure
  - PPA Reports
  - Generation of Output files.
- Operating Conditions -
  - Process: Slow-Slow (SS)
  - Voltage: 0.81 V
  - Temperature: -40 °C
- Target Clock Period - 10 ns

## SYNTHESIS OUTPUTS:

- Total cell area : 21675.905537 µm<sup>2</sup>
- Setup Slack : MET
- Total Power : 1.1897 mW
- Files : (.v) netlist file, .sdc file, .ddc file





## AREA REPORT:

**Cell Count**

| Metric                    | Count |
|---------------------------|-------|
| Hierarchical Cell Count:  | 50    |
| Hierarchical Port Count:  | 4366  |
| Leaf Cell Count:          | 20586 |
| Buf/Inv Cell Count:       | 2259  |
| Buf Cell Count:           | 1148  |
| Inv Cell Count:           | 1111  |
| CT Buf/Inv Cell Count:    | 0     |
| Combinational Cell Count: | 16467 |
| Sequential Cell Count:    | 4119  |
| Macro Count:              | 0     |

**Area**

| Metric                 | Area ($\mu\text{m}^2$) |
|------------------------|--------------|
| Combinational Area:    | 12077.477880 |
| Noncombinational Area: | 9598.427657  |
| Buf/Inv Area:          | 713.915994   |
| Total Buffer Area:     | 433.94       |
| Total Inverter Area:   | 279.97       |
| Macro/Black Box Area:  | 0.000000     |
| Net Area:              | 0.000000     |
| Cell Area:             | 21675.905537 |
| Design Area:           | 21675.905537 |
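
The area figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only values copied from the report; the variable names are illustrative, and the µm² unit is assumed from the library.

```python
# Values copied from the synthesis area report (assumed to be in µm²).
combinational = 12077.477880
noncombinational = 9598.427657
buf_inv = 713.915994
cell_area = 21675.905537

# Cell area should equal combinational + noncombinational area.
total = combinational + noncombinational

# Share of the cell area taken by buffers and inverters.
buf_inv_share = buf_inv / cell_area

print(f"sum = {total:.6f} (reported {cell_area:.6f})")
print(f"buf/inv share = {buf_inv_share:.2%}")
```

With no clock tree built yet (CT Buf/Inv count is 0), buffers and inverters take up only a small part of the synthesized area.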

## POWER REPORT:

| Power Group   | Internal Power (mW) | Switching Power (mW) | Leakage Power (nW) | Total Power (mW) | ( % )     | Attrs |
|---------------|---------------------|----------------------|--------------------|------------------|-----------|-------|
| io_pad        | 0.0000         | 0.0000          | 0.0000        | 0.0000      | ( 0.00%)  |       |
| memory        | 0.0000         | 0.0000          | 0.0000        | 0.0000      | ( 0.00%)  |       |
| black_box     | 0.0000         | 0.0000          | 0.0000        | 0.0000      | ( 0.00%)  |       |
| clock_network | 1.1631         | 0.0000          | 0.0000        | 1.1631      | ( 97.77%) | i     |
| register      | 3.8815e-04     | 4.4082e-05      | 432.8919      | 8.6538e-04  | ( 0.07%)  |       |
| sequential    | 0.0000         | 0.0000          | 0.0000        | 0.0000      | ( 0.00%)  |       |
| combinational | 5.5980e-03     | 1.9584e-02      | 506.1636      | 2.5689e-02  | ( 2.16%)  |       |
| Total         | 1.1691 mW      | 1.9628e-02 mW   | 939.0555 nW   | 1.1897 mW   |           |       |
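
The report mixes units: internal, switching, and total power are in mW, while leakage is in nW. As a minimal sketch of how the totals relate (values copied from the table, names illustrative):

```python
# Per-group total power from the synthesis power report, in mW.
group_totals_mw = {
    "clock_network": 1.1631,
    "register": 8.6538e-04,
    "combinational": 2.5689e-02,
}
reported_total_mw = 1.1897

# The group totals should add up to the reported total (within rounding).
total_mw = sum(group_totals_mw.values())
clock_share = group_totals_mw["clock_network"] / reported_total_mw

# Leakage is reported in nW; convert to mW before subtracting.
leakage_mw = 939.0555e-6
dynamic_mw = reported_total_mw - leakage_mw

print(f"sum of groups = {total_mw:.4f} mW")
print(f"clock network share = {clock_share:.2%}")
print(f"dynamic power = {dynamic_mw:.4f} mW")
```

The clock network dominates at this stage, and leakage is under 0.1% of the total at the SS, 0.81 V, -40 °C corner.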

## TIMING REPORTS:

Startpoint: address[1] (input port clocked by vclk)

Endpoint: value\_o[0] (output port clocked by vclk)

Path Group: i2o

Path Type: max

| Des/Clust/Port                             | Wire Load Model | Library                                   |
|--------------------------------------------|-----------------|-------------------------------------------|
| Data_Memory                                | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| CPU                                        | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
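| Registers                                  | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |

| Point                                      | Incr            | Path                                      |
|--------------------------------------------|-----------------|-------------------------------------------|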
| clock vclk (rise edge)                     | 0.00            | 0.00                                      |
| clock network delay (ideal)                | 0.00            | 0.00                                      |
| input external delay                       | 3.00            | 3.00 f                                    |
| address[1] (in)                            | 0.00            | 3.00 f                                    |
| Data_Memory/op_addr[1] (Data_Memory)       | 0.00            | 3.00 f                                    |
| Data_Memory/U1127/ZN (INVDI2BWP40P140HVT)  | 0.03            | 3.03 r                                    |
| Data_Memory/U1129/ZN (NR3D0BWP40P140HVT)   | 0.04            | 3.07 f                                    |
| Data_Memory/U5/ZN (CKND2D1BWP40P140HVT)    | 0.14            | 3.22 r                                    |
| Data_Memory/U379/ZN (OAI2D1BWP40P140HVT)   | 0.11            | 3.33 f                                    |
| Data_Memory/U1083/ZN (AOI22D1BWP40P140HVT) | 0.07            | 3.40 r                                    |
| Data_Memory/U1080/Z (AN4D1BWP40P140HVT)    | 0.12            | 3.52 r                                    |
| Data_Memory/U358/ZN (ND4D1BWP40P140HVT)    | 0.06            | 3.58 f                                    |
| Data_Memory/data_mem_o[8] (Data_Memory)    | 0.00            | 3.58 f                                    |
| U82/Z (AO22D0BWP40P140HVT)                 | 0.09            | 3.67 f                                    |
| U102/ZN (AOI22D1BWP40P140HVT)              | 0.06            | 3.73 r                                    |
| U100/ZN (OAI22D1BWP40P140HVT)              | 0.07            | 3.80 f                                    |
| value_o[0] (out)                           | 0.00            | 3.80 f                                    |
| data arrival time                          |                 | 3.80                                      |
| clock vclk (rise edge)                     | 10.00           | 10.00                                     |
| clock network delay (ideal)                | 0.00            | 10.00                                     |
| output external delay                      | -3.00           | 7.00                                      |
| data required time                         |                 | 7.00                                      |
| data required time                         |                 | 7.00                                      |
| data arrival time                          |                 | -3.80                                     |
| slack (MET)                                |                 | 3.20                                      |

Startpoint: Registers/register\_reg[25][18] (falling edge-triggered flip-flop clocked by clk)

Endpoint: IF\_ID/inst\_o\_reg[0] (rising edge-triggered flip-flop clocked by clk)

Path Group: r2r

Path Type: max

| Des/Clust/Port                                           | Wire Load Model | Library                                   |
|----------------------------------------------------------|-----------------|-------------------------------------------|
| MEM_WB                                                   | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| Data_Memory                                              | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| EX_MEM                                                   | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| ID_EX                                                    | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| Registers                                                | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| IF_ID                                                    | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| Instruction_Memory                                       | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| CPU                                                      | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
| PC                                                       | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |
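| CPU_DW01_cmp6_0                                          | ZeroWireload    | tcbn28hpcplusbwp40p140hvtssg0p81vm40c_ccs |

| Point                                                    | Incr            | Path                                      |
|----------------------------------------------------------|-----------------|-------------------------------------------|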
| clock clk (fall edge)                                    | 5.00            | 5.00                                      |
| clock network delay (ideal)                              | 0.00            | 5.00                                      |
| Registers/register_reg[25][18]/CPN (DFNCND1BWP40P140HVT) | 0.00 #          | 5.00 f                                    |
| Registers/register_reg[25][18]/Q (DFNCND1BWP40P140HVT)   | 0.11            | 5.11 f                                    |
| Registers/U2052/ZN (AOI22D0BWP40P140HVT)                 | 0.05            | 5.16 r                                    |
| Registers/U2053/ZN (ND4D0BWP40P140HVT)                   | 0.08            | 5.24 f                                    |
| Registers/U2054/ZN (NR2D0BWP40P140HVT)                   | 0.06            | 5.30 r                                    |
| Registers/U2055/ZN (OAI22D0BWP40P140HVT)                 | 0.06            | 5.36 f                                    |
| Registers/R\$data_o[18] (Registers)                      | 0.00            | 5.36 f                                    |
| eq_138/A[18] (CPU_DW01_cmp6_0)                           | 0.00            | 5.36 f                                    |
| eq_138/U29/Z (XOR2UD0BWP40P140HVT)                       | 0.08            | 5.44 f                                    |
| eq_138/U28/ZN (NR4D0BWP40P140HVT)                        | 0.06            | 5.50 r                                    |
| eq_138/U27/ZN (ND4D0BWP40P140HVT)                        | 0.08            | 5.58 f                                    |
| eq_138/U3/ZN (NR4D0BWP40P140HVT)                         | 0.06            | 5.64 r                                    |
| eq_138/E0 (CPU_DW01_cmp6_0)                              | 0.00            | 5.64 r                                    |
| U79/Z (CKAN2D1BWP40P140HVT)                              | 0.06            | 5.70 r                                    |
| IF_ID/flush_i (IF_ID)                                    | 0.00            | 5.70 r                                    |
| IF_ID/U4/ZN (IND2D1BWP40P140HVT)                         | 0.07            | 5.77 r                                    |
| IF_ID/U17/ZN (INVDI2BWP40P140HVT)                        | 0.06            | 5.82 f                                    |
| IF_ID/U36/Z (CKAN2D1BWP40P140HVT)                        | 0.13            | 5.96 f                                    |
| IF_ID/U3/ZN (NR2D1BWP40P140HVT)                          | 0.25            | 6.21 r                                    |
| IF_ID/U46/Z (AO22D0BWP40P140HVT)                         | 0.16            | 6.36 r                                    |
| IF_ID/inst_o_reg[0]/D (DFQD2BWP40P140HVT)                | 0.00            | 6.36 r                                    |
| data arrival time                                        |                 | 6.36                                      |
| clock clk (rise edge)                                    | 10.00           | 10.00                                     |
| clock network delay (ideal)                              | 0.00            | 10.00                                     |
| clock uncertainty                                        | -2.00           | 8.00                                      |
| IF_ID/inst_o_reg[0]/CP (DFQD2BWP40P140HVT)               | 0.00            | 8.00 r                                    |
| library setup time                                       | -0.05           | 7.95                                      |
| data required time                                       |                 | 7.95                                      |
| data required time                                       |                 | 7.95                                      |
| data arrival time                                        |                 | -6.36                                     |
| slack (MET)                                              |                 | 1.59                                      |
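
Both slack values follow from the same rule: required time is the capture edge minus every margin that eats into the cycle, and slack is required time minus arrival time. The sketch below reproduces the two synthesis paths from the numbers in the reports; the helper name `required_time` is illustrative.

```python
def required_time(capture_edge_ns, *margins_ns):
    """Capture edge minus every margin that eats into the cycle."""
    return capture_edge_ns - sum(margins_ns)

# i2o path: 3.00 ns output external delay, data arrives at 3.80 ns.
i2o_required = required_time(10.00, 3.00)
i2o_slack = i2o_required - 3.80

# r2r path: 2.00 ns clock uncertainty and 0.05 ns library setup time.
r2r_required = required_time(10.00, 2.00, 0.05)
r2r_slack = r2r_required - 6.36

# The r2r launch flop is falling-edge triggered (launch at 5.00 ns),
# so the logic itself only takes arrival minus the launch edge.
r2r_data_delay = 6.36 - 5.00

print(round(i2o_slack, 2), round(r2r_slack, 2), round(r2r_data_delay, 2))
```

Because the register file launches on the falling edge, the r2r path is a half-cycle path: only 5 ns separate launch and capture before uncertainty and setup time are subtracted.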

# 2. PLACEMENT AND ROUTING (PNR) IMPLEMENTATION:

TOOL USED: Synopsys ICC2 shell

## ➤ 2.1 FLOORPLANNING:

**Objective:** Define the chip layout, core boundary, and placement regions for macros and standard cells.

### Steps Followed:

- Defined core area utilization = 60% and aspect ratio = 1.
- Set the core offset to {1.4 0.9}.
- Imported technology LEF and cell libraries.
- Created boundary cells and input/output ports.
- Verified floorplan utilization and congestion maps.
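
The utilization and aspect-ratio settings above fix the approximate core size. As a rough sketch (not the tool's exact algorithm, which likely also snaps the outline to whole site rows), the core dimensions can be estimated from the synthesized cell area:

```python
import math

cell_area_um2 = 21675.905537   # synthesized cell area from the area report
target_utilization = 0.60      # floorplan setting
aspect_ratio = 1.0             # height / width

# Core area needed so that the cells occupy 60% of it.
core_area_um2 = cell_area_um2 / target_utilization

# For a given aspect ratio: width * height = area, height = ratio * width.
core_width_um = math.sqrt(core_area_um2 / aspect_ratio)
core_height_um = core_width_um * aspect_ratio

print(f"core area  ~ {core_area_um2:.1f} um^2")
print(f"core size  ~ {core_width_um:.1f} x {core_height_um:.1f} um")
```

The estimate of about 36126 µm² is close to the 36077.20 total site-row area that appears in the utilization report after routing.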



## ➤ 2.2 POWER PLANNING:

**Objective:** Create a robust power grid network to ensure proper voltage distribution and minimize IR drop.

### Steps Followed:

- Generated power rings for VDD and VSS around the core boundary.
- Added horizontal and vertical power stripes across multiple metal layers.
- Used create\_power\_stripes command to automate stripe insertion.
- Verified connectivity using check\_pg\_connectivity.
- Performed preliminary IR drop and EM analysis.
- Inserted Tap cells.

| LAYER | PATTERN                   | PITCH ( $\mu\text{m}$ ) | WIDTH ( $\mu\text{m}$ ) | OFFSET ( $\mu\text{m}$ ) |
|-------|---------------------------|-------------------------|-------------------------|--------------------------|
| M1    | Rail Pattern              | -                       | 0.15                    | -                        |
| M5    | Vertical (Core PG mesh)   | 3.0                     | 0.45                    | 0.6                      |
| M6    | Horizontal (Core PG mesh) | 5.0                     | 0.45                    | 0.6                      |
| M7    | Vertical (Core PG mesh)   | 3.0                     | 0.62                    | 0.6                      |
| M8    | Horizontal                | 6.0                     | 1.52                    | 3.2                      |
| M9    | Vertical                  | 7.5                     | 3.0                     | 3.48                     |
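
As a rough cross-check of the mesh parameters in the table, the sketch below estimates how many stripes fit across the core and how much of each layer they occupy. It assumes one stripe per pitch starting at the listed offset, and a hypothetical core edge of 190 µm (an estimate from the 60% utilization target, not a value read from the tool). The real count depends on how the PG strategy alternates VDD and VSS and where the mesh is clipped.

```python
import math

def stripe_count(span_um, pitch_um, offset_um):
    """Stripes that fit when the first sits at `offset` and the rest repeat every `pitch`."""
    if span_um < offset_um:
        return 0
    return math.floor((span_um - offset_um) / pitch_um) + 1

def metal_coverage(width_um, pitch_um):
    """Fraction of the layer occupied by one pitch-repeated stripe pattern."""
    return width_um / pitch_um

CORE_SPAN_UM = 190.0  # assumed core edge length, not taken from the reports

m5_stripes = stripe_count(CORE_SPAN_UM, pitch_um=3.0, offset_um=0.6)
m9_stripes = stripe_count(CORE_SPAN_UM, pitch_um=7.5, offset_um=3.48)

print(f"M5: {m5_stripes} stripes, {metal_coverage(0.45, 3.0):.0%} of the layer")
print(f"M9: {m9_stripes} stripes, {metal_coverage(3.0, 7.5):.0%} of the layer")
```

The upper layers use wider stripes at a coarser pitch, so they carry more current per stripe while the lower mesh layers leave more tracks free for signal routing.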





## ➤ 2.3 PLACEMENT

**Objective:** Place all standard cells within the defined core area while maintaining design density and timing targets.

### Steps Followed:

- Ran **initial placement** (place\_opt) and **legalization** (legalize\_placement).
- Checked congestion and optimized cell spreading.
- Inserted buffers and resized cells for better timing and congestion.





## ➤ 2.4 CLOCK TREE SYNTHESIS (CTS):

**Objective:** Build and balance the clock network to achieve minimal skew and latency.

### Steps Followed:

- Defined clock tree specification (create\_clock\_tree\_spec).
- Performed clock optimization using clock\_opt.
- Used dedicated clock buffer/inverter cells.
- Verified clock skew, latency, and insertion delay.
- Rechecked timing after CTS for setup and hold.





## ➤ 2.5 ROUTING:

**Objective:** Perform signal routing to interconnect all cells while meeting DRC and timing closure.

### Steps Followed:

- Executed **global routing** to estimate path congestion.
- Performed **track assignment** and **detailed routing** using `route_auto` and `route_detail`.
- Resolved routing violations using `check_routes` and `repair_routes`.
- Verified connectivity and layer usage.



## QOR REPORT:

**Area**

| Metric                 | Value     |
|------------------------|-----------|
| Combinational Area:    | 17339.62  |
| Noncombinational Area: | 9590.24   |
| Buf/Inv Area:          | 5737.79   |
| Total Buffer Area:     | 5535.68   |
| Total Inverter Area:   | 202.10    |
| Macro/Black Box Area:  | 0.00      |
| Net Area:              | 0         |
| Net XLength:           | 170792.09 |
| Net YLength:           | 166991.32 |

Cell Area (netlist): 26929.85  
Cell Area (netlist and physical only): 27786.91  
Net Length: 337783.40

**Design Rules**

| Metric                | Count |
|-----------------------|-------|
| Total Number of Nets: | 24381 |
| Nets with Violations: | 0     |
| Max Trans Violations: | 0     |
| Max Cap Violations:   | 0     |


## TIMING REPORT:

```
*****
Report : timing
  -path_type full
  -delay_type max
  -max_paths 1
  -report_by design
Design : CPU
Version: T-2022.03-SP2-1
Date   : Thu Oct  2 09:20:15 2025
*****
Information: Timer using 'SI, Timing Window Analysis'. (TIM-050)

Startpoint: Registers/register_reg[16][4] (falling edge-triggered flip-flop clocked by clk)
Endpoint: IF_ID/inst_o_reg[13] (rising edge-triggered flip-flop clocked by clk)
Mode: func
Corner: ffg0p99v125c_rcb
Scenario: func_hold_ffg0p99v125c_rcb
Path Group: r2r
Path Type: max

Point           Incr      Path
-----
clock clk (fall edge)      5.00    5.00
clock network delay (propagated)  0.16    5.16
Registers/register_reg[16][4]/CPN (DFNCND1BWP40P140HVT)  0.00    5.16 f
Registers/register_reg[16][4]/Q (DFNCND1BWP40P140HVT)  0.08    5.24 f
Registers/U1725/ZN (AOI22D0BWP40P140HVT)  0.03    5.28 r
Registers/U1726/ZN (ND4D0BWP40P140HVT)  0.21    5.49 f
Registers/U1732/ZN (NR2D0BWP40P140HVT)  0.28    5.76 r
Registers/U1733/ZN (OAI22D0BWP40P140HVT)  0.12    5.89 f
eq_138/U6/ZN (XNR2UD0BWP40P140HVT)  0.08    5.97 r
eq_138/U4/ZN (ND4D0BWP40P140HVT)  0.10    6.07 f
eq_138/U3/ZN (NR4D0BWP40P140HVT)  0.13    6.20 r
U79/Z (AN2D0BWP40P140HVT)  0.09    6.29 r
IF_ID/U4/ZN (IND2D0BWP40P140HVT)  0.09    6.38 r
IF_ID/ctmTdsLR_1_2192/ZN (INR2D1BWP40P140HVT)  0.16    6.54 f
IF_ID/U3/ZN (NR2D0BWP40P140HVT)  0.36    6.90 r
IF_ID/U30/Z (AO22D0BWP40P140HVT)  0.07    6.96 r
IF_ID/cpt_h_inst_9716/Z (DEL075MD1BWP40P140HVT)  0.11    7.07 r
IF_ID/inst_o_reg[13]/D (DFQD1BWP40P140HVT)  0.00    7.07 r
data arrival time          7.07

clock clk (rise edge)      10.00   10.00
clock network delay (propagated)  0.25    10.25
IF_ID/inst_o_reg[13]/CP (DFQD1BWP40P140HVT)  0.00    10.25 r
clock uncertainty        -1.20    9.05
library setup time         -0.01    9.04
data required time         9.04
data arrival time          -7.07

slack (MET)                1.97
```
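
After CTS the clock is propagated, so the launch and capture latencies enter the calculation. A minimal sketch of how the reported slack is built up, using only numbers from the report above (variable names are illustrative):

```python
# Values from the post-route timing report, in ns.
launch_edge, launch_latency = 5.00, 0.16     # falling edge + propagated clock
capture_edge, capture_latency = 10.00, 0.25  # rising edge + propagated clock
uncertainty, setup_time = 1.20, 0.01
arrival = 7.07

# Required time: capture clock arrival minus uncertainty and setup time.
required = capture_edge + capture_latency - uncertainty - setup_time
slack = required - arrival

# Skew between capture and launch clock pins; positive skew relaxes setup.
skew = capture_latency - launch_latency

# Delay of the data path alone, measured from the launch clock pin.
data_delay = arrival - (launch_edge + launch_latency)

print(round(required, 2), round(slack, 2), round(skew, 2), round(data_delay, 2))
```

The data path itself is slower than at synthesis (1.91 ns against 1.36 ns) because wire parasitics and a hold-fix delay cell are now included, but the smaller clock uncertainty still leaves more slack.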

## UTILIZATION REPORT:

```
*****
Report : report_utilization
Design : CPU
Version: T-2022.03-SP2-1
Date   : Thu Oct 2 09:21:13 2025
*****
Utilization Ratio:          0.7464
Utilization options:
- Area calculation based on: site_row of block CPU
- Categories of objects excluded: hard_macros macro_keepouts soft_macros io_cells hard_blockages
Total Area:                36077.2020
Total Capacity Area:        36077.2020
Total Area of cells:        26928.5940
Area of excluded objects:
- hard_macros      : 0.0000
- macro_keepouts   : 0.0000
- soft_macros      : 0.0000
- io_cells         : 0.0000
- hard_blockages  : 0.0000
Utilization of site-rows with:
- Site 'unit':           0.7464
0.7464
```
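
The utilization ratio is simply the cell area divided by the available site-row area. A short check with the values from the report:

```python
# Values from the utilization report (area units as reported by the tool).
total_site_row_area = 36077.2020
cell_area = 26928.5940

utilization = cell_area / total_site_row_area
free_area = total_site_row_area - cell_area

print(f"utilization = {utilization:.4f}")
print(f"free site-row area = {free_area:.3f}")
```

Utilization rises from the 60% floorplan target to about 74.6% because placement, CTS, and hold fixing add buffers and delay cells to the netlist.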

## CONSTRAINTS REPORT:

```
*****
Report : constraint
-all_violators
Design : CPU
Version: T-2022.03-SP2-1
Date   : Thu Oct 2 09:20:55 2025
*****
late_timing
-----
Information: Timer using 'SI, Timing Window Analysis'. (TIM-050)
Endpoint      Path Delay    Path Required     CRP    Slack Group Scenario
-----
No paths.

early_timing
-----
Information: Timer using 'SI, Timing Window Analysis'. (TIM-050)
Endpoint      Path Delay    Path Required     CRP    Slack Group Scenario
-----
No paths.

Mode: func Corner: ffg0p99v125c_rcb
Scenario: func_hold_ffg0p99v125c_rcb
-----
Number of max_transition violation(s): 0

Mode: func Corner: ffg0p99v125c_rcb
Scenario: func_hold_ffg0p99v125c_rcb
-----
Number of max_capacitance violation(s): 0

Mode: func Corner: ffg0p99v125c_rcb
Scenario: func_hold_ffg0p99v125c_rcb
-----
Number of min_capacitance violation(s): 0

Total number of violation(s): 0
*****
```

## POWER REPORT:

Cell Internal Power = 1.79e+06 nW ( 79.8%)  
Net Switching Power = 4.52e+05 nW ( 20.2%)  
Total Dynamic Power = 2.24e+06 nW (100.0%)

Cell Leakage Power = 2.91e+05 nW

### Attributes

u - User defined power group  
i - Includes clock pin internal power

| Power Group   | Internal Power | Switching Power | Leakage Power | Total Power | ( % )    | Attrs |
|---------------|----------------|-----------------|---------------|-------------|----------|-------|
| io_pad        | 0.00e+00       | 0.00e+00        | 0.00e+00      | 0.00e+00    | ( 0.0%)  |       |
| memory        | 0.00e+00       | 0.00e+00        | 0.00e+00      | 0.00e+00    | ( 0.0%)  |       |
| black_box     | 0.00e+00       | 0.00e+00        | 0.00e+00      | 0.00e+00    | ( 0.0%)  |       |
| clock_network | 1.78e+06       | 4.01e+05        | 2.75e+03      | 2.18e+06    | ( 86.4%) | i     |
| register      | 1.03e+03       | 1.41e+02        | 1.12e+05      | 1.13e+05    | ( 4.5%)  |       |
| sequential    | 0.00e+00       | 0.00e+00        | 0.00e+00      | 0.00e+00    | ( 0.0%)  |       |
| combinational | 4.51e+03       | 5.10e+04        | 1.76e+05      | 2.32e+05    | ( 9.2%)  |       |
| Total         | 1.79e+06 nW    | 4.52e+05 nW     | 2.91e+05 nW   | 2.53e+06 nW |          |       |
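
The summary lines and the table are related as follows: dynamic power is internal plus switching power, and total power adds leakage on top. A small sketch with the reported (rounded) values:

```python
# Totals from the post-route power report, in nW.
internal_nw = 1.79e6
switching_nw = 4.52e5
leakage_nw = 2.91e5

dynamic_nw = internal_nw + switching_nw
total_nw = dynamic_nw + leakage_nw

internal_share = internal_nw / dynamic_nw
leakage_share = leakage_nw / total_nw

print(f"dynamic = {dynamic_nw / 1e6:.2f} mW, total = {total_nw / 1e6:.2f} mW")
print(f"internal share of dynamic = {internal_share:.1%}")
print(f"leakage share of total = {leakage_share:.1%}")
```

Leakage is far more visible here than at synthesis because this report is taken at the fast corner (ffg, 0.99 V, 125 °C).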

## ➤ 2.6 PROBLEMS RESOLVED: GLOBAL ROUTING CONGESTION:



Fix: Created routing blockages on layers M2 and M4.





## FINAL OUTCOMES:

| CATEGORY    | METRIC                      | SYNTHESIS | PNR      |
|-------------|-----------------------------|-----------|----------|
| QoR (Area)  | Cell area ($\mu\text{m}^2$) | 21675.91  | 27786.91 |
| Utilization | %                           | -         | 74.64    |
| Power       | Total (mW)                  | 1.1897    | 2.53     |
|             | Dynamic (mW)                | 1.1888    | 2.24     |
|             | Leakage (nW)                | 939.06    | 2.91e+05 |
| Timing      | R2R setup slack (ns)        | +1.59     | +1.97    |
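
To put the synthesis and PNR columns side by side, the sketch below computes the relative change in area and total power. It uses the cell areas from the two area reports and the total power from the two power reports (2.53e+06 nW after routing). The two power numbers come from different corners, so this is an indication rather than a like-for-like comparison.

```python
def pct_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0

area_growth = pct_change(21675.91, 27786.91)  # cell area, µm²
power_growth = pct_change(1.1897, 2.53)       # total power, mW

print(f"area:  {area_growth:+.1f}%")
print(f"power: {power_growth:+.1f}%")
```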