



# Overview of the Back-End Design of Digital Chips

Lida Xu

Institute of Computing Technology, Chinese Academy of Sciences

2025.12.27



---

CONTENTS

---

1. Overview of the Back-end

---

2. Logical synthesis

---

3. Timing analysis

---

4. Place & Route

---

5. Physical verification

---

# 1. Overview of the Back-end

■ **Digital chip design**: Translating functional requirements into manufacturable silicon circuits

■ Front-end design: Design and verification of RTL

■ **Back-end design: From RTL (netlist) to GDS generation**




## ■ Back-end design process

- Logical synthesis (SYN)
- Formal verification (FM)
- Static timing analysis (STA)
- Place & Route (PR)
  - Floorplan
  - Place
  - CTS
  - Route
- Physical verification (PV)
  - ERC
  - DRC
  - LVS
- Signoff
- Tape out




## ■ Back-end design EDA tools




## ■ ECOS (EDA, Chip, One-Student-One-Chip, System)

**SoC Template**  
Select SoC template from the IP Market

**Float Panel System**  
Supports dragging, resizing, and customized data

**Advanced extensible Layout Editor**

**Cloud-based Agile EDA Platform**  
**iEDA Inside**  
“An open-source EDA infrastructure”  
(support ICS55/SKY130/IHP130)

**ECOS Studio**  
“Componentization of Data”

**Dashboard**

**Merge to Latest Shuttle**

**Real-time Collaboration\***  
Flow → Architecture, RTL/Verify, Backend (Pipeline Mode)  
\*: this feature will be released in 2025Q4


- Supports one-stop RISC-V processor SoC chip design and tape-out using open-source EDA tools, open-source IP, and open-source PDK

Submit Design  
(Processor core/IP/...)

```
// Stub module: every output is tied to a constant.
module ysyx_23060171_idu(
    input  [6:0] opcode,
    input  [2:0] f3,
    input  [6:0] f7,
    output [2:0] alucontrol,
    output       Regwrite,
    output [2:0] immtype
);
    assign alucontrol = 3'b000;
    assign Regwrite   = 1'b1;
    assign immtype    = 3'b000;
endmodule
```



Home delivery  
(Chip + Board)




■ Supports end-to-end layout design, from RTL to GDSII, of 100,000-instance chips on the ICsprout 55nm process



Home interface



Layout render effect



Merge interface




■ Two chips designed on the ICsprout 55nm process have been fabricated and are running successfully



浙江创芯

Boot log of the ECOS chip (terminal screenshot showing initialization messages and the memory map)

Design: An RV32IMAC SoC (PSRAM, QSPI, UART, I2C, PWM, TIMER, RNG).  
Size: 4 mm<sup>2</sup> (128 KB OCM, no PLL)

Freq: 100 MHz (external clock bypass)  
Gates: 1.517M (73,009 cells); Power: 115.4 mW (dynamic), 0.42 mW (static)


## 2. Synthesis

### ■ Synthesis (Syn)

- Synthesis maps the front-end RTL code onto a specific process library, applies the design constraints, and optimizes the logic to produce a gate-level netlist. In short, it is the step from **RTL** to **gate-level netlist**




### ■ Synthesis: RTL to gate-level netlist generation

#### ■ Input files:

- RTL code (Verilog, VHDL)
- Library files (.lib/.db)
- Constraint files (.sdc)



Synthesis flow: RTL + lib, sdc → Netlist

- Translation
  - Translate RTL to GTECH
- Optimization
  - Structural optimization
  - Logic optimization
- Mapping
  - Optimization after mapping


### ■ Comparison of the RTL and Netlist files

```
module asic_top(
    input  clock,
    input  reset,
    input  in,
    output out
);

reg reg_in;
reg reg_out;

// Two flip-flops with an asynchronous, active-high reset.
always @(posedge clock or posedge reset)
begin
    if (reset) begin
        reg_in  <= 1'b0;
        reg_out <= 1'b0;
    end
    else begin
        reg_in  <= 1'b1;                // set on the first clock edge after reset
        reg_out <= reg_in ? in : 1'b0;  // equivalent to reg_in & in
    end
end

assign out = reg_out;

endmodule
```

RTL

VS

```
module asic_top (
    clock,
    in,
    out,
    reset
);

input clock ;
input in ;
output out ;
input reset ;

wire _0_ ;
wire _1_ ;
wire _2_ ;
wire clock ;
wire in ;
wire out ;
wire out_reg_p_D ;
wire reg_in ;
wire reset ;

AND2X16H7L _3_ ( .A(reg_in ), .B(in ), .Y(out_reg_p_D ) );
INVX0P5H7L _4_ ( .A(reset ), .Y(_0_ ) );
INVX0P5H7L _5_ ( .A(reset ), .Y(_1_ ) );
TIEHIH7H _6_ ( .Y(_2_ ) );
DFFRQX2H7L out_reg_p ( .CK(clock ), .D(out_reg_p_D ), .Q(out ), .RN(_0_ ) );
DFFRQX2H7L reg_in_reg_p ( .CK(clock ), .D(_2_ ), .Q(reg_in ), .RN(_1_ ) );

endmodule
```

standard cell

netlist
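As a quick sanity check on the comparison above, the RTL expression `reg_in ? in : 1'b0` and the `AND2X16H7L` cell that drives `out_reg_p_D` in the netlist can be compared exhaustively. This is a minimal Python sketch and not a formal verification flow; the function names are illustrative and do not come from any tool.

```python
from itertools import product

def rtl_next_reg_out(reg_in: int, din: int) -> int:
    """Next-state function from the RTL: reg_in ? in : 1'b0."""
    return din if reg_in else 0

def netlist_next_reg_out(reg_in: int, din: int) -> int:
    """Next-state function from the netlist: a 2-input AND gate."""
    return reg_in & din

def equivalent() -> bool:
    """Compare both functions on every 1-bit input combination."""
    return all(
        rtl_next_reg_out(a, b) == netlist_next_reg_out(a, b)
        for a, b in product((0, 1), repeat=2)
    )

print(equivalent())  # True
```

Exhaustive comparison only works for tiny functions like this one; it is meant to show what "same logic, different representation" means for this example.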


### ■ PVT (Process, Voltage, Temperature)

- The process corner (speed), supply voltage, and temperature all affect the timing analysis results



- BC (best case): ff\_0p88v\_m40c
  - fast process
  - high supply voltage
  - low temperature
- Typical: typical\_0p8v\_25c
  - typical process
  - nominal supply voltage
  - nominal temperature
- WC (worst case): ss\_0p72v\_125c
  - slow process
  - low supply voltage
  - high temperature
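The corner names above encode the three PVT values directly. The sketch below decodes them in Python, assuming the pattern `<process>_<voltage>v_<temperature>c`, with `p` standing for the decimal point and a leading `m` meaning a negative temperature. That pattern is inferred from the three names on this slide and is not a general library convention.

```python
import re

# Assumed naming pattern, inferred from the three corner names on the slide.
CORNER_RE = re.compile(r"^(?P<process>[a-z]+)_(?P<volt>\d+p\d+)v_(?P<temp>m?\d+)c$")

def parse_corner(name: str) -> dict:
    """Split a corner name into process, supply voltage (V) and temperature (C)."""
    match = CORNER_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized corner name: {name}")
    volt = float(match["volt"].replace("p", "."))
    temp_text = match["temp"]
    temp = -int(temp_text[1:]) if temp_text.startswith("m") else int(temp_text)
    return {"process": match["process"], "voltage_v": volt, "temperature_c": temp}

for corner in ("ff_0p88v_m40c", "typical_0p8v_25c", "ss_0p72v_125c"):
    print(parse_corner(corner))
```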


### ■ Synthesis report

- asic\_top.area.rpt
- asic\_top.check\_design.pre.rpt
- asic\_top.check\_design.rpt
- asic\_top.check\_timing.rpt
- asic\_top.clock\_gating.rpt
- asic\_top.clock.rpt
- asic\_top.constraint.rpt
- asic\_top.dont\_touch.rpt
- asic\_top.drc.rpt
- asic\_top.power.rpt
- asic\_top.qor.rpt
- asic\_top.registers.rpt
- **asic\_top.statistics.rpt**
- asic\_top.timing.rpt



```
#-----
# Area                 Area (μm2)
#-----
total      std        mem  ipio       sub_harden
4219061.4  2221580.0  0.0  1997481.4  0.0

#-----
# Cell Count           Design scale
#-----
total   std     mem  ipio  sub_harden
182428  182320  0    108   0

#-----
# Vt/C1 ratio(area)    Device type
#-----
SVT40  LVT40
97.59  2.41

#-----
# Timing               Timing
#-----
group                           org_freq  over_freq  wns     tns    num
CLK_u0_chiplink_rx_clk_pad_PAD  25.0      38.5       15.714  0.000  0
CLK_u0_clk_XC                   25.0      38.5       2.283   0.000  0
CLK_u0_pll_CLK_OUT              100.0     153.8      0.000   0.000  0
CLK_u1_clk_XC                   100.0     153.8      0.006   0.000  0
CLK_u1_pll_CLK_OUT              100.0     153.8      0.006   0.000  0
```


## ■ Static Timing Analysis (STA)

■ STA verifies circuit timing by checking the timing of every path in the design

- Divide the design into timing paths
- Calculate the delay of each path
- Check whether the delay of each path meets the design constraints
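The three steps above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the timing graph, pin names and delays are hypothetical, only the latest arrival time is tracked, and real STA tools also model slew, separate rise and fall delays, and much more. It needs Python 3.9 or later for `graphlib`.

```python
from graphlib import TopologicalSorter

# Hypothetical timing graph: each node maps to its fan-in edges (driver, delay in ns).
FANIN = {
    "ff1/Q": [],
    "u1/Y": [("ff1/Q", 0.05)],
    "u2/Y": [("ff1/Q", 0.08), ("u1/Y", 0.04)],
    "ff2/D": [("u2/Y", 0.03)],
}

def arrival_times(fanin, start=0.0):
    """Propagate the latest arrival time to every node in topological order."""
    deps = {node: [drv for drv, _ in edges] for node, edges in fanin.items()}
    arrival = {}
    for node in TopologicalSorter(deps).static_order():
        arrival[node] = max((arrival[drv] + d for drv, d in fanin[node]), default=start)
    return arrival

def check(fanin, endpoint, required):
    """Return (arrival, slack) at an endpoint for a given required time."""
    at = arrival_times(fanin)[endpoint]
    return at, required - at

at, slack = check(FANIN, "ff2/D", required=1.0)
print(f"arrival = {at:.2f} ns, slack = {slack:.2f} ns")
```

Propagating arrival times through the graph avoids enumerating every path explicitly, which is why the sketch walks nodes in topological order rather than listing paths one by one.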



### 3. Timing Analysis

#### ■ Setup Time ( $T_{\text{setup}}$ )

■  $T_{\text{setup}}$  is the minimum time the data must remain stable before the capturing clock edge arrives. In the equations below, $T_{\text{launch}}$ and $T_{\text{capture}}$ are the clock arrival times at the launch and capture flip-flops, $T_{\text{ck-q}}$ is the clock-to-Q delay of the launch flip-flop, $T_{\text{logic}}$ is the combinational delay between the two flip-flops, and $T_{\text{clk}}$ is the clock period.



(1) The time at which the data reaches the D pin of UFF1 (arrival time):

$$T_a = T_{\text{launch}} + T_{\text{ck-q}} + T_{\text{logic}}$$

(2) The latest time at which the data may arrive and still meet setup (required time):

$$T_r = T_{\text{capture}} + T_{\text{clk}} - T_{\text{setup}}$$

(3) The setup requirement is  $T_r - T_a = T_{\text{margin}} \geq 0$ , that is:

$$T_{\text{capture}} + T_{\text{clk}} - T_{\text{setup}} - T_{\text{launch}} - T_{\text{ck-q}} - T_{\text{logic}} \geq 0$$

(4) Let  $T_{\text{capture}} - T_{\text{launch}} = T_{\text{skew}}$ , and we get:

$$T_{\text{skew}} + T_{\text{clk}} \geq T_{\text{setup}} + T_{\text{ck-q}} + T_{\text{logic}}$$
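The setup check derived above is simple arithmetic, so it can be written as a small Python function. The numbers in the example are hypothetical and all times are in ns; the parameter names simply mirror the symbols in the equations.

```python
def setup_slack(t_clk, t_setup, t_ck_q, t_logic, t_launch=0.0, t_capture=0.0):
    """Return the setup slack T_r - T_a (all times in the same unit, e.g. ns)."""
    arrival = t_launch + t_ck_q + t_logic    # T_a
    required = t_capture + t_clk - t_setup   # T_r
    return required - arrival

# Hypothetical path: 10 ns clock and an ideal clock network (no skew).
slack = setup_slack(t_clk=10.0, t_setup=0.05, t_ck_q=0.09, t_logic=1.70)
print(f"setup slack = {slack:.3f} ns, {'MET' if slack >= 0 else 'VIOLATED'}")
```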




#### ■ Hold Time ( $T_{hold}$ )

■  $T_{\text{hold}}$  is the minimum time the data must remain stable after the capturing clock edge arrives



(1) The time at which the data reaches the D pin of UFF1 (arrival time):

$$T_a = T_{\text{launch}} + T_{\text{ck-q}} + T_{\text{logic}}$$

(2) The earliest time at which the data may change without violating hold (required time):

$$T_r = T_{\text{capture}} + T_{\text{hold}}$$

(3) The hold requirement is  $T_a - T_r = T_{\text{margin}} \geq 0$ , that is:

$$T_{\text{launch}} + T_{\text{ck-q}} + T_{\text{logic}} - T_{\text{capture}} - T_{\text{hold}} \geq 0$$

(4) Let  $T_{\text{capture}} - T_{\text{launch}} = T_{\text{skew}}$ , and we get:

$$T_{\text{skew}} + T_{\text{hold}} \leq T_{\text{ck-q}} + T_{\text{logic}}$$
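The hold check can be sketched the same way. Unlike setup, the clock period does not appear, because launch and capture are compared at the same clock edge. The values below are hypothetical and in ns.

```python
def hold_slack(t_hold, t_ck_q, t_logic, t_launch=0.0, t_capture=0.0):
    """Return the hold slack T_a - T_r (all times in the same unit, e.g. ns)."""
    arrival = t_launch + t_ck_q + t_logic   # T_a
    required = t_capture + t_hold           # T_r
    return arrival - required

# Hypothetical flop-to-flop path with no logic in between and an ideal clock.
slack = hold_slack(t_hold=0.02, t_ck_q=0.06, t_logic=0.0)
print(f"hold slack = {slack:.3f} ns, {'MET' if slack >= 0 else 'VIOLATED'}")
```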



## ■ Timing violation repair

- Methods for fixing  $T_{\text{setup}}$  violations:

$$T_{\text{skew}} + T_{\text{clk}} \geq T_{\text{setup}} + T_{\text{ck-q}} + T_{\text{logic}}$$

- (1) Increase  $T_{\text{clk}}$ : lower the clock frequency
- (2) Reduce  $T_{\text{logic}}$ : optimize the combinational logic, add pipeline stages, or reduce the load on the critical path
- (3) Reduce  $T_{\text{ck-q}}$ : switch to faster cells, e.g. HVT -> LVT

- Methods for fixing  $T_{\text{hold}}$  violations:

$$T_{\text{skew}} + T_{\text{hold}} \leq T_{\text{ck-q}} + T_{\text{logic}}$$

- (1) Increase  $T_{\text{logic}}$ : add delay to the combinational path, e.g. by inserting buffers
- (2) Reduce  $T_{\text{skew}}$ : a negative skew may even be used
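Rearranging the two inequalities shows how large each fix has to be. The Python sketch below computes the smallest clock period that satisfies the setup inequality and the extra data-path delay needed to satisfy the hold inequality. The function names and the numbers are illustrative assumptions, with all times in ns.

```python
def min_clock_period(t_setup, t_ck_q, t_logic, t_skew=0.0):
    """Smallest T_clk that satisfies T_skew + T_clk >= T_setup + T_ck-q + T_logic."""
    return t_setup + t_ck_q + t_logic - t_skew

def hold_padding(t_hold, t_ck_q, t_logic, t_skew=0.0):
    """Extra data-path delay needed so that T_skew + T_hold <= T_ck-q + T_logic."""
    return max(0.0, t_skew + t_hold - t_ck_q - t_logic)

# Hypothetical numbers in ns.
period = min_clock_period(t_setup=0.05, t_ck_q=0.09, t_logic=2.36)
print(f"minimum period: {period:.2f} ns (about {1000.0 / period:.0f} MHz)")

padding = hold_padding(t_hold=0.02, t_ck_q=0.06, t_logic=0.0, t_skew=0.10)
print(f"hold padding:   {padding:.2f} ns")
```

Note that the padding added for a hold fix also lengthens the path for setup, so both checks need to be repeated after any change.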

### 3. Timing Analysis ( $T_{\text{setup}}$ )

| Point                                                                           | Incr   | Path            |
|---------------------------------------------------------------------------------|--------|-----------------|
| clock (port)                                                                    | 0.000  | 0.000r          |
| clock (clock net)                                                               |        |                 |
| ysyx_2025_my_ifu.curr_state_1_reg_p:CK (DFFQX1H7L)                              | 0.000  | 0.000r          |
| clock core_clock (rise edge)                                                    | 0      | 0               |
| clock network delay (ideal)                                                     | 0.000  | 0.000           |
| ysyx_2025_my_ifu.curr_state_1_reg_p:CK (DFFQX1H7L)                              | 0.000  | 0.000r          |
| ysyx_2025_my_ifu.curr_state_1_reg_p:Q (DFFQX1H7L)                               | 0.089  | 0.089r          |
| ysyx_2025_my_ifu.curr_state_1_(net)                                             |        |                 |
| ysyx_2025_my_ifu.curr_state_1_BUFX8H7L_A:A (BUFX8H7L)                           | 0.000  | 0.089r          |
| ysyx_2025_my_ifu.curr_state_1_BUFX8H7L_A:Y (BUFX8H7L)                           | 0.054  | 0.144r          |
| mepc_7_INVX0P5H7L_A_Y_MUX2X0P5H7L_A_B_AOI21X0P5H7L_Y_A0_NOR2BX1P4H7L_Z_AN (net) |        |                 |
| .....                                                                           |        |                 |
| ysyx_2025_gpr.rf[8]_18_reg_p_D_NOR2BX8H7L_Z:AN (NOR2BX8H7L)                     | 0.000  | 1.738f          |
| ysyx_2025_gpr.rf[8]_18_reg_p_D_NOR2BX8H7L_Z:Z (NOR2BX8H7L)                      | 0.034  | 1.772f          |
| ysyx_2025_gpr.rf[8]_18_reg_p_D (net)                                            |        |                 |
| ysyx_2025_gpr.rf[8]_18_reg_p:D (DFFQX1H7L)                                      | 0.000  | 1.772f          |
| clock (port)                                                                    | 0.000  | 0.000r          |
| clock (clock net)                                                               |        |                 |
| ysyx_2025_gpr.rf[8]_0_reg_p_CK_ICGX0P5H7L_ECK:CK (ICGX0P5H7L)                   | 0.000  | 0.000r          |
| ysyx_2025_gpr.rf[8]_0_reg_p_CK_ICGX0P5H7L_ECK:ECK (ICGX0P5H7L)                  | 0.000  | 0.000r          |
| ysyx_2025_gpr.rf[8]_0_reg_p_CK (clock net)                                      |        |                 |
| ysyx_2025_gpr.rf[8]_18_reg_p:CK (DFFQX1H7L)                                     | 0.000  | 0.000r          |
| clock core_clock (rise edge)                                                    | 10     | 10              |
| clock network delay (ideal)                                                     | 0.000  | 10.000          |
| ysyx_2025_gpr.rf[8]_18_reg_p:CK (DFFQX1H7L)                                     |        | 10.000r         |
| library setup time                                                              | -0.049 | 9.951           |
| clock reconvergence pessimism                                                   | 0.000  | 9.951           |
| path cell delay                                                                 |        | 1.772(100.000%) |
| path net delay                                                                  |        | 0.000(0.000%)   |
| data require time                                                               |        | 9.951           |
| data arrival time                                                               |        | 1.772           |
| slack (MET)                                                                     |        | 8.179           |

slack =  $T_r - T_a = 9.951 - 1.772 = 8.179 > 0$ , so the setup timing is met.

### 3. Timing Analysis ( $T_{hold}$ )

| Point                                            | Incr   | Path            |
|--------------------------------------------------|--------|-----------------|
| clock (port)                                     | 0.000  | 0.000r          |
| clock (clock net)                                |        |                 |
| ysyx_2025_my_lsu.en_flash_0_reg_p:CK (DFFQX1H7L) | 0.000  | 0.000r          |
| clock core_clock (rise edge)                     | 0      | 0               |
| clock network delay (ideal)                      | 0.000  | 0.000           |
| ysyx_2025_my_lsu.en_flash_0_reg_p:CK (DFFQX1H7L) | 0.000  | 0.000r          |
| ysyx_2025_my_lsu.en_flash_0_reg_p:Q (DFFQX1H7L)  | 0.064  | 0.064f          |
| ysyx_2025_my_lsu.en_flash_0_ (net)               |        |                 |
| ysyx_2025_my_lsu.en_flash_1_reg_p:D (DFFQX1H7L)  | 0.000  | 0.064f          |
| clock (port)                                     | 0.000  | 0.000r          |
| clock (clock net)                                |        |                 |
| ysyx_2025_my_lsu.en_flash_1_reg_p:CK (DFFQX1H7L) | 0.000  | 0.000r          |
| clock core_clock (rise edge)                     | 0      | 0               |
| clock network delay (ideal)                      | 0.000  | 0.000           |
| ysyx_2025_my_lsu.en_flash_1_reg_p:CK (DFFQX1H7L) |        | 0.000r          |
| library hold time                                | -0.008 | -0.008          |
| clock reconvergence pessimism                    | -0.000 | -0.008          |
| path cell delay                                  |        | 0.064(100.000%) |
| path net delay                                   |        | 0.000(0.000%)   |
| data require time                                |        | -0.008          |
| data arrival time                                |        | 0.064           |
| slack (MET)                                      |        | 0.072           |

slack =  $T_a - T_r = 0.064 - (-0.008) = 0.072 > 0$ , so the hold timing is met.


# 4. Place & Route

## ■ Place & Route (PR)

■ Floorplan

■ Place

■ Clock Tree Synthesis (CTS)

■ Route



## ■ init & Floorplan

- Chip design input
- Chip I/O planning
- Macro unit placement
- Power supply planning
  - Power ring design
  - Power grid design
  - Power network connection



Chip utilization:  $D = (\text{Total standard-cell area}) / (\text{Core area} - \text{SRAM area})$
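The utilization formula can be evaluated directly. The Python sketch below uses hypothetical areas in μm² and a function name chosen for illustration only.

```python
def utilization(std_cell_area, core_area, sram_area=0.0):
    """D = standard-cell area / (core area - SRAM area); all areas in one unit."""
    placeable = core_area - sram_area
    if placeable <= 0:
        raise ValueError("core area must exceed the SRAM area")
    return std_cell_area / placeable

# Hypothetical areas in um^2.
d = utilization(std_cell_area=600_000.0, core_area=1_200_000.0, sram_area=200_000.0)
print(f"utilization = {d:.1%}")
```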

## ■ Place

### ■ Placement of standard cells

- Global placement
- Detailed placement

### ■ Optimization of standard cells

- Whether timing is met
- Whether resources are sufficient



Before Place


## ■ CTS

### ■ Growing the clock tree

- The clock tree grows from the root to every sink
- The skew between the clock arrival times at the registers' clock pins should be as small as possible
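The skew target above can be made concrete with a short Python sketch. Skew is taken here as the difference between the latest and earliest clock arrival over all sinks (global skew); the pin names and arrival times are hypothetical.

```python
def clock_skew(arrival_ns):
    """Global skew: largest minus smallest clock arrival time over all sinks."""
    times = list(arrival_ns.values())
    return max(times) - min(times)

# Hypothetical clock arrival times (ns) at three register clock pins.
sinks = {"reg_a/CK": 0.412, "reg_b/CK": 0.398, "reg_c/CK": 0.425}
print(f"skew = {clock_skew(sinks):.3f} ns")
```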

### ■ Clock tree routing

### ■ Optimize the timing



## ■ Route

### ■ Global routing

- Plan the signal paths and allocate routing resources

### ■ Detailed routing

- Implement the physical wires while satisfying DRC

### ■ Search and repair

- Fix the violations that remain after routing

### ■ Filler insertion

#### ■ Insert standard filler cells

- Connect the diffusion layers to form continuous power and ground rails





## ■ Physical Verification (PV)

- Electrical Rule Check (ERC)
- Design Rule Check (DRC)
- Layout Versus Schematic (LVS)



## ■ Design Rule Check (DRC)

- Ensure that the chip layout complies with the design rules provided by the foundry, i.e. that it can be manufactured
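To give a feel for what a design rule looks like, the Python sketch below checks one simplified rule: a minimum spacing between separate shapes on the same layer. The rectangles, the spacing value and the use of plain Euclidean distance are all assumptions made for illustration; real rule decks contain many more rules and conditions.

```python
from itertools import combinations
from math import hypot

def spacing(a, b):
    """Distance between two axis-aligned rectangles (x1, y1, x2, y2); 0 if they touch or overlap."""
    dx = max(0.0, b[0] - a[2], a[0] - b[2])
    dy = max(0.0, b[1] - a[3], a[1] - b[3])
    return hypot(dx, dy)

def spacing_violations(shapes, min_spacing):
    """Return index pairs of separate shapes that are closer than min_spacing."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(shapes), 2)
        if 0.0 < spacing(a, b) < min_spacing
    ]

# Hypothetical same-layer shapes in um and a made-up minimum spacing.
metal1 = [(0.0, 0.0, 1.0, 0.2), (1.05, 0.0, 2.0, 0.2), (0.0, 0.5, 2.0, 0.7)]
print(spacing_violations(metal1, min_spacing=0.1))
```

Shapes that touch or overlap are treated as one connected shape and are not reported, which is why the check requires a distance strictly greater than zero.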





Thanks!