

# Structured ASIC Design Flow

## Placement, CTS, Routing & Analysis Results

Digital Design 2 - Group 9 (Ramy, Seif, Mohamed)

The American University in Cairo

December 18, 2025

# Outline

1. Arith Design
2. 6502 Design
3. Z80 Design
4. SoC Design
5. AES\_128 Design
6. Knob Analysis
7. Results Summary

# Arith - Placement Heatmap



# Arith - Net Length Distribution

Net Length Distribution - arith



# Arith - Clock Tree Synthesis



# Arith - Timing Report

| Delay  | Time   | Description                                     |
|--------|--------|-------------------------------------------------|
| 0.000  | 0.000  | clock clk (rise edge)                           |
| 0.000  | 0.000  | clock network delay (ideal)                     |
| 2.400  | 2.400  | ^ input external delay                          |
| 0.000  | 2.400  | ^ in_9 (in)                                     |
| 0.604  | 3.004  | v T0Y89_R2_INV_0/Y (sky130_fd_sc_hd__clkinv_2)  |
| 0.434  | 3.438  | v T0Y50_R2_OR_0/X (sky130_fd_sc_hd__or2_2)      |
| 0.477  | 3.915  | ^ T0Y49_R2_NAND_2/Y (sky130_fd_sc_hd__nand2_2)  |
| 0.285  | 4.200  | ^ T0Y49_R2_OR_0/X (sky130_fd_sc_hd__or2_2)      |
| 0.158  | 4.358  | v T0Y47_R1_NAND_2/Y (sky130_fd_sc_hd__nand2_2)  |
| 0.496  | 4.853  | v T0Y28_R2_OR_1/X (sky130_fd_sc_hd__or2_2)      |
| 0.269  | 5.123  | ^ T13Y22_R1_NAND_0/Y (sky130_fd_sc_hd__nand2_2) |
| 0.173  | 5.296  | ^ T7Y21_R2_OR_1/X (sky130_fd_sc_hd__or2_2)      |
| 0.110  | 5.405  | v T8Y21_R2_INV_0/Y (sky130_fd_sc_hd__clkinv_2)  |
| 0.310  | 5.715  | v T10Y14_R0_OR_1/X (sky130_fd_sc_hd__or2_2)     |
| 0.502  | 6.217  | v T10Y14_R2_OR_3/X (sky130_fd_sc_hd__or2_2)     |
| 0.474  | 6.691  | v T7Y59_R0_OR_2/X (sky130_fd_sc_hd__or2_2)      |
| 0.324  | 7.015  | v T12Y59_R0_OR_0/X (sky130_fd_sc_hd__or2_2)     |
| 0.288  | 7.303  | v T15Y59_R0_OR_2/X (sky130_fd_sc_hd__or2_2)     |
| 0.278  | 7.581  | v T16Y59_R0_OR_2/X (sky130_fd_sc_hd__or2_2)     |
| 0.262  | 7.843  | v T17Y59_R0_OR_1/X (sky130_fd_sc_hd__or2_2)     |
| 0.440  | 8.283  | v T17Y59_R0_OR_2/X (sky130_fd_sc_hd__or2_2)     |
| 0.395  | 8.677  | ^ T27Y74_R2_INV_0/Y (sky130_fd_sc_hd__clkinv_2) |
| 0.099  | 8.777  | ^ out_39 (out)                                  |
|        | 8.777  | data arrival time                               |
| 12.000 | 12.000 | clock clk (rise edge)                           |
| 0.000  | 12.000 | clock network delay (ideal)                     |
| -0.250 | 11.750 | clock uncertainty                               |
| 0.000  | 11.750 | clock reconvergence pessimism                   |
| -2.400 | 9.350  | output external delay                           |
|        | 9.350  | data required time                              |
|        | 9.350  | data required time                              |
|        | -8.777 | data arrival time                               |
|        | 0.573  | slack (MET)                                     |
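
The slack on the last row follows directly from the constraint values in the report. As a quick sanity check, here is a minimal sketch that recomputes it; the variable names are ours, and the numbers are copied from the table above.

```python
# Values copied from the arith timing report above (all in ns).
clock_period = 12.000
clock_uncertainty = 0.250
output_external_delay = 2.400
data_arrival_time = 8.777

# Required time at the output port: the capture edge minus the clock
# uncertainty and the external output delay (the clock is ideal here,
# so the clock network delay contributes nothing).
data_required_time = clock_period - clock_uncertainty - output_external_delay

# Setup slack is positive when data arrives before it is required.
slack = data_required_time - data_arrival_time

print(f"required = {data_required_time:.3f} ns, slack = {slack:.3f} ns")
```

A positive result matches the `slack (MET)` row of the report.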

# 6502 - Placement Heatmap



# 6502 - Net Length Distribution

Net Length Distribution - 6502



# 6502 - Clock Tree Synthesis

CTS Tree Structure - 6502



# 6502 - Routed Design



# 6502 - Timing Report

```
# Timing Summary for 6502
# Generated by sta.tcl

Clocks:
  clk: period = 90.000000 ns

Setup Timing:
  Worst Negative Slack (WNS): 7.744247909613478e-9 ns
  Total Negative Slack (TNS): 0.0 ns

Hold Timing:
  Worst Negative Slack (WNS): 6.709113264946609e-10 ns
  Total Negative Slack (TNS): 0.0 ns

Note: For detailed timing information, see:
  - Setup report: build/6502/6502_setup.rpt
  - Hold report: build/6502/6502_hold.rpt
  - Clock skew report: build/6502/6502_clock_skew.rpt
```
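
The slack values in this summary carry an `ns` label, but their magnitude (around 1e-9) suggests they are raw values in seconds. Assuming that is the case, the sketch below converts them to nanoseconds. The `slacks_in_ns` helper is hypothetical and not part of `sta.tcl`; the input lines are copied from the summary above.

```python
import re

# Lines copied from the 6502 timing summary above.
summary = """\
Setup Timing:
  Worst Negative Slack (WNS): 7.744247909613478e-9 ns
Hold Timing:
  Worst Negative Slack (WNS): 6.709113264946609e-10 ns
"""

def slacks_in_ns(text):
    """Return the WNS values, assuming they are in seconds, as nanoseconds."""
    values = re.findall(r"\(WNS\):\s*([-+0-9.eE]+)", text)
    return [float(v) * 1e9 for v in values]

setup_ns, hold_ns = slacks_in_ns(summary)
print(f"setup slack = {setup_ns:.3f} ns, hold slack = {hold_ns:.3f} ns")
```

Under that assumption both worst slacks are positive, which is consistent with the reported TNS of 0.0 for setup and hold.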

# Z80 - Placement Heatmap



# Z80 - Net Length Distribution

Net Length Distribution - z80



# Z80 - Clock Tree Synthesis



# Z80 - Failed Routing



# SoC - Placement Heatmap



# SoC - Net Length Distribution



# SoC - Clock Tree Synthesis



# AES - Placement Heatmap



# AES - Net Length Distribution



# AES - Clock Tree Synthesis



# Knob Analysis - Cooling Rate Effect



# Knob Analysis - Moves Per Temperature Effect



# Knob Analysis - Batch Size Effect



# Best Knob Settings

Based on the knob analysis experiments:

| Parameter              | Optimal Value | Effect                                               |
|------------------------|---------------|------------------------------------------------------|
| Cooling Rate           | 0.8           | Slower cooling = better quality; balanced runtime    |
| Moves per Temp         | 5000+         | More exploration = lower HPWL                        |
| Batch Size             | 64-128        | Moderate batching optimal                            |
| Initial Temperature    | High (auto)   | Based on design size                                 |
| Refinement Probability | 0.3-0.5       | Balanced local search                                |

## HPWL and Runtime Results

| Design | Cells  | Final HPWL (µm)        | Improvement | Runtime (s) |
|--------|--------|------------------------|-------------|-------------|
| arith  | 860    | 138,406                | 10.85%      | 15          |
| 6502   | 3,512  | 656,843                | 31.17%      | 120         |
| z80    | 9,144  | 1,460,000              | 23.65%      | 237         |
| soc    | 70,320 | 15,480,113             | 9.98%       | 2,031       |
| AES    | 85,819 | 12,903,763             | 12.86%      | 9,501       |

Table: Placement results: HPWL and runtime for each design
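
Because the designs differ in size by two orders of magnitude, it can help to normalize the table per cell before comparing rows. The following sketch does that using only the values from the table above; the per-cell metrics are our own derived quantities, not results reported by the flow.

```python
# (design, cells, final HPWL in um, runtime in s), copied from the table.
results = [
    ("arith", 860, 138_406, 15),
    ("6502", 3_512, 656_843, 120),
    ("z80", 9_144, 1_460_000, 237),
    ("soc", 70_320, 15_480_113, 2_031),
    ("AES", 85_819, 12_903_763, 9_501),
]

# Map each design to (HPWL per cell in um, runtime per cell in ms).
normalized = {
    name: (hpwl_um / cells, 1000 * runtime_s / cells)
    for name, cells, hpwl_um, runtime_s in results
}

for name, (hpwl_per_cell, ms_per_cell) in normalized.items():
    print(f"{name:>5}: {hpwl_per_cell:7.1f} um/cell, {ms_per_cell:6.1f} ms/cell")
```

Runtime per cell is not constant across designs, so total runtime does not scale linearly with cell count in these runs.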

### Machine Specifications:

- CPU: Intel Core i7-11800H
- Memory: 32 GB RAM
- OS: Windows 10
- Note: Laptop connected to power during all runs.

# Summary

## Key Achievements:

- Successfully placed 5 designs on structured ASIC fabric
- Achieved 9.98% - 31.17% HPWL improvement with SA optimization
- Implemented congestion-aware placement
- Built X-tree CTS for clock distribution
- Routed designs using OpenROAD
- Added STA using OpenROAD

## Challenges:

- li1 layer 99% blocked - limits routing capacity
- Larger designs (SoC) show less improvement due to fabric constraints

# Thank You!

Questions?