

# PnR and STA Flow for a Structured ASIC Platform

Project Presentation

---

Mazin Bersy  
Malek Mahmoud  
Mostafa Elshamy

# Recap



Checkpoint 1

Checkpoint 2



# Phase 4 - Deliverables Status

| Deliverable                           | Status             | Comments                                 |
|---------------------------------------|--------------------|------------------------------------------|
| make_def.py                           | Complete           | Generates *_fixed.def correctly          |
| rename.py                             | Complete           | Named “rename_verilog_cells” for clarity |
| route.tcl                             | Complete           |                                          |
| Global routing                        | Complete           | Produces *_routed.def                    |
| Detailed Routing                      | Partially complete | Works on small designs only              |
| extract_parasitics → .spef            | Complete           | SPEF files are ~1.5M lines each          |
| report_congestion → *_congestion.rpt  | Not Working        | Not functional                           |
| Congestion heatmap (*_congestion.png) | Complete           | Obtained from the OpenROAD GUI           |

# Phase 5 - Deliverables Status

| Deliverable                                 | Status                | Comments                                              |
|---------------------------------------------|-----------------------|-------------------------------------------------------|
| sta.tcl                                     | Partially complete    | Works on small designs only                           |
| SDC files                                   | Generated             | Generic: works for all designs after detailed routing |
| Setup timing report                         | Generated             | *_setup_timing.rpt (1600+ lines)                      |
| Hold timing report                          | Generated             | *_hold_timing.rpt                                     |
| Clock skew report                           | Generated             | *_clock_skew.rpt                                      |
| WNS/TNS reports                             | Generated             | *_worst_slack.rpt,<br>*_total_negative_slack.rpt      |
| Power report                                | Generated             | *_power.rpt                                           |
| Slack histogram (*_slack.png)               | Created               | Slide 12                                              |
| Critical path overlay (*_critical_path.png) | Created               | Slide 11                                              |

# Phase 3

# Clock Tree Synthesis

Objective: Deliver clock signal to all DFFs with minimal skew

- Inputs:
  - Placed DFF locations (sinks)
  - Available unused buffers in fabric
- Algorithm: H-Tree
  1. Recursively partition sinks into quadrants
  2. Find geometric centroid of each partition
  3. Claim nearest unused buffer at centroid
  4. Repeat until all sinks connected
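
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's CTS script: the names `ClockNode`, `build_clock_tree`, `claim_nearest`, and the `max_fanout` stopping rule are assumptions made for the example, and the tree is a plain recursive quadrant split rather than a geometrically balanced H-tree.

```python
from dataclasses import dataclass, field


@dataclass
class ClockNode:
    buffer: tuple                                  # (x, y) of the claimed fabric buffer
    children: list = field(default_factory=list)   # ClockNode subtrees or sink (x, y) tuples


def centroid(points):
    """Geometric centroid (mean x, mean y) of a list of points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def claim_nearest(target, free_buffers):
    """Remove and return the unused buffer closest (Manhattan distance) to target."""
    if not free_buffers:
        raise RuntimeError("ran out of unused buffers")
    best = min(free_buffers, key=lambda b: abs(b[0] - target[0]) + abs(b[1] - target[1]))
    free_buffers.remove(best)
    return best


def build_clock_tree(sinks, free_buffers, max_fanout=4):
    """Recursively split sinks into quadrants, claiming one buffer per partition."""
    cx, cy = centroid(sinks)
    node = ClockNode(buffer=claim_nearest((cx, cy), free_buffers))
    # Split the sinks into four quadrants around the centroid.
    quadrants = [[], [], [], []]
    for x, y in sinks:
        quadrants[(x >= cx) + 2 * (y >= cy)].append((x, y))
    occupied = [q for q in quadrants if q]
    if len(sinks) <= max_fanout or len(occupied) == 1:
        node.children = list(sinks)   # small or coincident group: drive sinks directly
    else:
        node.children = [build_clock_tree(q, free_buffers, max_fanout) for q in occupied]
    return node


def depth(node):
    """Number of buffer levels from this node down to the deepest leaf buffer."""
    subtrees = [c for c in node.children if isinstance(c, ClockNode)]
    return 1 + max((depth(c) for c in subtrees), default=0)


# Hypothetical 4x4 grid of DFF sinks with unused buffers between them.
sinks = [(x, y) for x in range(4) for y in range(4)]
free = [(x + 0.5, y + 0.5) for x in range(4) for y in range(4)]
root = build_clock_tree(sinks, free)
print(depth(root), 16 - len(free))   # tree depth, buffers claimed
```

Claimed buffers are removed from the free list, so no buffer is reused across partitions; in a fixed fabric that is the constraint that matters most.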

Output:

| Metric           | arith    | expanded_6502 |
|------------------|----------|---------------|
| CTS Buffers Used | 3,669    | 3,669         |
| Tree Depth       | 6 levels | 6 levels      |
| DFF Sinks        | 6,480    | 6,480         |
| Clock Skew       | 0.27 ns  | 0.28 ns       |

# CTS Visualization

## arith



# CTS Visualization

## 6502



## z80



# Phase 4

# Routing

- DEF Generation (make\_def.py):
  - Die area, rows, tracks
  - All I/O pins (+ FIXED)
  - All fabric components (+ FIXED) - used and unused
- OpenROAD Flow (route.tcl):
  - Global routing → Detailed routing
  - Parasitic extraction → .spef
  - Congestion reporting
- Outputs:
  - \*\_fixed.def: input to the router
  - \*\_routed.def: final routed design
  - \*\_congestion.rpt: routing congestion data
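
As a rough illustration of the DEF generation step, the sketch below writes a DEF `COMPONENTS` section in which every instance is marked `+ FIXED`, so the router treats the fabric as immovable. It is not the actual make\_def.py: the function name `def_components` and the instance and cell names are hypothetical, and coordinates are assumed to already be in DEF database units.

```python
def def_components(components):
    """Format a DEF COMPONENTS section with every instance marked + FIXED.

    components: iterable of (instance, cell, x_dbu, y_dbu, orient) tuples.
    """
    rows = list(components)
    lines = [f"COMPONENTS {len(rows)} ;"]
    for inst, cell, x, y, orient in rows:
        # + FIXED tells downstream tools the placement must not be changed.
        lines.append(f"- {inst} {cell} + FIXED ( {x} {y} ) {orient} ;")
    lines.append("END COMPONENTS")
    return "\n".join(lines)


# Hypothetical fabric slots: one used NAND and one DFF, both fixed.
text = def_components([
    ("u_nand_0", "NAND2", 1000, 2000, "N"),
    ("u_dff_7", "DFF", 5000, 2000, "FS"),
])
print(text)
```

Unused fabric cells would go through the same function, since the slide lists both used and unused components as fixed.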

# Results

| Design        | Routed DEF Lines | DRC Violations | Status |
|---------------|------------------|----------------|--------|
| arith         | 669,499          | 0              | Clean  |
| expanded_6502 | 737,751          | 0              | Clean  |
| 6502          | 134,142          | 3,821          | Shorts |
| z80           | 139,122          | N/A            | N/A    |

# OpenROAD Heatmap Screenshots

## arith



# Phase 5

# Static Timing Analysis - arith



# Static Timing Analysis - expanded_6502



# Static Timing Analysis - 6502



# Static Timing Analysis Results

| Design        | WNS (ns) | TNS (ns) | Status |
|---------------|----------|----------|--------|
| arith         | +1.14    | 0.00     | MET    |
| 6502          | +2.47    | 0.00     | MET    |
| expanded_6502 | +2.47    | 0.00     | MET    |

Clock Period: 10 ns (100 MHz target)

- Positive WNS = timing margin (no violations)
- TNS = 0 = no failing paths
- All completed designs achieve timing closure
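
For readers less familiar with the two metrics, the sketch below shows how WNS and TNS follow from per-endpoint slacks: WNS is the smallest slack, and TNS is the sum of the negative slacks only. The function name `wns_tns` and the slack values are illustrative assumptions, not numbers taken from the timing reports.

```python
def wns_tns(slacks):
    """Return (worst slack, total negative slack) for a list of endpoint slacks in ns."""
    wns = min(slacks)                        # worst (smallest) slack over all endpoints
    tns = sum(s for s in slacks if s < 0)    # only violating endpoints contribute
    return wns, tns


# Hypothetical design that meets timing: every slack is positive, so TNS is 0.
met = wns_tns([1.14, 2.0, 3.5])

# Hypothetical design with two violating endpoints.
violated = wns_tns([-0.5, 0.25, -0.25])
print(met, violated)
```

This is why a positive WNS implies TNS = 0: if the worst endpoint has margin, no endpoint contributes to the negative sum.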

# HPWL and Runtime Analysis

| Design  | Format          | Calculated HPWL (um) | Processor                                | Cores | RAM   | GPU                        | Checkpoint Reached                 | Runtime              |
|---------|-----------------|----------------------|------------------------------------------|-------|-------|----------------------------|------------------------------------|----------------------|
| arith   | SA Optimized    | 591,410              | 12th Gen Intel Core i7-12700H (2.30 GHz) | 14    | 16 GB | NVIDIA RTX 3060 Laptop GPU | Detailed routing, 0 violations     | 2,386 s (39 min)     |
| 6502    | SA Optimized    | 906,315              | 12th Gen Intel Core i9-12900H            | 24    | 64 GB | NVIDIA RTX 3090 GPU        | Detailed routing, 3,821 violations | 30,874 s (514 min)   |
| 6502    | Expanded Greedy | 1,342,275            | 12th Gen Intel Core i7-12700H (2.30 GHz) | 14    | 16 GB | NVIDIA RTX 3060 Laptop GPU | Detailed routing, 0 violations     | 3,568 s (59 min)     |
| z80     | SA Optimized    | 2,128,429            | 12th Gen Intel Core i7-12700H (2.30 GHz) | 14    | 16 GB | NVIDIA RTX 3060 Laptop GPU | Global routing                     | 15,733 s (262 min)   |
| aes_128 | SA Optimized    | 10,016,384           | 12th Gen Intel Core i7-12700H (2.30 GHz) | 14    | 16 GB | NVIDIA RTX 3060 Laptop GPU | Global routing                     | 34,584 s (576 min)   |
| soc     | SA Optimized    | 20,059,167           | 12th Gen Intel Core i9-12900H            | 24    | 64 GB | NVIDIA RTX 3090 GPU        | Global routing                     | 37,073 s (618 min)   |
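
For reference, half-perimeter wirelength (HPWL) is the standard placement estimate: for each net, take the bounding box of its pins and add its width and height, then sum over all nets. The sketch below uses hypothetical net names and pin coordinates; the function names are assumptions for illustration and are not taken from the project's scripts.

```python
def net_hpwl(pins):
    """Half-perimeter of the bounding box around one net's pin locations."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets):
    """Sum HPWL over all nets; nets with fewer than two pins contribute nothing."""
    return sum(net_hpwl(pins) for pins in nets.values() if len(pins) >= 2)


# Hypothetical nets with pin coordinates in um.
nets = {
    "n1": [(0, 0), (3, 4), (1, 1)],   # bounding box 3 x 4 -> 7
    "n2": [(10, 10), (12, 10)],       # bounding box 2 x 0 -> 2
    "clk_stub": [(5, 5)],             # single pin, ignored
}
total = total_hpwl(nets)
print(total)
```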

# Power Distribution Analysis

| Design        | Sequential     | Combinational   | Clock           | Total   |
|---------------|----------------|-----------------|-----------------|---------|
| arith         | 0.14 mW (0.8%) | 0.16 mW (0.8%)  | 18.2 mW (98.4%) | 18.5 mW |
| 6502          | 1.01 mW (4.3%) | 4.46 mW (18.7%) | 18.3 mW (77.0%) | 23.8 mW |
| expanded_6502 | 1.01 mW (4.3%) | 4.46 mW (18.7%) | 18.3 mW (77.0%) | 23.8 mW |

# Test design: soc



Thank You