

## Digital ASIC Design



## ASIC design flow

ASIC flow objective : RTL to GDS II

- Also, called Automated PnR and/or Physical Implementation

## Simplified RTL to GDS II flow



→ **Synthesis** :- Converts RTL to a circuit out of components from the standard cell library (SCL)



- "Standard Cells" have regular layout

- Each has different views / models
  - Electrical (HDL, SPICE,
  - Layout (Abstract and Detailed)

## Floor & Power planning

- Chip floor-planning : Partition the chip die between different system building blocks and the place the I/O pads

→ **Macro floor Planning** :- Dimensions, pin location, layer definition

### Power planning

power pads      power others      power M1  
(VDD, VSS)

→ **Placements** → Place the cells on the floor plan rows aligned with the pads



- Usually done in 2 steps : Global and Detailed



#### → Clock Tree Synthesis

- Create a clock distribution network

- To deliver the clock to all sequential elements (e.g. FF)
- With minimum skew (zero is hard to achieve)
- And in a good shape
- Usually a Tree (M, X, - -)



#### → Routing

- Implementing the interconnect using the available metal layers
- Metal tracks from a routing grid

- Routing grid is huge

#### Divide and Conquer

- Global Routing : Generates routing guide
- Detailed Routing : Uses the routing guide to implement actual wiring

#### → Sign off

##### Physical Verification

- Design Rule Checking (DRC)
- Layout vs Schematic (LVS)

##### Timing Verification

- Static Timing Analysis (STA)

#### \* OpenLane

- Started as an open-source flow for a true open source Tape-out experience

- SiFive is a family of open everything SoCs

open PDK, open GDS, open RTL  
SOC Features

SiFive  
SoC  
family

|           |                                       |
|-----------|---------------------------------------|
| SiFive    | Sky130 SCL + Synthesized 1Kbytes SRAM |
| SiFive 2  | Sky130 SCL + 1Kbytes Open RAM block   |
| SiFive 2g | SiFive 2 with a single chip module    |
| SiFive 3  | OSU SCL + Synthesized 1Kbytes SRAM    |
| SiFive 5  | Sky130 SCL + 8x1Kbytes Open RAM banks |
| SiFive 6  | SiFive 2 with OPT                     |

✓ The Good Paper

- Main Goal:-  
Produce a G Clean GDS II with no human intervention  
(no-human-in-the-loop)
- Clean means:
  - No LVS Violations
  - No DRC Violations
  - Timing violations? WIP! (Work in Progress)
- Tuned for Sky Water 130nm Open PDK
  - Also, supports xFAB 180 and GF130G
- Containerized
  - Functional out of the box
  - Instructions to build and run natively will follow
- Can be used to handle Macros or Chips
- 2 mode of operation:
  - Autonomous or Interactive
- Design Space Exploration:
  - Find the best set of flow configuration
- Large number of design examples
  - 43 designs with their best configurations
  - More will be added soon.

- Date \_\_\_\_\_  
Page No. 8
- Date \_\_\_\_\_  
Page No. 6
- OpenLane Regression Testing
  - The design exploration utility is also used for regression testing (CI)
  - we run OpenLane on ~70 designs and compare the results to the best known ones
  - PFT (Design for Test)
    - Scan Insertion
    - Automatic Test Pattern Generation (ATPG)
    - Test Pattern Compaction
    - Fault Coverage
    - Fault Simulation
  - Physical Implementation,
    - Also called automated P&R (Place and Route)
      - Floor / Power Planning
      - End Decoupling capacitors and Tap cell insertion
      - Placement : Global and Detailed
      - Post placement optimization
      - Clock Tree Synthesis (CTS)
      - Routing : Global and Detailed
  - Logic Equivalence Check (LEC)
    - Every time the netlist is modified, verification must be performed
      - CTS modifies the netlist
      - Post Placement optimizations modifies the netlist

\* LEC is used to formally confirm that the function did not change after modifying the testlist

\* Design Preparation Setup  
→ cd Desktop/work/tools/gf流れ/working\_dir/openbase

→ cd designs

→ cd picov32a

→ ls -lth

→ cd ..

→ cd ..

→ docker

→ ./flow.tcl -interactive  
→ pup -design picov32a  
→ run -synthesis

$$\text{flip ratio} = \frac{\text{No. of cells}}{\text{No. of rows}}$$

→ Package requires openbase 0.9

New Terminal  
in same directory.

→ cd runs

→ ls -lth  
→ cd 08-07-13-57

→ ls -lth  
• after run - synthesis  
→ cd results  
→ cd synthesis

✓The Good Paper

→ less picov32a.synthesis.v

→ cd .. / ..

→ cd reports

→ cd synthesis

→ less 1-yours.4.stat.vpt

1 → Define width and height of Core and Die



Lets begin with a adder



✓The Good Paper

FF = Flip Flops / Latches / Registers  
 AI, OI = Standard Cells (AND, OR, INVERTER)

Note: A "netlist" describes the connectivity of an electronic design

→ Now let's convert the highlighted symbols into physical dimension



1 Define width and height of Core and Die



Let's calculate an occupied by the above netlist on a silicon wafer



→ A 'core' is the section of the chip after the fundamental logic of the design is placed

→ die A 'die', which consists of core, is small semiconductor material specimen on which the fundamental circuit is fabricated

→ Place all logical cells inside the 'Core'

→ The logical cells occupies the complete area of the core

100% utilization

$$\text{Aspect Ratio} = \frac{\text{Height}}{\text{Width}}$$

$$= \frac{2}{2}$$

$$= 1$$



$$= \frac{\text{Area occupied by Netlist}}{\text{Total area of core}}$$

$$= \frac{4 \times 1 \text{ sq. unit}}{2 \times 2 \text{ sq. unit}}$$

$$= 1$$

## 2. Define locations of preplaced cells

Combinational logic  $\Rightarrow$



Cut 1 : Cut 2



Extend IO pins



Block Box the boxes



Separate the Black Boxes as two different IPs or modules

✓ The Good Paper



- Similarly, there are other IP's also available, for e.g.



- The arrangement of these IP's in a chip is referred as **Floor planning**.
- These IP's / block have user-defined locations and hence are placed in chip before automated placement and routing and are called as **pre-placed cells**.
- Automated placement and routing tools places the remaining logical cells in the design onto chip.



✓ The Good Paper

### 3) Surround bi polar cells with Decoupling Capacitor.

→ Consider the amount of switching current required for a complex circuit something like below.



2. Consider capacitance to be zero for the discussion.  $R_{dd}$ ,  $R_{ss}$ ,  $I_{dd}$  and  $I_{ss}$  are well defined values.

2. During switching operation, the circuit demands switching current i.e. peak current ( $I_{peak}$ ).

3. Now, due to the presence of  $R_{dd}$  and  $I_{dd}$ , there will be a voltage drop across them and the voltage at Node 'A' would be  $V_{dd}'$  instead of  $V_{dd}$ .

i) If  $V_{dd}'$  goes below the noise margin, due to  $R_{dd}$  and  $I_{dd}$ , the logic '1' at the output of circuit 'A' is detected as logic '0' at the input of the circuit following the circuit.

### → Noise Margin Summary



Noise induced bump characteristics at different noise margin levels

→ For any signal to be considered as logic '0' and logic '1', it should be in the NM<sub>L</sub> and NM<sub>H</sub> ranges, respectively.

### Solution : Add Decoupling Capacitor

1. Addition of Decoupling Capacitor in parallel with the circuit

2. Every time the circuit switches, it draws current from  $C_d$  whereas, the RL network is used to replenish the charge in  $C_d$



→ Now suppose we are sending a signal from driver to load then this signal should be done all the time but we can't add capacitors everywhere.

→ Now, lets the output of 16-bit bus, is connected to an inverter.



✓ The Good Paper

1110010111000110

↑ 16 bit bus

↓ 16 bit bus

0001101000111001

→ What does this mean?

→ This means all the capacitors which were charged to 'N' volts will have to discharge to '0' volt through single 'Ground' tap point. This will cause a bump in 'Ground' tap point.



→ If this Ground Bounce is greater than noise margin level it will be in undefined region.

→ Also all capacitors which were '0' volt will have to charge to 'V' volt through single 'Vdd' tap point. This will cause lowering the voltage at 'Vdd' tap point.



✓ The Good Paper

→ So what could be the solution?



→ Multiple power supply.

#### 4) Power planning



✓ The Good Paper

#### 5) Pin Placement

Let's take below design for eg. That needs to implement



The connectivity information between the gates is coded using VHDL/Vhdl language and is called as the "Netlist".

✓ The Good Paper

### Pin Placement



### 6) Logical cell placement Blockage

(i) We block the area using some blockage so that automated placement and routing tool doesn't place any cell in this area.



→ After synthesis is run if it's not again the same

→ run -Koolplan

→ cd Desktop/clock/tools/oplane  
working\_dir/oplane/configuration

→ less README.md

→ less Koolplan.tcl

→ cd oplane/designs/picow32a/news  
08-07-13-57/results/Koolplan

→ less picow32a.Koolplan.tcl

Here we can find total die area

→ magic -T /Desktop/clock/tools/  
oplane working\_dir/picow32a/  
lib.tech/magic/sky130A.tch  
read .../tmp/merged.lyf  
read picow32a.Koolplan.lyf

In magic

S → select

B → Center

Zoom → Right click → light click → 2  
Flyt

## Netlist binding and initial place design

### Placement & Routing

1) Bind netlist with physical cells



|     |   |   |     |
|-----|---|---|-----|
| FF1 | 1 | 2 | FF2 |
| FF1 | 1 | 2 | FF2 |
| FF2 | 1 | 2 | FF2 |
| FF1 | 1 | 2 | FF2 |

Physical View of logic gates

Block a      Block b

Block c

Library → consist of shape & size of input  
→ consist of delay

different size  
if with heat removal

|     |   |   |     |
|-----|---|---|-----|
| FF1 | 1 | 2 | FF2 |
| FF1 | 1 | 2 | FF2 |
| FF1 | 1 | 2 | FF2 |
| FF1 | 1 | 2 | FF2 |

The Good Paper

### Placement



Floorplan.

→ Till now we have Floorplan (Physical cell / Placement), Netlist and Physical view of logic gate



The Good Paper

### 3) Optimize Placement

→ This is the stage where we estimate wire length and capacitance and, based on that, insert buffers.

→ For 1 distance is reasonable enough so no need of repeaters

→ Tin 2



→ Tin 3



Tin 4



→ Need for characterization

Library characterization and modelling

Part 1 → Concepts and Theory - NLOM, CCS timing, power and noise characterization

Logic synthesis → functionality to legal hardware (code)

Floor planning

Placement

Clock tree synthesis

Routing

Sign off Timing Analysis

One common thing across all stages  
"gates or cell"

| Library  |
|----------|
| AND gate |
| OR gate  |
| BUFFER   |
| INVERTER |
| DFF      |
| LATCH    |
| ICG      |
| ---      |

## → Cell Design flow

|     |     |      |     |
|-----|-----|------|-----|
| FF1 | and | or   | FF2 |
| FF1 | buf | (in) | FF2 |
| FF1 | inv | and  | FF2 |
| FF1 | buf | or   | FF2 |

Foundry

Cell design flow

Inputs

Process design kits (PDks):  
DRC & LVS rules, SPICE models,  
Library & user-defined specs

Design steps

Circuit design, layout design,  
characterization

Layout Output

CDL (Circuit description language), GDS II  
LEF, extracted via Nangc (ch)



Cell Height

Library

Supply voltage

User defined

Metal layers

Pin location

Min. gate length

Characterization flow

## → Congestion aware placement

→ run\_placement

→ cd openlane\_routing\_dsl/openlane/  
designs/picodw32a/lms/108-07\_13-87/  
results/placement

→ ls

→ magic (see ~~read picodw32a.placement.dfl~~)

read picodw32a.placement.dfl

# Day 3

### Timing characterization

#### → Timing threshold definitions

|                         |           |
|-------------------------|-----------|
| slew - low - rise - th  | about 20% |
| slew - high - rise - th |           |
| slew - low - fall - th  |           |
| slew - high - fall - th |           |
| in - rise - th          | 80%       |
| in - fall - th          | 50%       |
| out - rise - th         | 80%       |
| out - fall - th         | 50%       |

#### → Propagation delay

$$\text{time}(\text{out\_} * \text{th}) - \text{time}(\text{in\_} * \text{th})$$

#### → Transition time.

$$\text{time}(\text{slew\_high\_} * \text{th}) -$$

$$\text{time}(\text{slew\_low\_} * \text{th}) - \text{time}(\text{slew\_high\_} * \text{th}).$$

#### → output current waveform

### IO place revision

cd .. opaln/design/proj3  
mvn / 17-07-03-S1/results/jar/b

+ magic-T

picrossa\_floorplan.dfl

→ Bid Terminal  
cd .. opaln/  
configuration

→ less floorplanned

copy ::cnv(FP\_IO  
\_MODE)

→ set ::env  
(FP\_IO\_MODE)  
... 2

→ run\_floorplan

→ magic-T

## SPICE deck creation for CMOS inverter

### SPICE deck

- Component connectivity



- Component values



Identify nodes

Notes



\*\*\* MODEL Description \*\*\*  
\*\*\* Netlist description \*\*\*

M1 out in vdd vdd pmos c0=0.375u L=0.25u  
M2 out in 0 0 nmos c0=0.375u L=0.25u

Cload out o 10f

vdd vdd o 2.5  
vin in o 2.5

\*\*\* Simulation Commands \*\*\*

.op  
.dc vin o 2.5 0.05

\*\*\* include tsmc\_025um-model.mod \*\*\*  
.lib "tsmc\_025um-model.mod" CMOS\_MODELS  
.end

Spice waveform:  $c_{wn} \omega_p = 0.375 \mu$ ,  $L_n p = 0.25 \mu$  dena  
 $(c_{wn}/L_n = \omega_p/L_p = 1.5)$



N.W. ( $c_{wn}/L_n = \omega_p/L_p = 2.5$ )  
 $\omega_n = 0.398$   $\omega_p = 0.9375 \mu$   $L_n p = 0.25 \mu$



✓The Good Paper

### 1. Switching Threshold, $V_m$



$V_{dsf} = I_{dsN}$



✓The Good Paper

Static and dynamic simulation of MOS inverters

|             |                                                              |                                  |
|-------------|--------------------------------------------------------------|----------------------------------|
| $w_p/l_p$   | $x \cdot w_n/l_n$                                            | static behavior (Cmos Inverters) |
| $w_n/l_n$   |                                                              | Robustness                       |
| $2 w_n/l_n$ |                                                              |                                  |
| $3 w_n/l_n$ | $V_m = R \cdot V_{dd} / (1 + R)$                             |                                  |
| $4 w_n/l_n$ | $\text{When } R = k_p \cdot V_{dsatp} = k_n \cdot V_{dsath}$ |                                  |
| $5 w_n/l_n$ | $\rightarrow = \frac{(w_p)}{(w_n)} k_p \cdot V_{dsatp}$      |                                  |

→ cd opbase - working dir (pkits) sky130A.tech

→ cp sky130A.tech /home/vishnu /Osktop (osktool) opbase working - dir/opbase / vishnu/celldesign

→ ls -lbu

→ magic -T sky130A.tech sky130\_inv.nug

Rise delay Fall delay → for  $w_p/l_p$   $w_n/l_n$

148 ps

71 ps

$$V_m = 0.99V$$

→ cd Osktop . . . opbase

→ git clone https://github.com/nickson-jose/vishnu/celldesign.git.

→ ls -lbu

→ cd vishnu/celldesign

### Create Active Regions

16 - Mask CMOS process

### Selecting a substrate

p-type, High resistivity (5-50 ohms), doping level ( $10^{15} \text{ cm}^{-3}$ ), thickness (100 μm)

Substrate doping should be less than well doping.

2) Creating active region for transistors



After this remove mask

$\text{Si}_3\text{N}_4$  etch



Remove photoresist  
chemically



→ Placed in oxidation furnace

Bird's beak



Field oxide is grown. This process is also called  
"LOCOS" "Local Oxidation of silicon".

→  $\text{Si}_3\text{N}_4$  stripped using hot phosphoric acid



### 3) N-well and P-well formation.



MY CHOICE  
Date: \_\_\_\_\_  
Page No.: 37



→ Do same for N-well



✓ The Good Paper

MY CHOICE  
Date: \_\_\_\_\_  
Page No.: 38

↓

P type material

↓

Mono Pwell

↓

n type material

↓

Leave thin

↓

Boron

↓

Drive-in

↓

furnace

↓

Drive-in diffusion

↓

Size increase

✓ The Good Paper

#### 4) Gate formation



→ original oxide etched/stripped using dilute Hydrofluoric (HF) solution



→ Then re-grown. Again to give high quality oxide (~60nm)



→ ~0.4um polysilicon layer  
N type (phosphorus activation)  
implanted for low gate resistance



### 5. Lightly doped drain (LDD) formation



### 2) Reason for this

- Hot electron effect
  - Short channel effect
- Electric field =  $E = V/d$   
 → High energy carrier break  
 Si-Si bonds 3.2 eV bandgap  
 b/w Si conduction band  
 SiO<sub>2</sub> conduction band
- drain field penetrates channel



Mask 8



~0.1 μm Si<sub>3</sub>N<sub>4</sub> & SiO<sub>2</sub>



Side-wall spacer

### 6) Source and drain formation





7) Steps to form contacts and interconnects (Local)



→ deposit titanium on wafer surface, using sputtering



→ Wafer heated at about 660-700°C in N<sub>2</sub> ambient for 60s

Result  
= low resistance  
TiSi<sub>2</sub>

Mask 11



→ TiN is etched using RCA cleaning.

TiN Local interconnects • Hydrogen peroxide (H<sub>2</sub>O<sub>2</sub>), 1 part



✓ The Good Paper

### 8) Higher Level metal formation

→ Film of  $\text{SiO}_2$  doped with phosphorous or boron (Chow as phosphosilicate glass or borophosphosilicate glass) deposited on wafer surface



→ chemical mechanical polishing (CMP) technique for planarizing wafer surface



Mask 1<sup>2</sup>



→ ~10nm TiN



→ Blanket Tungsten (W) layer deposition



→ Again Chemical mechanical polishing (CMP) technique for planarizing wafer surface



→ Aluminium (Al) layer deposition



Mask 1<sup>3</sup> ←





for fuel metal

→ SiO<sub>2</sub> deposit & CMP done

→ Mask 14 used to define contact hole

→ TiN deposition

→ Tungsten as contacts

→ Mask 15 to define the pattern



→ Top dielectric (Si<sub>3</sub>N<sub>4</sub>) to protect the chip



→ Final mask 16 to open contact holes on this layer

Fabricate



→ Labs → open the # magic from last lab

S → select

- what

3 time S → 2 times - connectivity

In magic

→ pcmd

→ cd operation / selection

→ extract all

→ ls → -lru

→ ext2spice

→ ls -lru

→ vim sky130\_inv.spice

To Edit  
→ Exe  
→ Copy  
→ Delete

Date: \_\_\_\_\_  
Page No.: 43

change spice

X0 → M1001

X1 → M1000

.option scale 0.014

.include .lib/pshort.lib

.include .lib/nshort.lib

11. subckt sky130\_inv A Y VPWR VGND

M1000 Y 0A VPWR VPWR pshort.model.0 w=37 l=23

+ ad = 1443 pd = 152 ao = 1517 ps = 156

M1001 Y A VGND VGND nshort.model.0 w=38 l=23

+ ad = 1435 pd = 152 ao = 1365 ps = 148

VDD VPWR 0 3.3V

VSS VGND 0 0V

Va A VGND PULSE (0V 3.3V 0 0.1ns 0.1ns 2ns 4ns)

C0 A Y 0.05fF

C1 Y VPWR 0.11fF

C2 A VPWR 0.07fF

C3 Y 0 0.29fF

C4 VPWR 0 0.53fF

11.cs VPWR VGND 0.000017fF

11.end

.tran 1n 2ns

.control

.run

.endc

.end

Date: \_\_\_\_\_  
Page No.: 50

→ cd opulane /usbstdcellsdesign  
→ sudo apt install ngspice fairly easily  
→ ngspice sky130\_inv.spic

→ plot y vs time a  
→ exit

→ vim sky130\_inv.spic

change in C3

C3 Y 0 2fF

exit Vim

→ ngspice sky130\_inv.spic

→ plot y vs time a

rise time

$$X_0 = 2.18189 \times 10^{-9}, y_0 = 0.66$$

$$X_0 = 2.24444 \times 10^{-9}, y_0 = 2.64$$

$$\Delta t_{rise} = (2.24444 - 2.18189) \times 10^{-9} \\ = 0.06255$$

fall time

$$X_0 = 4.04839 \times 10^{-9}, y_0 = 2.6400$$

$$X_0 = 4.09328 \times 10^{-9}, y_0 = 0.66$$

✓ The Good Paper

propagation delay

$$x_0 = 2.14758e-08, y_0 = 1.65$$

$$x_0 = 2.20806e-09, y_0 = 1.65$$

cell fall delay

$$x_0 = 4.04906e-09, y_0 = 1.65$$

$$x_0 = 4.07736e-09, y_0 = 1.65$$

Magic tools & DRC rules



Documentation: <http://opencircuitdesign.com/magic>

Magic DRC: [http://opencircuitdesign.com/magic/Technology\\_files/The\\_Magic\\_Technology\\_Files\\_Manual\\_Drc\\_Section](http://opencircuitdesign.com/magic/Technology_files/The_Magic_Technology_Files_Manual_Drc_Section)

Link to Google Skywater Design Rules: <https://skywater-pdh.readthedocs.io/en/main/Rules/>

For reference: <https://github.com/google/skywater-pdh>

To download: → curl [http://opencircuitdesign.com/open\\_pdh/archive/drc-tests.tgz](http://opencircuitdesign.com/open_pdh/archive/drc-tests.tgz)

To extract the lab files from the downloaded file:  
→ tar xfz drc-tests.tgz

✓ The Good Paper

→ cd drc-tests

→ ls -al { list all the directories }

→ magic -d XL & { Start magic tools with better graphics }

→ vi .magicrc { Content of .magicrc file's }

In magic

→ Click file

→ Click open

→ Click mets.mag

→ Then click open.

Then make any box  
Move to any colour

→ Press P { to choose colour }

→ Press : { to go to tcolor window indirectly }

→ cil see VIA2

→ Load poly.mag

→ (C) Vim Sky130A.tech

Changes → poly.g → all polygons { after making sky & see ss }

✓ The Good Paper

In magic

Read polyway

tech load sky130A.tech

→ open nwell.mag

→ {change in sky130A.tech}

→ tech load sky130A.tech

→ drc. style drc(full).

Make a bolt press 'a'

go to empty area press 'c'  
to copy and past

## # Day 4

magic → sky130A.tech sky130-inv.mag &

In magic

g → grid

Tkcon p → help grid

→ grid 0.46um 0.340um 0.23um 0.17 um.

To create lef file → Test.gedit → change  
some sky130-inv.mag

→ open this

→ modify with

cd openlane working dir/  
pbks/sky130A/4bb04d  
/openlane/sky130A.drc

→ less tracks.info.

✓ The Good Paper

✓ The Good Paper

To copy this lef to picow32a

In lef file directory → vsdstd cell design

→ cp sky130-vsdinv.lef /home/usduser/Desktop/picow32a/tools/openlane/design/picow32a/sdc

→ cd libs

→ cp sky130\_fd-sc\_hd-\* /home/picow32a/sdc

→ cd .. /  
now in openlane directory

→ cd openlane/design/picow32a

→ Vim config.tcl

→ docked → ./flow.tcl -interactive  
→ package require openlane 0.9  
→ prep\_design picow32a -tag  
17-07-09-51 -overwrite

→ change it

→ set lefs [glob \$::env(DESIGN\_DIR)/src/\*\_lef]

→ add\_lef -src \$lefs

→ run\_synth

✓ The Good Paper

## # Delay Table

| $t_1$ | $\text{clk} \rightarrow D \rightarrow Y$ |   | $t_2$ | $\text{clk} \rightarrow Y$ |   |
|-------|------------------------------------------|---|-------|----------------------------|---|
| EN    | clk                                      | Y | EN    | clk                        | Y |
| 0     | 0                                        | 0 | 0     | 0                          | 0 |
| 0     | 1                                        | 0 | 0     | 1                          | 1 |
| 1     | 0                                        | 0 | 1     | 0                          | 1 |
| 1     | 1                                        | 1 | 1     | 1                          | 1 |



Observations.

- 2 levels of buffering
- At every level, each node divides some load
- Identical buffer at same level

## → Delay Table for CBUF '1'

|            | Output load                    |
|------------|--------------------------------|
| 20ps       | 10fF 30fF 50fF 70fF 90fF 110fF |
| slew       | x1 x2 x3 x4 x5 x6              |
| Input load | x7 x8 x9 x10 x11 x12           |
| 60ps       | x13 x14 x15 x16 x17 x18        |
| 80ps       | x19 x20 x21 x22 x23 x24        |

## → Delay Table for CBUF '2'

|      | Output load                    |
|------|--------------------------------|
| 20ps | 10fF 30fF 50fF 70fF 90fF 110fF |
| slew | y1 y2 y3 y4 y5 y6              |
| 60   | y7 y8 y9 y10 y11 y12           |
| 80ps | y13 y14 y15 y16 y17 y18        |
|      | y19 y20 y21 y22 y23 y24        |

Ex: Let's input transition = 40ps  $\rightarrow \text{CBuf}'1$   
exp output capacitance = 80fF

| 40ps | 80fF 100fF |
|------|------------|
|      | x9 x10     |

$$\text{delay} = x9$$

for Cbuf'2

$$\text{output for buf}'2 = 60\text{fF}$$

$$\text{input transition for Cbuf}'2 = 60\text{ps}$$

$$\text{output capacitance for Cbuf}'2 = 80\text{fF}$$

$$\text{delay} = y15$$



Continuing from last book

- cd opbase/configuration
- less README.md.

In openlane terminal.

- echo \${: env(SYNTH\_STRATEGY)}
- set :: env(CSYNTH\_STRATEGY) 1
- set :: env(CSYNTH\_STRATEGY) "DELAY 3"
- echo \${: env(SYNTH\_BUFFERING)}
- echo \${: env(SYNTH\_SIZING)}
- set :: env(SYNTH\_SIZING) 1
- echo \${: env(SYNTH\_DRIVING\_CELL)}
- run\_synth
- run\_flowsolver { gives error}

To do these changes  
seen first of all  
set cell check  
the value of slack  
and chip area and  
note it down  
then restart  
the docker  
and perform all  
these steps and  
then click the  
value of slack  
and area  
slack ↓  
area ↑

→ init\_flowsolver

→ place\_io

→ top-decap-oi

→ run\_placement

These commands are all together  
correct instead of "run\_flowsolver" command  
→ run\_flowsolver doesn't work directly with  
custom cells, so we divide it into 3 sub  
commands

→ cd /opbase/design | 17:57  
03-16  
start placement

→ magic -T  
by read ... /.../top.mgdef  
def read picasso3c9\_placement

→ In magic  
→ repeat

## # Timing Analysis (with ideal clock)

Setup Analysis - Single Clock

Specifications:-  
Clock frequency (F) = 1 GHz  
Clock period (T) = VF  
= 1 ns





Hence finite time 'S' required (before Clk edge) for 'D' to reach Qn i.e. internal delay of Mux1 = Setup time.

## # Jitter



Lab steps to optimize synthesis to reduce setup violation  
configure OpenSTA for post-synth timing analysis

- In cl openlane
- touch pu-sta.conf (Create file)
- add context {as in SST}
- cd openlane/designs/picovision/sdc
- touch my\_base.sdc (Create file)
- add context {as in SST}
- cd openlane
- sta bne\_stal.conf

Lab steps to optimize synthesis to reduce setup violation

- → prep -design picav32 -tag 17-07-09-51 -overwrite
- → set lsf [glob \$::env(CDESIGN\_DS) /src/\*.lsv]
- → add\_lsf -src \$lsf
- → set ::env(SYNTM\_SIZING) 1
- → set ::env(SYNTH\_MAX\_FANOUT) 4
- → echo \$::env(SYNTH\_DRIVING\_CELL)
- → NetSynthesis
- → cd operare
- → sta pr-sta.con
- → report\_net -connections -11672
- → help replace-cell
- → replace\_cell -14510- sky130\_fd\_sc\_hd.sch
- → report\_checks -fields ?net cap slew  
input\_pins 1 - digits 4
- → Report - net - connections - 11672
- → replace\_cell -14510- sky130\_fd\_sc\_hd.sch

### Clock tree Synthesis



$$t_2 - t_1 = \text{skew}$$

Skew should be  $\approx 0$

### H-Tree





**Buffering** - repeater



**Clock net shielding**



MY CHOICE  
Date: \_\_\_\_\_  
Page No.: 63

What can go wrong, if there's a glitch??



MY CHOICE  
Date: \_\_\_\_\_  
Page No.: 69

→ incorrect data in memory will result in inaccurate functionality.

|   |   |   |   |   |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 |
| 0 | 0 | 0 | 0 | 0 |

→ Impact of Crosstalk ~~between~~ Delta Delay - skew



✓ The Good Paper

✓ The Good Paper

→ Lab activities to run CTS using Tutor CTS

In terminal  
when we run sta pre-study

→ write\_verilog /home/xchowdury/Desktop/  
work/tools/openlane-working.dci  
openlane/designs/picos32a/news  
17-07-09-51/results/synthesis/picos  
synthesis.v

In openlane

- init-floplanning
- place-to
- top-decap-dci
- run-placemt

→ run\_cts

→ cd /openlane-working.dci/gpflow/  
scripts/openlane/  
→ open road  
→ read\_def /openLANE-flow/designs/  
picos32a/news/17-07-09-51/  
top/merged.def

→ read\_def /openLANE-flow/designs/  
picos32a/news/17-07-09-51/  
results/cb/picos32a.cts.dly

→ write\_db picos\_cts.db

### Timing analysis (with real clock)

setup analysis - single clock



$\Theta + \Delta_1 \rightarrow$  Launch flop delay

$\Theta + \Delta_2 \rightarrow \Delta_2$

$$|\Delta_2 - \Delta_1| = \Delta_{\text{key}}$$



With  $T = 1\text{ns}$

$$\begin{aligned} \text{Assume } S &= 10\text{ps} \\ &= 0.01\text{ns} \end{aligned}$$

$$\begin{aligned} \text{Uncertainty} &= 90\text{ps} \\ &= 0.09\text{ns} \end{aligned}$$



Data Required  
Data - Data

Actual time

slack (should  
be +ve only)

## Hold Timing Analysis



→ from the detailed analysis of capture flop

→ setup time 'H' required (after clk edge) for 'Qm' to reach Q, i.e.  
internal delay of MUX = hold time



With  $T = 1\text{ns}$   
 $\Delta_{min} = 10\text{ps}$   
 $\geq 0.0ns$   
 $\Delta_{max} = 50\text{ps}$   
 $\geq 0.01\text{ns}$



Data Arrival Time  
Data Reg. to  
slack

✓ The Good Paper



$$\begin{aligned}\Delta_1 = & \text{ Real wire RC delay} \\ & + \text{BUF delay} \\ & + \text{Real edge RC delay} \\ & + \text{BUF delay} \\ & + \text{Real } - - - 3 \\ & + \text{BUF delay} \\ & + \text{Real } - - - 9 \\ & + \text{BUF delay} \\ & + \text{Real } - - - 5\end{aligned}$$

$$\begin{aligned}\Delta_2 = & \text{ Real wire RC delay} \\ & + \text{BUF delay} \\ & + \text{Real } - - - 2 \\ & + \text{BUF delay} \\ & + \text{Real } - - - 8 \\ & + \text{BUF delay} \\ & + \text{Real } - - - 4 \\ & + \text{BUF delay} \\ & + \text{Real } - - - 5\end{aligned}$$

→ Lab setup to analyze timing with real clock using OpenSTA

After write db

→ Read\_db bicoscts.db

→ Read\_verilog /openLANE flow/designs/picorv32a/libs/17-07-09-51/ results/synthesis/picorv32a.synthesis\_cts.v

→ Read.liberty \$::env(LIB\_SYNTH\_COMPLETE)

✓ The Good Paper

MY CHOICE  
 Date: \_\_\_\_\_  
 Page No.: 65

→ link\_design picov32a  
 → Read\_sdc /openLANE\_flow/designs/picov32a/sig/my-base.sdc  
 → set\_propagated\_clock (all\_clocks)  
 → Report\_checks -path\_delay min\_max -fields ? slow slow net cap input\_pins ? -joint full\_clock\_expanded -digits 4  
 → exit !exit openroad  
 → echo \$::env(CTS\_CLK\_BUFFER\_LIST)  
 → set\_environ(CTS\_CLK\_BUFFER\_LIST) [list replace \$::env(CTS\_CLK\_BUFFER\_LIST) o o ]  
 → echo \$::env(CTS\_CLK\_BUFFER\_LIST)  
 → dho \$::env(CURRENT\_DEF)  
 → set ::env(CURRENT\_DEF) /openLANE\_flow/designs/picov32a/mru/17-07-09-8/results/placement/picov32a.placement.def  
 → Mun\_cts  
 → echo \$::env(CTS\_CLK\_BUFFER\_LIST)  
 → openroad  
 → Read\_lsf /openLANE / 17-07-09-8/tfb/merged

✓ The Good Paper

MY CHOICE  
 Date: \_\_\_\_\_  
 Page No.: 70

→ Read\_lsf /openLANE\_flow / 17-07-09-8/results/cts1/picov32a.cts.def  
 → exit\_db picos\_cts1.db  
 → Read\_db picos\_cts1.db  
 → Read\_verilog /openlane / 17-07-09-8/results/synthesis/picov32a.synthesis.clr.v  
 → Read\_Liberty \$::env(LIB\_SYNTH\_COMPLETE)  
 → link\_design picov32a  
 → Report\_clock\_skew -hold  
 → Report\_clock\_skew -setup  
 → exit!

# Day 5

→ Introduction to Maze Routing — Lee's algorithm.  
 Routing :- Connecting 2 point , best path , but possible solution

✓ The Good Paper



option 2

not preferred



option 2

not preferred  
more no. of bends

→ wire width



wire width

2) wire pitch



wire pitch

3) wire spacing



wire spacing



DRC violation  
Type: Single Sot

→ complete Routing of our example is in Screenshots

## # DRC Clear (Design Rule Check)



→ 3 Typical design Rules for the above pair of wires

✓ The Good Paper



Metal layer: M<sub>h</sub>

Metal layer: M<sub>h+1</sub>

No signal slot

2 new design Rules to be checked

↳ via distance



via  
spacing

→ After routing & DRC next step is Parasitic extraction  
→ Resistance and capacitance present on every wire should be extracted  
and used for further process (shown in SS) ✓ The Good Paper

# Lab steps to build power distribution network

→ docked  
→ ./flowtel -interactive

→ package require opencore 0.9  
→ prep -design picow32a -tag 17-07-09-s1

→ echo \$::env(CURRENT\_DEF)  
→ gen\_pdn

# Lab steps from power shps to std cell power.

→ check the ss

# Basic of global and detail routing and configure TritonRoute

→ echo \$::env(CURRENT\_DEF)  
→ echo \$::env(GLOBAL\_ROUTER)  
→ echo \$::env(DETAILED\_ROUTER)  
→ run-routing



### # Triton features ::

- Pre-processed route guides
  - perform initial detailed route
  - Honors the preprocessed route guides (obtained after fast route).  
i.e., attempts as much as possible to route within route guides
  - Assumes route guides for each net satisfy inter-guide connectivity
  - Work on proposed MILP-based panel routing scheme with intra-layer parallel and inter-layer sequential routing framework
- Requirements of preprocessed guides:
- Should have unit width
  - Should be in the preferred direction.

#### Preprocessing :

- a) initial route guides
- b) splitting
- c) merging
- d) bringing

c) preprocessed guides

→ The preferred direction for M1 is vertical, and horizontal for M2.

✓ The Good Paper

## Inter guide Connectivity

- Two guides are connected if
  - They are on the same metal layer with touching edge
  - They are on neighbouring metal layers with a non-zero vertically overlapped area
- Each unconnected terminal (i.e., pin of a standard-cell instance) should have its pin shape overlapped by a route guide

Intra-layer parallel and inter-layer sequential panel routing

## Triton Route

- Problem statement

- Inputs:- LEF, DEF, Preprocessed Route guides
- Output:- Detailed Routing Solution with optimized wire-length
- Constraints:- Route guide honoring, connectivity, and via count constraints and design rules

## Handling Connectivity

- Accesspoint (AP) :- An on-grid point on the metal layer of the route guide, and is used to connect to lower-layer segments, upper-layer segments, pins or IO ports.

→ Access point Cluster (APC) :- A union of all access points derived from some lower-layer segments, upper-layer guide, a pin or an IO port.

## Routing Topology Algorithm

Algorithm 1 :- optimization of Routing Topology

```

for all i = 1 to n-1 do
  for all j = i+1 to n do
    costi,j ← dist (APCi, APCj)
  end for
end for
T ← MST (APCs, COSTs)
Return ei,j ∈ T
  
```

→ cd openlane

```

→ magic -T /Desktop/cwl/tools/openlane_working_dir/
  /home/vshetu/Desktop/work/tools/openlane_working_dir/openlane
  /designs/picow32a/runs/17-07-09-51/tmp/maged.lyf.read
  /home/vshetu/Desktop/work/tools/openlane_working_dir/openlane
  /designs/picow32a/runs/17-07-09-51/results/routing/picow32a.lyf
  
```