

# **ASAP7 Predictive Design Kit Development and Cell Design Technology Co-optimization**

**Vinay Vashishtha  
Manoj Vangala**

**Lawrence T. Clark**

**School of Electrical, Computer and Energy Engineering  
Arizona State University**

**{vinay.vashishta,lawrence.clark,manoj.vangala}@asu.edu**

# Outline

- **Motivation**
- **PDK overview**
- **Cell library architecture**
- **Cell library details**
- **Place and route usage**
- **Summary**

# Motivation

- **Academia has lacked process design kits (PDK), cell libraries, and design flows for advanced technology nodes**
- **ASAP7: A finFET based 7 nm (N7) predictive PDK for academic use**
  - Developed by ASU in 2015-2016 with ARM Research
  - Long lived: N7 was not yet shipping
    - Foundry agnostic—fully predictive, so no issues with foundries
  - Realistic design rules
    - Special SRAM array rules
  - Transistor models with temperature and corner behavior
  - Full physical verification (DRC, LVS, Parasitic Extraction)
  - Standard Cell Library
    - Collaterals support widely used commercial Cadence CAD tools

# Electrical Scaling Assumptions

- **Models consistent with scaling trends, ITRS**
- **Four  $V_t$**
- **Three corners (TT, SS, FF)**
- **SRAM device → no LDD**
  - Longer channel, low leakage
- **0.7 V nominal  $V_{DD}$**



| Parameter                    | SRAM | RVT  | LVT  | SLVT |
|------------------------------|------|------|------|------|
| $I_{dsat}$ ( $\mu\text{A}$ ) | 1058 | 1402 | 1674 | 1881 |
| $I_{off}$ (nA)               | 0.1  | 1    | 10   | 100  |
| $V_{tsat}$ (V)               | 0.25 | 0.17 | 0.10 | 0.04 |
| $V_{tlin}$ (V)               | 0.27 | 0.19 | 0.12 | 0.06 |
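The trade-off between drive and leakage across the four threshold flavors can be read off the table as an on/off current ratio. The following is a minimal sketch that only re-uses the tabulated numbers; the helper name `on_off_ratio` and the dictionary layout are illustrative assumptions, not part of the PDK.

```python
# Values copied from the table above (Idsat in uA, Ioff in nA).
DEVICES = {
    "SRAM": {"idsat_uA": 1058, "ioff_nA": 0.1},
    "RVT": {"idsat_uA": 1402, "ioff_nA": 1},
    "LVT": {"idsat_uA": 1674, "ioff_nA": 10},
    "SLVT": {"idsat_uA": 1881, "ioff_nA": 100},
}


def on_off_ratio(idsat_uA, ioff_nA):
    """Return Idsat/Ioff as a dimensionless ratio (hypothetical helper)."""
    # 1 uA = 1000 nA, so convert Idsat to nA before dividing.
    return (idsat_uA * 1000.0) / ioff_nA


if __name__ == "__main__":
    for name, dev in DEVICES.items():
        print(f"{name:5s} Ion/Ioff = {on_off_ratio(**dev):.3e}")
```

Each step toward a lower $V_t$ raises $I_{off}$ by 10× while $I_{dsat}$ grows by well under 2×, so the ratio drops by roughly an order of magnitude per flavor.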

# Electrical Scaling Assumptions

- **Better DIBL, near ideal SS with FinFET**
- **54 nm CPP and 21 nm  $L_g$** 
  - **Enable low SS and DIBL assumptions**
  - **Aggressive scaling can cause poor SS and DIBL**
- **N:P ratio  $\approx 1:0.9$** 
  - **Some literature shows  $I_{DSAT}(P) > I_{DSAT}(N)$**  [S. Yang et al., Symp. VLSI, 2017]



NMOS typical corner parameters (per  $\mu\text{m}$ ) at  $25^\circ\text{C}$

| Parameter      | SRAM  | RVT   | LVT   | SLVT  |
|----------------|-------|-------|-------|-------|
| SS (mV/decade) | 62.44 | 63.03 | 62.90 | 63.33 |
| DIBL (mV/V)    | 19.23 | 21.31 | 22.32 | 22.55 |

# P:N Ratio

- Optimal fan-up at each inversion

- FO4 ( $2 \times I_{eff\_PMOS} \approx I_{eff\_NMOS}$ )
- FO6 ( $I_{eff\_PMOS} \approx I_{eff\_NMOS}$ )



Balanced: No need for separate balanced P:N clock cells

NAND  $\approx$  NOR

NOR better in future?  
Equal is electrically optimal  
Stay there regardless?

# Lithography Assumptions



[G. Dicker et al., Proc. SPIE, 2015]

- **EUV lithography for critical layers**
  - $\text{Pitch}_{\text{EUV}} = 2 \times k_1 \frac{\lambda}{NA} = 2 \times 0.4 \left( \frac{13.5 \text{ nm}}{0.33} \right) \approx 32 \text{ nm}$ → 36 nm for bi-directional (2-D) M1 routing
  - Matches subsequent foundry demonstration [R-H. Kim, SPIE, 2016]
  - Conventional 2-D M1 standard cell layouts → Easier classroom use
- **Multi-patterning assumption for non-EUV layers**
  - Self-aligned quadruple patterning (SAQP), self-aligned double patterning (SADP)
  - Litho-etch litho-etch (LELE)
- **193i/ArFi single exposure pitch ≈ 80 nm**
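The pitch arithmetic on this slide can be checked with a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the multi-patterning split factors (2 for SADP and LELE, 4 for SAQP) are the idealized pitch divisions, not values quoted in the deck.

```python
def min_pitch_nm(k1, wavelength_nm, numerical_aperture):
    """Single-exposure pitch from the Rayleigh criterion: 2 * k1 * lambda / NA."""
    return 2.0 * k1 * wavelength_nm / numerical_aperture


# Idealized pitch division per scheme (assumption, not from the deck).
PITCH_SPLIT = {"LELE": 2, "SADP": 2, "SAQP": 4}


def multipatterned_pitch_nm(single_exposure_pitch_nm, scheme):
    """Best-case final pitch when one exposure is split by the given scheme."""
    return single_exposure_pitch_nm / PITCH_SPLIT[scheme]


if __name__ == "__main__":
    euv = min_pitch_nm(k1=0.4, wavelength_nm=13.5, numerical_aperture=0.33)
    print(f"EUV single-exposure pitch: {euv:.1f} nm")
    for scheme in PITCH_SPLIT:
        pitch = multipatterned_pitch_nm(80.0, scheme)
        print(f"193i + {scheme}: {pitch:.0f} nm")
```

The formula evaluates to slightly under 33 nm, which the deck rounds to a 32 nm pitch before relaxing it to 36 nm for 2-D M1 routing.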

# Cell Level Design Technology Co-optimization

- Photolithography choice affects cost, variability, and design complexity
- 111 6-T SRAM cell (one fin each for the pull-up, pass-gate, and pull-down devices)
- Layout and DRC rules required extensive DTCO
  - Avoid TDDB between middle of line (MOL) metals accounting for CDU and misalignment



# Fin Scaling Assumptions

- **Pitch scaling**
  - $0.8 \times \rightarrow 27 \text{ nm}$
- **Thickness reduction**
  - **0.5 nm/node since N22  $\rightarrow$  6.5 nm (7 nm drawn)**
- **SAQP**



# Gate Scaling Assumptions

- **Pitch scaling**
  - N14-N10 → 0.85×
  - N10-N7 → 0.9×
- **Gate length ( $L_g$ )**
  - 3 nm and 2 nm reduction since N14 → 21 nm (20 nm drawn)
- **SADP**
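The gate scaling factors can be inverted to see which starting points they imply. The sketch below back-computes from the N7 values given in the deck (54 nm CPP, 21 nm $L_g$); it reads "3 nm and 2 nm reduction since N14" as 3 nm for N14→N10 and 2 nm for N10→N7, which is an interpretation, and the helper names are hypothetical.

```python
N7_CPP_NM = 54.0  # contacted poly pitch stated in the deck
N7_LG_NM = 21.0   # gate length stated in the deck

# Steps listed oldest first, as on the slide.
PITCH_FACTORS = [("N14->N10", 0.85), ("N10->N7", 0.9)]
LG_REDUCTIONS_NM = [("N14->N10", 3.0), ("N10->N7", 2.0)]  # assumed ordering


def back_compute_scaled(final_value, factors):
    """Undo multiplicative steps; returns values oldest node first."""
    values = [final_value]
    for _, factor in reversed(factors):
        values.append(values[-1] / factor)
    return list(reversed(values))


def back_compute_reduced(final_value, reductions):
    """Undo additive reductions; returns values oldest node first."""
    values = [final_value]
    for _, delta in reversed(reductions):
        values.append(values[-1] + delta)
    return list(reversed(values))


if __name__ == "__main__":
    nodes = ["N14", "N10", "N7"]
    cpp = back_compute_scaled(N7_CPP_NM, PITCH_FACTORS)
    lg = back_compute_reduced(N7_LG_NM, LG_REDUCTIONS_NM)
    for node, pitch, length in zip(nodes, cpp, lg):
        print(f"{node}: implied CPP {pitch:.1f} nm, implied Lg {length:.0f} nm")
```

The implied N14 values are outputs of this inversion only and should not be read as foundry data.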



# M<sub>x</sub> Patterning Assumptions

- **M<sub>x</sub> (1× metal) layers**
  - M1-M3
- **Pitch scaling**
  - 0.7× since N16/14 → 32 nm pitch
  - SAQP or EUVL?
    - SAQP → costly and complex
    - EUVL assumption
- **Difficult 2-D routing at 32 nm pitch**
  - M<sub>x</sub> pitch relaxed to 36 nm
  - Other metal layer (1.5×, 2×, and 2.5×) pitch values are relative to 32 nm pitch
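Because the wider metal classes are defined relative to the original 32 nm pitch rather than the relaxed 36 nm one, their pitches follow directly from the multipliers. A minimal sketch, with hypothetical names:

```python
BASE_PITCH_NM = 32        # reference pitch for the multipliers
RELAXED_MX_PITCH_NM = 36  # pitch actually used for M1-M3
MULTIPLIERS = [1.5, 2.0, 2.5]


def layer_pitches(base_pitch_nm, multipliers):
    """Map each pitch multiplier to its resulting metal pitch in nm."""
    return {m: base_pitch_nm * m for m in multipliers}


if __name__ == "__main__":
    print(f"1x (relaxed): {RELAXED_MX_PITCH_NM} nm")
    for mult, pitch in layer_pitches(BASE_PITCH_NM, MULTIPLIERS).items():
        print(f"{mult}x: {pitch:.0f} nm")
```

The 1.5× and 2× results match the 48 nm and 64 nm pitch metals for which the SADP design rules are developed.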



# Gear Ratio and Cell Height

- **Standard cell height selection is application specific**
  - Related to fins/gate, i.e. drive strength
- **Gear ratio: fin-to-metal pitch ratio**
  - Cell height needs to be integer # of fins and (mostly) an integer # of metals accessing the cell pins (e.g. M2)



- **12 fin pitches, 9 M2 tracks**
  - Easy intra-cell routing, rich library
  - Wasteful for density
- **10 fin pitches, 7.5 M2 tracks**
  - Rich library without overly difficult routing or poor density
- **8 fin pitches, 6 M2 tracks**
  - Difficult intra-cell routing, diminished library richness
  - Limited pin access

# FEOL and MOL Cross Sections

- **Source-drain trench (SDT)**
  - Connects raised source-drain (SD) to MOL
  - Self-aligned to gate spacers
- **LISD**
  - Connects SD to M1 thru V0
- **LIG**
  - Connects Gate to M1 thru V0



# Standard Cell Architecture and Cross-section

- **Cell architecture**
  - **7.5 M2 track height**
    - Provides good gear ratio with fin, poly, and M2 pitch
  - **Adjacent NAND3 and inverter FEOL and MOL show the double diffusion break (DDB)**
  - **Drawing is not WYSIWYG: the fins extend to  $\frac{1}{2}$  the gate horizontally past drawn active**
- **DDB needed since the 32 nm node, depending on foundry**
  - **Design rules check for connectivity**



# Standard Cell M1 Template

- **M1 template enables rapid cell library development**
  - Larger M1 spacing at the center
    - Better pin access through M1 extension past M2 tracks
- **C-shaped M1 pins**
  - Avoid large tip-to-side design rules
  - Maximize pin access
  - No longer necessary on all pins



# Standard Cells: Latch



- This demonstrates a crossover
  - Note single diffusion breaks (SDBs)
  - Horizontal M2 can only support limited tracks
- Intel, Samsung support SDBs (no DDBs) at N10/N7 [EETimes]

# Self-aligned Via Merging

- **Via merging is very helpful in standard cells at V0**
  - Maximizes access to I/O pins
  - Allows adjacent vias in routing



# Cell Architecture Impact on Library Richness

| Family            | 3um  | 130nm | 90nm | 65nm   | 40nm | 20nm |
|-------------------|------|-------|------|--------|------|------|
| Lib size (approx) | <100 | 2000  | 5000 | 10000+ | 6000 | 100? |

[C. Bittlestone, et al.,  
IEDM short course 2010]

- **Cell height limits the available cells**
  - **Horizontal Mx can only support limited tracks**
    - Power rails use one track
    - 1-2 needed for gate contacts
    - 1-2 for output node
  - **7 (or 7.5) track has 6 internal tracks, 6 track has 5**
- **Most efficient cells fit in 7.5 track cells**
  - **All 3 stack except NAND/NOR**
    - **NAND/NOR up to 5 stack**
    - **No diffusion breaks**
- **~190 cells per V<sub>t</sub> with drive differences**



# Fin Cut Implications and Dummy Poly



- **Fin block/cut mask can create sharp edges**
  - High charge density/electric field → Severe for TDDB
- **Cutting the dummy poly avoids shorts in DDBs**
  - Improves LIG routing

# APR Collaterals

- **Cadence Innovus collaterals developed at ASU**
- **Cell library includes GDS, LEF, LIB, QRC techfile, CDL**
  - All collateral scaled by 4× to use standard academic licensing
  - 7×7 LIB look-up tables centered at FO6 capacitance/slew rates
  - LIBs for SS, TT, FF corners at 0.63, 0.7, and 0.77 V, respectively
  - Separate library for each of the four  $V_t$
- **Synopsys ICC collaterals developed at Harvey Mudd**
  - Not included as part of the library as yet
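The corner voltages sit 10% either side of the 0.7 V nominal supply. The sketch below enumerates the timing views this implies, assuming every $V_t$ flavor is characterized at all three corners (the deck lists the corners and the per-$V_t$ split but does not spell out the cross product); the function names are hypothetical.

```python
from itertools import product

NOMINAL_VDD = 0.7
CORNER_VDD = {"SS": 0.63, "TT": 0.70, "FF": 0.77}
VT_FLAVORS = ["SRAM", "RVT", "LVT", "SLVT"]


def vdd_offset_percent(vdd):
    """Supply offset from nominal, in percent, rounded to one decimal."""
    return round(100.0 * (vdd - NOMINAL_VDD) / NOMINAL_VDD, 1)


def library_views():
    """Assumed cross product of Vt flavors and corners: (vt, corner, vdd)."""
    return [(vt, corner, vdd)
            for vt, (corner, vdd) in product(VT_FLAVORS, CORNER_VDD.items())]


if __name__ == "__main__":
    for vt, corner, vdd in library_views():
        print(f"{vt:5s} {corner} {vdd:.2f} V ({vdd_offset_percent(vdd):+.1f}%)")
```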

# Cell Library Description



- **Combinational logic cells, scan and non-scan flip-flops, latches, and integrated clock gaters**
- **Inverter and buffer strength up to 13 $\times$  and 24 $\times$ , respectively**
- **Inefficient AOI, OAI, AO, OA layouts excluded**
- **Drive to area optimized instead of balanced rise/fall times**
  - But cells for clock tree synthesis must be carefully selected

# SADP Design Rule Development



- **Color agnostic SADP design rules for 48 nm/64 nm pitch metals**
- **Restrictive design rules for correct-by-construction topologies**
  - Validated by developing color and mask decomposition Calibre decks

# Scaled LEF and QRC Techfile

- **Special APR tool license required for sub-20 nm dimensions**
- **Workaround:**
  - Use 4× scaled LEFs and QRC techfile (calibrated to Calibre PEX) during APR
  - Scale back the design to original dimensions when importing into OA environment



97.8% correlation (capacitance)  
99.1% correlation (resistance)
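The scaling workaround amounts to multiplying every coordinate by 4 before APR and dividing by 4 when the routed design is imported back. The sketch below illustrates that round trip on rectangles; it assumes integer nanometre coordinates, and `scale_rect` and `unscale_rect` are hypothetical helpers rather than part of any tool flow.

```python
SCALE = 4  # scale factor applied to the LEFs and QRC techfile


def scale_rect(rect_nm, factor=SCALE):
    """Scale an (x0, y0, x1, y1) rectangle up for use during APR."""
    return tuple(coord * factor for coord in rect_nm)


def unscale_rect(rect_scaled, factor=SCALE):
    """Scale a routed rectangle back to its original dimensions."""
    if any(coord % factor for coord in rect_scaled):
        # A coordinate off the scaled grid cannot map back to an integer nm.
        raise ValueError(f"{rect_scaled} is not on the {factor}x grid")
    return tuple(coord // factor for coord in rect_scaled)


if __name__ == "__main__":
    wire = (0, 36, 270, 54)  # illustrative shape, integer nm
    scaled = scale_rect(wire)
    print("scaled for APR:", scaled)
    print("restored:      ", unscale_rect(scaled))
```

The divisibility check is the design point: anything the router places off the 4× grid would not survive the trip back, so it is better to fail loudly than to round.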

# APR Study (Small Block)

- **Level 2 cache error detection and correction (EDAC) block providing Hamming ECC for a 128-bit memory word**
  - APR flow debug vehicle
- **Validated on single, mixed  $V_{th}$  flows, multi-corner optimization**
- **22  $\mu\text{m} \times 22 \mu\text{m}$**
- **535 top-level IO pins**
- **~4k cell instances; ~90% cell area utilization achieved**
- **>5 GHz  $f_{\text{clk}}$** 
  - 6 GHz with useful skew (TT, 25 °C)
    - SLVT cell usage dominates
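For a sense of the EDAC block's overhead, the number of Hamming check bits for a 128-bit word follows from the standard bound $2^r \geq m + r + 1$. This sketch only evaluates that bound; whether the block adds an extra overall parity bit for double-error detection is not stated in the deck, so that variant is shown as an assumption.

```python
def hamming_check_bits(data_bits):
    """Smallest r with 2**r >= data_bits + r + 1 (single-error correction)."""
    r = 0
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r


def secded_check_bits(data_bits):
    """Check bits with one extra overall parity bit (assumed SEC-DED variant)."""
    return hamming_check_bits(data_bits) + 1


if __name__ == "__main__":
    word = 128
    print(f"SEC check bits for {word}-bit word:     {hamming_check_bits(word)}")
    print(f"SEC-DED check bits for {word}-bit word: {secded_check_bits(word)}")
```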



# APR Study (Large Block)

- **Triple modular redundant advanced encryption standard (AES) engine with fully unrolled 14 stage pipelines**
- **1596 top-level IO pins**
- **Three independent clock domains**
- **250 μm × 250 μm**
- **~350k cell instances**
- **$T_{clk} = 1 \text{ ns (SS)}$**
- **$T_{clk} = 520 \text{ ps (TT, } 25^\circ \text{ C)}$** 
  - **38% SLVT cells**
  - **24% SRAM  $V_{th}$  cells for low leakage on non-critical paths**



# Memory Array

- **8kB array shown here with 128 cells per bit-line (BL)**
  - 64-bit words, 84.2% array efficiency
  - Control logic APRed using cell library
  - Custom decoder at SRAM pitch for high density
  - Suitable for circuit and architectural level studies
- **Memory release pending**



[Vashishtha, et al., Proc. ISCAS, 2017]

# APR Study (Microprocessor)

- **MIPS M14k**
  - To test SRAM integration
- **215  $\mu\text{m} \times 80 \mu\text{m}$ ; ~50k cell instances; ~1GHz  $f_{\text{clk}}$**



# Lines and Cuts BEOL Electrical Impact

- Dummies inserted post-APR using Calibre DRC flows
- PEX run on the pre- and post-fill layouts; timing analysis using PrimeTime
  - 375k cells, 72.3% area utilization, 6 metals @ 36 nm pitch
  - Cuts not aligned
    - So results are slightly optimistic (no added stubs on routes)



Net (only) capacitance increases 2× to 3×



[Vashishtha, et al., Proc. SPIE DTCO, 2017]

# SAV in Routing and Power

- **SAVs are same width as upper metal**
  - Rectangular, rather than square vias due to dissimilar consecutive layer widths
- **Wide vias are specified in the technology LEF for APR**
- **Power rail outer edges coincident with signal on the outer tracks**
  - Should also respect SADP coloring scheme to prevent odd-cycle conflicts
    - Power rail widths can only be 3, 5, 7, or 9 tracks



# ASAP7: Standard Cell Metals: 1-D Assumptions

- Cells are really not that different for 1-D
  - We convert between styles for experiments



- 7.5 track cell height
- 3 fins for NMOS/PMOS

- 6-track 1-D horizontal M1



- 2 fins for NMOS/PMOS
- Latch uses all M1 tracks
- M1 tracks left for routing use
  - All filled for lines/cuts metallization scheme

# ASAP7 FinFET Device Simulation

- **Done after SPICE model development**
  - **Good correlation between electrical performance results and assumptions**
  - **Sentaurus device editor used for simulations**

| Region                                      | Dimension |
|---------------------------------------------|-----------|
| $H_{\text{fin}}$                            | 32 nm     |
| $T_{\text{fin}}$                            | 6.5 nm    |
| Gate height from oxide                      | 44 nm     |
| $L_{\text{gate}}$                           | 21 nm     |
| $L_{\text{effective}}$                      | 19 nm     |
| Spacer width                                | 8 nm      |
| Spacer height                               | 44 nm     |
| Source/Drain width                          | 15 nm     |
| Width of oxide (HfO<sub>2</sub>) around fin | 1 nm      |
| Width of oxide (SiO<sub>2</sub>) around fin | 0.6 nm    |



# ASAP5 Nanowire Device Simulation

- Transistor models based on device simulations
- Calibrated to ASAP7 FinFETs



# ASAP7 PDK Use in Courses

- **Early testing in the fall 2015-2017 EEE625 Advanced VLSI course**
  - Students here contributed to memory designs
  - 6-T, 8-T, 10-T cell based embedded memories have been developed
- **Used for the EEE525 VLSI courses since 2016**
- **We are interested in knowing if you are using it in your course**

# Design Rule Manual

- Design rules fully documented with PDK
  - Includes examples of allowed and not allowed structures

| Rule     | Rule Type | Description                                                                                                                                                                                                                | Operator | Values | Units |
|----------|-----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|--------|-------|
| M4.W.1   | Width     | Minimum vertical width of M4                                                                                                                                                                                               | $\geq$   | 24     | nm    |
| M4.W.2   | Width     | Maximum vertical width of M4                                                                                                                                                                                               | $\leq$   | 480    | nm    |
| M4.W.3   | Width     | M4 vertical width may not be an even integer multiple of its minimum width.                                                                                                                                                | -        | -      | -     |
| M4.W.4   | Width     | M4 vertical width, resulting in the polygon spanning an even number of minimum width routing tracks vertically, is not allowed.                                                                                            | -        | -      | -     |
| M4.W.5   | Width     | Minimum horizontal width of M4                                                                                                                                                                                             | $\geq$   | 44     | nm    |
| M4.S.1   | Spacing   | Minimum vertical spacing between two M4 layer polygons' edges, regardless of the edge lengths and mask colors                                                                                                              | $\geq$   | 24     | nm    |
| M4.S.2   | Spacing   | Minimum horizontal spacing between two M4 layer polygons' edges, regardless of the edge lengths and mask colors                                                                                                            | $\geq$   | 40     | nm    |
| M4.S.3   | Spacing   | Minimum tip-to-tip spacing between two M4 layer polygons—that do not share a parallel run length—on adjacent tracks                                                                                                        | $\geq$   | 40     | nm    |
| M4.S.4   | Spacing   | Minimum tip-to-tip spacing between two M4 layer polygons—that share a parallel run length—on adjacent tracks                                                                                                               | $\geq$   | 40     | nm    |
| M4.S.5   | Spacing   | Minimum parallel run length of two M4 layer polygons on adjacent tracks                                                                                                                                                    | $\geq$   | 44     | nm    |
| M4.AUX.1 | Auxiliary | M4 horizontal edges must be at a grid of                                                                                                                                                                                   | $\equiv$ | 24     | nm    |
| M4.AUX.2 | Auxiliary | Minimum width M4 tracks must lie along the horizontal routing tracks. These tracks are located at a spacing equal to: $2N \times \text{minimum metal width} + \text{offset}$ from the origin, where $N \in \mathbb{Z}^+$ . | -        | -      | -     |
| M4.AUX.3 | Auxiliary | M4 may not bend.                                                                                                                                                                                                           | -        | -      | -     |
| M4.AUX.4 | Auxiliary | Outside edge of a wide M4 layer polygon may not touch a routing track edge.                                                                                                                                                | -        | -      | -     |
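The width rules in the table lend themselves to a simple checker. The sketch below covers only M4.W.1 through M4.W.3 for integer nanometre widths; it is an illustration of how the rules combine, not the Calibre deck shipped with the PDK, and the function name is hypothetical.

```python
M4_MIN_WIDTH_NM = 24   # M4.W.1
M4_MAX_WIDTH_NM = 480  # M4.W.2


def check_m4_vertical_width(width_nm):
    """Return the list of violated rule names for an M4 vertical width."""
    violations = []
    if width_nm < M4_MIN_WIDTH_NM:
        violations.append("M4.W.1")
    if width_nm > M4_MAX_WIDTH_NM:
        violations.append("M4.W.2")
    # M4.W.3: the width may not be an even integer multiple of the minimum.
    is_multiple = width_nm % M4_MIN_WIDTH_NM == 0
    if is_multiple and (width_nm // M4_MIN_WIDTH_NM) % 2 == 0:
        violations.append("M4.W.3")
    return violations


if __name__ == "__main__":
    for width in (12, 24, 48, 72, 120, 504):
        result = check_m4_vertical_width(width)
        print(f"{width:4d} nm: {'OK' if not result else ', '.join(result)}")
```

With the minimum spacing equal to the minimum width (M4.S.1), a wide wire that swallows neighbouring tracks and the gaps between them ends up at an odd multiple of 24 nm, which is why even multiples are flagged.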



# Other CAD Tool Support

- Cadence Virtuoso
  - Schematic and layout
- SPICE models (BSIM-CMG) from netlister
- Mentor Calibre DRC, LVS, PEX (xACT3D)



# Download page

- See:
  - <http://asap.asu.edu/asap>
- Downloaded by over 75 different universities so far
- Latest release
  - New better library
    - ~50 cells improved
    - ~70 cells added
  - TechLEF
    - Almost no DRCs at >80% utilization
  - Sample Innovus .tcl
  - xACT3D extraction
  - Minor DRC changes



# Summary

- **ASAP7 PDK and 7.5-track cell libraries for N7**
  - Realistic assumptions for N7
- **Libraries allow credible APR for research/coursework**
  - Full Cadence Innovus APR collateral for routing and power distribution
  - Workaround for routing at advanced geometry with academic license described
- **Features to reduce cell size, parasitics, leakage, and address reliability described**

# Acknowledgment

- **Thanks to:**
  - **Anant Mithal, Nanda Kishore Babu Vemula, Chandrasekaran Ramamurthy, Parshant Rana, Sai Chaitanya Jakkireddy, Shivangi Mittal, Lovish Masand, Ankita Dosi, Parv Sharma (ASU)**
  - **Other students in the spring 2015 special topics class (ASU)**
  - **Saurabh Sinha, Lucian Shifren, Brian Cline, Greg Yeric (ARM)**
  - **Tarek Ramadan (Mentor Graphics)**
- **for contributions to this effort**

**THANK YOU!**

**Questions?**