

# **Variation in 45nm and Implications for 32nm and Beyond**

**Kelin J. Kuhn**

**Intel Fellow**

**Director of Advanced Device Technology**



# **AGENDA**

## **Technology scaling**

- **I. Physical Variation Sources and Mitigation**
- **II. Measurements, results and interpretation**
- **III. Next generation challenges**

## **Closing thoughts**

# Technology Scaling



# Lithography Scaling Limitations

## From Broers [1] IEDM Plenary Session 1980

In the limit, microscope objectives with 0.95 N.A. are available and, provided very small fields ( $200\mu \times 200\mu$ ) are adequate, linewidths  $< 0.4\mu$  should be achievable under carefully controlled laboratory conditions, and in very thin resist layers.

Depth of field will be reduced to about  $\pm 0.2\mu$ . Deep U.V. ( $\lambda = 200\text{nm} - 260\text{nm}$ ) lenses will be difficult to build because of the lack of materials that are transparent at these wavelengths and yet have relatively high refractive indices.

1980:  
Optical Lithography Limit  
 $\sim 400\text{nm}$



# Transistor Scaling Limitations

## From Meindl [2] IEDM Plenary Session 1983

For conservative design margins, typical results suggest that IGFET channel lengths can be reduced to approximately 0.40 microns in E/D NMOS logic gates; 0.30 microns in CMOS transmission gates; and 0.20 microns in E/E CMOS logic gates. Smaller channel lengths can be projected for more aggressive designs. The dominant mechanism imposing these limits is subthreshold drain current due to short channel charge sharing and drain induced barrier lowering.

1983:  
Transistor architecture limit  
200-400nm (SCE)



# Transistor Scaling Limitations

## From Heilmeier [4] IEDM Plenary Session 1984

other factors limiting the scaling of ICs come into play. Some of these factors are interconnect capacitance, channel capacitance, interconnect resistance, parasitic resistances, velocity saturation, ionizing radiation, drain breakdown, gate oxide breakdown, hot carrier injection, subthreshold current, punch through, and statistical control of oxide thickness and channel doping. It appears that minimum geometries for high-volume ICs will saturate in the range of 0.3 to 0.5 microns.

1984:  
Transistor architecture limit  
300-500nm (laundry list of reasons...)



# How small is a 32nm memory cell?



*1983-84 limits on gate size are commensurate with the dimensions of 2008's entire 32nm SRAM cell!*



Blood cell: Elec. Mic. Fac. (NCI-Frederick) 2007

**Small enough that a 2008 32nm SRAM cell is dwarfed by a human red blood cell**



# How small is a 32nm memory cell?

1980 SRAM Cell:  $1700 \mu\text{m}^2$



10000X

32nm SRAM Cell:  $0.171 \mu\text{m}^2$



**Small enough that a 2008 32nm SRAM cell is dwarfed by a 1980 SRAM cell CONTACT**



M. Bohr, ISSCC, 2009

Kuhn - 2009 2<sup>nd</sup> International CMOS Variability Conference - London

# Atomic dimensions are now routine



# **Part I: Physical Variation Sources and Mitigation**

# Part I – Physical Variation Sources and Mitigation



Patterning



Polish



Strain

# How small is a 45nm transistor?



- 5.5X smaller than the 193nm light that prints it
- ~15X smaller than visible green light





# Optical Proximity Correction (OPC) As a Resolution Enhancement Technique



Contour prediction – no OPC



Contour prediction – with OPC



SEM Image – no OPC



SEM Image – with OPC



K. Wells-Kilpatrick: 2007

# 45nm: OPC as a Variation Management Technique



Top-down resist CD meets spec, but poor contrast leads to a poor resist profile, which is transferred to the metal pattern after etch and results in shorting marginality



**Computational lithography solution**



# MEEF

## Mask Error Enhancement Factor

- MEEF is a scaling factor that quantifies how strongly certain layout geometries respond to mask dimension tolerances.
- Any dimensional error in the mask is magnified on the wafer by the MEEF value.

$$\Delta W_{\text{wafer}} = \text{MEEF} * \Delta W_{\text{mask}}$$

- Depending on the value of the mask error and the lithography exposure/focus conditions the final printed pattern can be either larger or smaller.
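As a small worked example of the relation above, the sketch below applies $\Delta W_{\text{wafer}} = \text{MEEF} \times \Delta W_{\text{mask}}$ to the ±3.375 nm mask making error used in the contour example in this deck. The MEEF value of 2.0 and the function name are illustrative assumptions, not values from the talk, and the mask error is assumed to be already expressed at wafer scale.

```python
def wafer_cd_error(meef: float, mask_error_nm: float) -> float:
    """Return the wafer-level CD error for a given mask error.

    Implements delta_W_wafer = MEEF * delta_W_mask.
    The mask error is assumed to be expressed at wafer scale (nm).
    """
    return meef * mask_error_nm


# Hypothetical MEEF value, chosen only for illustration.
ASSUMED_MEEF = 2.0

for mask_error in (-3.375, 3.375):
    print(mask_error, "->", wafer_cd_error(ASSUMED_MEEF, mask_error))
```

The sign of the wafer error follows the sign of the mask error, which is why the printed feature can come out either larger or smaller than intended.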

# MEEF Impact on Ze Error

Ze error can be either positive or negative



Yellow: DCCD contour after OPC  
Green: with -3.375 nm mask making error  
Red: with 3.375 nm mask making error

# MEEF and Historical gate CD vs. pitch



Lithography generations shown in the plot:

- 248nm; OAI; model-based OPC
- 193nm; OAI; model-based OPC
- 193nm; APSM; model-based OPC
- 193nm; APSM; model-based OPC; double patterning
- 193nm; immersion; APSM; model-based OPC; double patterning; polarization

● Contacted gate pitch

C. Kenyon, TOK conf., Dec. 2008

Low MEEF requires targeting in the “flat” portion of CD vs. pitch  
Process innovations continue this trend in the 32nm node




# FLARE



- Flare is unwanted scattered light arriving at the wafer
- Flare is caused by interactions that force the light to travel in a "non-ray trace" direction.
- Flare is both a function of local environment around a feature (short range flare) and the total amount of energy going through the lens (long range flare).

# Impact of flare on gate CDs



- During 65nm process development, large CD deviations were observed for structures having identical pitch and reticle CD due to flare
- Gates only $500\,\mu\text{m}$ away from one another could be >5nm different in CD

C. Kenyon  
TOK conf.  
Dec. 2008



# Flare Variation Improvement with OPC

| Color Code | Chrome Fraction |
|------------|-----------------|
| 109:0      | 56.0 – 63.0     |
| 108:0      | 49.0 – 56.0     |
| 107:0      | 42.0 – 49.0     |
| 106:0      | 35.0 – 42.0     |
| 105:0      | 28.0 – 35.0     |
| 104:0      | 21.0 – 28.0     |
| 103:0      | 14.0 – 21.0     |
| 102:0      | 7.0 – 14.0      |
| 101:0      | 0.0 – 7.0       |



Development effort produced an algorithm capable of scanning designs and binning regions by local chrome fraction

Binning algorithm is combined with flare-calibrated OPC model
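The binning step can be illustrated with a minimal sketch. It is not the production algorithm described in the talk: the tiling scheme, the function names, and the handling of bin boundaries are assumptions. Only the nine 7%-wide bins and their color codes come from the table above.

```python
BIN_WIDTH_PCT = 7.0   # each bin in the table spans 7% chrome fraction
NUM_BINS = 9          # codes 101:0 through 109:0 cover 0% to 63%


def chrome_fraction_bin(chrome_pct: float) -> str:
    """Map a local chrome fraction (percent) to the table's color code.

    Boundary values are assigned to the upper bin (an assumption), except
    63.0, which stays in the last bin.
    """
    if not 0.0 <= chrome_pct <= BIN_WIDTH_PCT * NUM_BINS:
        raise ValueError("chrome fraction outside the tabulated 0-63% range")
    index = min(int(chrome_pct // BIN_WIDTH_PCT), NUM_BINS - 1)
    return f"{101 + index}:0"


def bin_regions(mask, tile):
    """Scan a 2D grid (1 = chrome, 0 = clear) in square tiles and bin each tile."""
    rows, cols = len(mask), len(mask[0])
    bins = {}
    for r0 in range(0, rows, tile):
        for c0 in range(0, cols, tile):
            cells = [mask[r][c]
                     for r in range(r0, min(r0 + tile, rows))
                     for c in range(c0, min(c0 + tile, cols))]
            pct = 100.0 * sum(cells) / len(cells)
            bins[(r0, c0)] = chrome_fraction_bin(pct)
    return bins


# Hypothetical 2x4 layout fragment scanned with 2x2 tiles.
example_mask = [[1, 0, 0, 0],
                [0, 1, 0, 0]]
print(bin_regions(example_mask, tile=2))
```

In a real flow each bin would presumably select a flare-calibrated OPC correction; that coupling is outside the scope of this sketch.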



C. Kenyon, TOK conf., Dec. 2008




# 45nm highlights role of lithography/etch in resolving LER/LWR



Original



Improvement A



Improvements B,C



Final after improvements A,B,C

# Technology Trend Systematic Gate CD Lithography Variation



Critical to management of variation is the ability to deliver a 0.7X gate CD variation improvement in each generation enabled by continuous process technology improvements

# Part I – Physical Variation Sources and Mitigation



Patterning



Polish



Strain

# CMP Integration at 45 nm – HiK Metal Gate



1. STI deposition and polish (**STI CMP**)
2. Wells and VT implants
3. ALD deposition of high-k gate dielectric
4. Polysilicon deposition and gate patterning
5. S/D extensions, spacer, Si recess and SiGe deposition
6. S/D formation, Ni silicidation, ILD0 deposition
7. Poly Opening Polish, Poly removal (**POP CMP**)
8. PMOS workfunction metal deposition
9. Metal gate patterning, NMOS WF metal deposition
10. Metal gate fill and polish, ESL deposition (**MGD CMP**)

**First Generation HiK – Replacement Metal Gate**  
**Three critical CMP operations in the FE**



# STI Pattern Density Variation Impact

High Pattern Density: Slower Polish Rate

Low Pattern Density: Faster Polish Rate

# STI Step Height Variation

High Pattern Density Area



Positive Step Height

Low Pattern Density Area



Zero Step Height

STI topography impacts transistor Le and Ze

# STI Step Height Impact on Gate CD



Gate CD is shorter at the diffusion boundary

# SRAM Density Scaling



- 90nm – TALL: $1.0 \mu\text{m}^2$
- 65nm – WIDE: $0.57 \mu\text{m}^2$
- 45nm – WIDE: $0.346 \mu\text{m}^2$
- 32nm – WIDE: $0.171 \mu\text{m}^2$
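The per-generation area scaling implied by these cell sizes can be checked with a few lines of arithmetic. The sketch below uses only the cell areas quoted in this deck (including the 1980 cell at $1700 \mu\text{m}^2$); the variable and function names are illustrative.

```python
# SRAM cell areas in square microns, as quoted on the slides.
CELL_AREAS_UM2 = [
    ("90nm", 1.0),
    ("65nm", 0.57),
    ("45nm", 0.346),
    ("32nm", 0.171),
]


def area_ratios(cells):
    """Return (node, area / previous-node area) for each successive node."""
    return [(node, area / prev_area)
            for (_, prev_area), (node, area) in zip(cells, cells[1:])]


for node, ratio in area_ratios(CELL_AREAS_UM2):
    print(f"{node}: {ratio:.2f}x the area of the previous node")

# Ratio between the 1980 cell and the 32nm cell (the "10000X" on the earlier slide).
print(f"1980 vs 32nm: {1700 / 0.171:.0f}x")
```

Each step lands in the neighborhood of the 0.5X area scaling usually expected per generation, with the 45nm to 32nm step closest to it.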

## 65nm to 32nm: Patterning and polish enhancements

- Improved CD uniformity across STI boundaries
- Square corners (eliminate “dogbone” and “icicle” corners)

# CMP Integration at 45 nm – HiK Metal Gate



1. STI deposition and polish (**STI CMP**)
2. Wells and VT implants
3. ALD deposition of high-k gate dielectric
4. Polysilicon deposition and gate patterning
5. S/D extensions, spacer, Si recess and SiGe deposition
6. S/D formation, Ni silicidation, ILD0 deposition
7. Poly Opening Polish, Poly removal (**POP CMP**)
8. PMOS workfunction metal deposition
9. Metal gate patterning, NMOS WF metal deposition
10. Metal gate fill and polish, ESL deposition (**MGD CMP**)



K.Mistry et al., IEDM (2007)  
C.Auth et al. VLSI Symp, (2008)  
J. Steigerwald, IEDM (2008)

**First Generation HiK – Replacement Metal Gate**  
**Three critical CMP operations in the FE**



# Variation Challenges of RMG CMP Steps

- Gate height control critical to reducing variation
- PMOS/NMOS differences complicate CMP



# Variation Challenges of RMG CMP Steps

## OVERPOLISH

Exposes raised S/D  
Rext/mobility impact



S/D region – attacked  
during poly etch

## UNDERPOLISH

Underetched contact  
Rext impact



NMOS S/D region contact  
S/D region – marginal contact

# Poly Opening Polish (POP) Thickness Control

POP: WID by process Rev.



Patterned Wafers: WIW Profiles



45nm: within-die (WID) and within-wafer (WIW) improvement  
High selectivity between films is required.  
A key aspect is control of the polish rate at the edge of the wafer.

# 45 nm: POP CMP Improvement Overscaling Topography Improvement



**Improvements in polish enabled dramatic improvements in topography variation**

# Part I – Physical Variation Sources and Mitigation



Patterning



Polish



Strain

# Strain: Importance in scaling



**Strain (first introduced at 90nm) is a critical ingredient in modern transistor scaling**

# Strain: Pitch dependence



**NMOS**  
Pitch degradation increases with film pinchoff, requires higher stress, thinner films



**PMOS**  
eSiGe S/D mobility strongly dependent on pitch

# NMOS strain: Scaling with pitch



Tensile trench contacts



Compressive gate stress

# PMOS strain: Scaling with pitch



# Random $V_T$ variability and strain

Weber et al.  
IEDM 2008  
pp. 245-248



Similar  $V_T$  matching with CESL while 35%  $I_{ON}$  enhancement is achieved

$650\mu\text{A}/\mu\text{m}$  –  $30\text{pA}/\mu\text{m}$  at  $V_{dd}=1\text{V}$  and  $L_g=25\text{nm}$

# **Part II: Measurements, results and interpretation**

# Systematic and Random

- **Statistician's viewpoint:**



- **Process engineer's viewpoint:**



- **Device engineer's viewpoint:**



# Measurement “food pyramid”

INCREASING DATA QUANTITY  
DECREASING ABILITY TO SEGMENT ORIGIN

- In-line or off-line physical measurements of test wafers (TEM, SIMS, Auger, etc.)
- Device parametric measurements on test material (Ion/Ioff, IG/VG etc.)
- In-line physical measurements of selected sites in product (CD, thickness, etc.)
- Device parametric measurements on product ( $I_{dsat}/I_{in}$ , VT)
- Device parametric measurements on simple circuits ( $f_{max}$ ,  $f_{min}$ , etc)
- Device sort on completed product ( $V_{ccmin}$  and performance)



# Measurement of Random and Systematic VT Variation at the Device Level



Traditional method:

1. Measure two identical adjacent devices and extract the difference  $\sigma(VT_A - VT_B)$
2. Measure the entire population of all devices and extract  $\sigma(VT_{pop})$

Random variation for a matched pair:

$$\text{Random}_{mp} = \text{StdDev}(VT_A - VT_B) = \sigma(\Delta VT)$$

Random variation for a single device:

$$\text{Random}_{one\text{-}device} = \frac{\text{StdDev}(VT_A - VT_B)}{\sqrt{2}} = \frac{\sigma(\Delta VT)}{\sqrt{2}}$$

Systematic variation for a single device:

$$\text{Systematic} = \sqrt{\sigma(VT_{pop})^2 - \left(\frac{\sigma(\Delta VT)}{\sqrt{2}}\right)^2}$$
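The traditional method above can be sketched in a few lines. This is a minimal illustration, not a measurement flow from the talk: the function name and the sample VT values are hypothetical, sample standard deviations are assumed, and the systematic term is clamped at zero in case noise makes the difference under the square root negative.

```python
import math
import statistics


def decompose_vt_variation(vt_a, vt_b):
    """Split VT variation into random and systematic parts.

    vt_a, vt_b: VT values (volts) of the two devices in each matched pair.
    """
    dvt = [a - b for a, b in zip(vt_a, vt_b)]
    random_pair = statistics.stdev(dvt)             # sigma(VT_A - VT_B)
    random_one_device = random_pair / math.sqrt(2)  # single-device random part
    sigma_pop = statistics.stdev(list(vt_a) + list(vt_b))
    systematic_sq = sigma_pop ** 2 - random_one_device ** 2
    systematic = math.sqrt(max(systematic_sq, 0.0))  # clamp: assumption
    return {
        "random_pair": random_pair,
        "random_one_device": random_one_device,
        "sigma_pop": sigma_pop,
        "systematic": systematic,
    }


# Hypothetical data: four matched pairs, VT in volts.
result = decompose_vt_variation([0.30, 0.32, 0.34, 0.36],
                                [0.31, 0.31, 0.35, 0.35])
for name, value in result.items():
    print(f"{name}: {value * 1000:.2f} mV")
```

The division by $\sqrt{2}$ reflects that the difference of two independent devices carries twice the variance of one device.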

# Pelgrom Plots: What is $A_{VT}$ anyway?

Two choices are widely used in the literature

Choice A (IEDM 2008: Weber): slope of $\sigma_{VT}$ vs $1/\sqrt{WL}$

Choice B (IEDM 2008: Arnaud): slope of $\sigma_{\Delta VT}$ vs $1/\sqrt{WL}$

The two choices differ by a factor of $\sqrt{2}$.

# What did Pelgrom say?

Pelgrom “Matching properties of MOS transistors”  
(IEEE Journal of Solid-State Circuits, Vol. 24, No. 5, Oct. 1989)

- Eq. 5 defines a generic $A_P$ for a parameter $\Delta P$, implying that $A_{VT}$ would be the parameter for $\Delta VT$:

$$\sigma^2(\Delta P) = \frac{A_P^2}{WL} + S_P^2 D_x^2.$$

- However, one page further in the paper, he explicitly defines $A_{VT}$ in terms of VT only, in equation 8:

$$\sigma^2(V_{T0}) = \frac{A_{VT0}^2}{WL} + S_{VT0}^2 D^2.$$

- So, which did he mean? Well, I asked him.

# What is A<sub>VT</sub> anyway?

Choice A (IEDM 2008: Weber): slope of $\sigma_{V_T}$ vs $1/\sqrt{LW}$

$$\sigma_{V_t} = \frac{\sigma_{\Delta V_t}}{\sqrt{2}} \quad \sigma_{V_t} = \frac{A_{V_t}}{\sqrt{WL}}$$

Choice B (IEDM 2008: Arnaud): slope of $\sigma_{\Delta V_T}$ vs $1/\sqrt{LW}$. This is A<sub>VT</sub>.

The two choices differ by a factor of $\sqrt{2}$.

# What is $A_{VT}$ anyway?

Choice A (IEDM 2008: Weber): slope of $\sigma_{VT}$ vs $1/\sqrt{LW}$. I will call this $C_{VT}$ (or $C_2$).

$$C_{VT} = A_{VT} / \sqrt{2}$$

Choice B (IEDM 2008: Arnaud): slope of $\sigma_{\Delta VT}$ vs $1/\sqrt{LW}$. This is $A_{VT}$.
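To make the two conventions concrete, the sketch below converts between $A_{VT}$ (pair-difference slope) and $C_{VT}$ (single-device slope) and evaluates both sigmas for one device size. The coefficient of 2.0 mV·µm and the device dimensions are hypothetical values chosen for illustration, and the function names are not from any library.

```python
import math


def c_vt_from_a_vt(a_vt: float) -> float:
    """C_VT = A_VT / sqrt(2): single-device coefficient from the pair coefficient."""
    return a_vt / math.sqrt(2)


def sigma_delta_vt(a_vt: float, w_um: float, l_um: float) -> float:
    """Sigma of the VT difference of a matched pair: A_VT / sqrt(W * L)."""
    return a_vt / math.sqrt(w_um * l_um)


def sigma_vt(a_vt: float, w_um: float, l_um: float) -> float:
    """Sigma of a single device's VT: C_VT / sqrt(W * L)."""
    return c_vt_from_a_vt(a_vt) / math.sqrt(w_um * l_um)


# Hypothetical coefficient (mV*um) and device size (um).
A_VT = 2.0
W_UM, L_UM = 0.10, 0.04

print(f"sigma(delta VT) = {sigma_delta_vt(A_VT, W_UM, L_UM):.1f} mV")
print(f"sigma(VT)       = {sigma_vt(A_VT, W_UM, L_UM):.1f} mV")
```

When comparing published matching coefficients, checking which of the two slopes a paper reports avoids a silent $\sqrt{2}$ discrepancy.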

# ***Additional propagation of confusion (By me, it turns out ...)***



**RDF is frequently described by (Stolk):**

K. Kuhn, IEDM 2007

$$\sigma V_{Tran} = \left( \frac{\sqrt[4]{4q^3 \epsilon_{si} \phi_B}}{2} \right) \cdot \frac{T_{ox}}{\epsilon_{ox}} \cdot \left( \frac{\sqrt[4]{N}}{\sqrt{L_{eff} \cdot Z_{eff}}} \right) = \frac{1}{\sqrt{2}} \left( \frac{c_2}{\sqrt{L_{eff} \cdot Z_{eff}}} \right) \quad (1)$$
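Equation (1) can be evaluated numerically as a sanity check. The sketch below is an assumption-laden illustration rather than a calibrated model: it uses SI units throughout, takes the standard relative permittivities of 11.7 for silicon and 3.9 for SiO2, and treats $\phi_B$ as a user-supplied input because the slide does not define it. All device values in the example are hypothetical.

```python
import math

Q = 1.602176634e-19      # elementary charge, C
EPS0 = 8.8541878128e-12  # vacuum permittivity, F/m
EPS_SI = 11.7 * EPS0     # silicon permittivity (standard value)
EPS_OX = 3.9 * EPS0      # SiO2 permittivity (standard value)


def sigma_vt_rdf(n_m3, t_ox_m, l_eff_m, z_eff_m, phi_b_v):
    """Evaluate equation (1): RDF-induced sigma VT in volts.

    n_m3: channel doping (m^-3); t_ox_m: oxide thickness (m);
    l_eff_m, z_eff_m: effective length and width (m); phi_b_v: phi_B (V).
    """
    prefactor = (4 * Q ** 3 * EPS_SI * phi_b_v) ** 0.25 / 2
    return (prefactor * (t_ox_m / EPS_OX) * n_m3 ** 0.25
            / math.sqrt(l_eff_m * z_eff_m))


def c2_from_sigma(sigma_v, l_eff_m, z_eff_m):
    """Invert the right-hand side of equation (1) to recover c_2 (V*m)."""
    return math.sqrt(2) * sigma_v * math.sqrt(l_eff_m * z_eff_m)


# Hypothetical device: N = 1e18 cm^-3, Tox = 1 nm, L = 35 nm, Z = 100 nm, phi_B = 0.95 V.
sigma = sigma_vt_rdf(1e24, 1e-9, 35e-9, 100e-9, 0.95)
print(f"sigma VT (RDF) = {sigma * 1000:.1f} mV")
print(f"c_2 = {c2_from_sigma(sigma, 35e-9, 100e-9):.3e} V*m")
```

The fourth-root dependence on doping and the inverse square-root dependence on gate area are the two scaling behaviors worth noting: quadrupling the area halves the result.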

# What is $B_{VT}$ then?



$B_{VT}$   
Slope of  $\sigma_{VT}$  vs  $\sqrt{T_{inv}(V_{TH}+0.1)/LW}$



Fig. 3: Takeuchi, IEDM 2007

# But what about simple circuits?



One powerful tool for assessment of variation is locating ring-oscillators (ROs) routinely in all product designs

# Random and Systematic Variation for Matched Ring Oscillators

Random:

- Calculate Delta:

$$\text{Delta} = \frac{\text{FreqA} - \text{FreqB}}{\text{FreqA} + \text{FreqB}} \times \frac{200}{\sqrt{2}}$$

- **Random Variation** (per data unit):

$$\text{Rand} = \text{StdDev}(\text{Delta})$$

Systematic:

- Total Sigma (per data unit):

$$\sigma = \text{StdDev}(\text{FreqA})$$

- Grand Mean:

$$\mu = \frac{\text{Mean}(\text{FreqA}) + \text{Mean}(\text{FreqB})}{2}$$

$$\text{Syst} = \sqrt{\left(\frac{\sigma}{\mu} \times 100\right)^2 - \text{Rand}^2}$$

per data unit

Total Variation (per data unit):

$$\text{Total} = \frac{\text{StdDev}(\text{FreqA})}{\text{Mean}(\text{FreqA})} \times 100$$
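The matched ring-oscillator formulas can be applied directly to a list of frequency pairs. The sketch below is a minimal illustration with hypothetical frequencies: the function name is an assumption, sample standard deviations are used, and the systematic term is clamped at zero in case the random part exceeds the total. All results are in percent.

```python
import math
import statistics


def ro_variation(freq_a, freq_b):
    """Return random, systematic and total variation (percent) for matched ROs."""
    # Percent difference relative to the pair mean, scaled to a single RO.
    delta = [(a - b) / (a + b) * 200 / math.sqrt(2)
             for a, b in zip(freq_a, freq_b)]
    rand = statistics.stdev(delta)

    sigma = statistics.stdev(freq_a)
    mu = (statistics.mean(freq_a) + statistics.mean(freq_b)) / 2
    syst_sq = (sigma / mu * 100) ** 2 - rand ** 2
    syst = math.sqrt(max(syst_sq, 0.0))  # clamp: assumption

    total = statistics.stdev(freq_a) / statistics.mean(freq_a) * 100
    return {"random": rand, "systematic": syst, "total": total}


# Hypothetical matched RO frequencies (arbitrary units) for one data unit.
print(ro_variation([100, 102, 98, 104], [101, 101, 99, 103]))
```

The factor 200 converts the ratio to a percentage of the pair mean, and the $\sqrt{2}$ converts pair mismatch to single-oscillator variation.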



# 45nm: Within Wafer Variation

RANDOM



SYSTEMATIC



For random variation: Uniform across wafer

For systematic variation: More variation at the wafer edge

Legend: 15cm, 14cm, 13cm, 12cm, 11cm, 10cm, 9cm, 8cm

# 45nm: Within Die (WID), Within Wafer (WIW) and Wafer to Wafer (WTW)



For random variation: Uniform with population choice

For systematic variation: Variation increases significantly going from within-die (WID) to within-wafer (WIW)

# 45nm Product wafer: Random variation



# Random and Systematic Variation Trends



**Systematic WIW variation is comparable from one generation to the next**

**Random WIW variation in 32nm is comparable to 45nm and significantly improved over 65nm and 90nm due to HiK-MG**

# What about more complex circuits? RSM Methodology for Variation Model Parameters

- Identify the set of input parameters in variation modeling files that can be allowed to vary
- Create DOE to vary all parameters within selected limits
- Create a series of variation modeling files, using the matrix of parameters from the DOE
- Simulate an appropriate set of circuits and devices to obtain responses to the set of variation modeling files
- Enter simulation results back into DOE to determine sensitivity to model parameters
- Optimize variation modeling file parameters to get best match to measured data
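The steps above can be sketched end to end on a toy problem. Everything in this example is a stand-in: the three parameter names, their levels, and `toy_simulator` (a made-up linear model replacing real circuit simulation) are hypothetical, and picking the best DOE point replaces the response-surface fit and optimization that an actual RSM flow would perform.

```python
import itertools


def full_factorial(levels):
    """Build a full-factorial DOE: one run per combination of parameter levels."""
    names = list(levels)
    combos = itertools.product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def toy_simulator(params):
    """Hypothetical stand-in for simulating circuits with one variation file."""
    return {
        "ro_random_pct": 20.0 * params["sigma_vt_random"] + 1.0 * params["sigma_le"],
        "ro_systematic_pct": 15.0 * params["sigma_vt_systematic"] + 4.0 * params["sigma_le"],
    }


def main_effect(runs, responses, name, response, levels):
    """Sensitivity: mean response at the highest level minus mean at the lowest."""
    lo, hi = min(levels[name]), max(levels[name])
    hi_vals = [r[response] for p, r in zip(runs, responses) if p[name] == hi]
    lo_vals = [r[response] for p, r in zip(runs, responses) if p[name] == lo]
    return sum(hi_vals) / len(hi_vals) - sum(lo_vals) / len(lo_vals)


def best_match(runs, responses, measured):
    """Pick the DOE run whose responses are closest to the measured data."""
    def squared_error(resp):
        return sum((resp[key] - measured[key]) ** 2 for key in measured)
    best = min(range(len(runs)), key=lambda i: squared_error(responses[i]))
    return runs[best]


LEVELS = {
    "sigma_vt_random": [0.02, 0.03, 0.04],
    "sigma_vt_systematic": [0.01, 0.02, 0.03],
    "sigma_le": [0.5, 1.0, 1.5],
}
runs = full_factorial(LEVELS)
responses = [toy_simulator(p) for p in runs]

# Pretend the "measured" data came from one particular parameter set.
measured = toy_simulator({"sigma_vt_random": 0.03,
                          "sigma_vt_systematic": 0.02,
                          "sigma_le": 1.0})
print(best_match(runs, responses, measured))
```

In this toy model the random RO response has zero main effect from the systematic VT parameter, which is the kind of insensitivity that determines which response is used to set which input.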



# Example Matrix of Inputs and Associated Responses



Not all responses are sensitive to all inputs—key is to determine which responses are appropriate for setting each input parameter

# 32nm SRAM Test Chip



**SRAM test chip with advanced test features (PBIST, eFUSE, ECC, etc.) to support development of 32nm high-volume manufacturing process**

# SRAM $V_{CCmin}$ – Silicon to Simulation



Wafer-level SRAM P/NMOS transistor systematic  $V_T$  variation

# 32nm Voltage-Frequency Shmoo

3.25Mb SRAM Macro



- 32nm SRAM operates over a broad range of supply voltages, enabling dynamic voltage scaling for low-power applications
- 32nm SRAM achieves operating frequency of 4GHz at 1.0V, 15% better than 45nm design



K. Zhang, ISSCC 2009

# **Part III: Next generation challenges**

# Lithography Pipeline



Extend 193nm Optical Lithography as far as possible  
Deploy EUV Lithography when available/affordable

# Extreme Ultraviolet Lithography



Cymer beta source



Intel EUV Mask



ASML ADT printed wafer



Philips beta source



Photoresist Development



Nikon EUV1 printed wafer

Continued progress towards EUV implementation



M. Bohr, ISSCC, 2009

# Non-EUV Lithography Beyond 32 nm

## Double Patterning

- Pitch doubling
- Improved 2-D features

Pitch Doubling



2-D Features



## Spacer Gate Patterning

- Pitch doubling
- Improved variation



M. Bohr, ISSCC, 2009

Bencher et al, Proc. of SPIE Vol. 6924 69244E-7

# Pitch doubling and gate CD control



Resist freeze

Double Pattern Transfer

Neither Resist Freeze nor Double Pattern Transfer achieves the full benefit of patterning at  $\frac{1}{2}$  pitch

Both techniques still require resolution of a very small space (MEEF, LWR etc.)

# Disadvantages of Double-patterning



**Misalignment between the 2 exposures is a crucial liability for this technique and can limit its usability**

**Transistor parameters can be affected by asymmetry between the source and drain regions**

# Pitch doubling and gate CD matching



**Pitch doubling eliminates the close correlation which currently exists between the CDs of adjacent gates**

**This has implications for memory cells and other circuits which depend upon this CD matching**

# Pitch doubling and gate CD matching



Single patterning: the distribution of CD mismatches between adjacent gates is a very small fraction of total gate CD variation

**Pitch doubling: the distribution of CD mismatches is GREATER than the total gate CD variation**
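One way to see why the mismatch can exceed the total variation is the standard result for the difference of two equally distributed, correlated quantities: $\sigma(\Delta CD) = \sigma_{CD}\sqrt{2(1-\rho)}$. The sketch below evaluates it for two cases. The correlation values are illustrative assumptions, not measurements from the talk: adjacent gates from one exposure are taken as highly correlated, and gates from two separate exposures as uncorrelated.

```python
import math


def mismatch_sigma(sigma_cd: float, rho: float) -> float:
    """Sigma of the CD difference between two gates with equal sigma and correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("correlation must lie in [-1, 1]")
    return sigma_cd * math.sqrt(2.0 * (1.0 - rho))


SIGMA_CD = 1.0  # total gate CD sigma, normalized

# Assumed correlations, for illustration only.
print("single patterning (rho = 0.98):", round(mismatch_sigma(SIGMA_CD, 0.98), 3))
print("pitch doubling    (rho = 0.00):", round(mismatch_sigma(SIGMA_CD, 0.0), 3))
```

With strong correlation the mismatch is a small fraction of the total sigma, while with no correlation it rises to $\sqrt{2}$ times the total, consistent with the two statements above.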

# Alternative: Spacer patterning



(1) Print and Resist Trim



**Spacer patterning retains correlation between doubled features**

Bencher et al, Patterning by CVD Spacer Self Alignment Double Patterning (SADP), Proc. of SPIE Vol. 6924 69244E-7

# Alternative: Spacer patterning



(1) Print and Resist Trim



(2) Etch Template



(3) Form Spacers



(4) Strip Template



(5) Transfer Etch



(6) STI Etch and ash



**Spacer inhomogeneities not transferred to patterned features**

Bencher et al, Patterning by CVD Spacer Self Alignment Double Patterning (SADP), Proc. of SPIE Vol. 6924 69244E-7

# Uniformity matters: Logic images vs. technology node



65nm node



45nm node



32nm node



# Layout Restrictions 65nm to 32nm

65 nm Layout Style



32 nm Layout Style



- Bi-directional features
- Varied gate dimensions
- Varied pitches

- Uni-directional features
- Uniform gate dimension
- Gridded layout

# Transistor Architecture Enhancements



Weber et al. IEDM 2008 pp. 245-248



Vellianitis et al. IEDM 2008 pp. 681-683

Fully depleted devices (such as UTB or FinFET) are examples of innovations which permit significant improvement in RDF due to the ability to maintain channel control at lower channel doping.

# $V_T$ matching performance



($\sigma_{V_t} = \sigma_{\Delta V_t} / \sqrt{2}$ to compare measurements on pairs and on arrays of transistors in the literature)

Fully depleted devices (such as UTB or FinFET) are examples of innovations which permit significant improvement in RDF due to the ability to maintain channel control at lower channel doping.

Weber et al.  
IEDM 2008  
pp. 245-248

# Closing Thoughts

# Random and Systematic Variation Trends



**Systematic WIW variation is comparable from one generation to the next**

**Random WIW variation in 32nm is comparable to 45nm and significantly improved over 65nm and 90nm due to HiK-MG**

# 45 nm: POP CMP Improvement Overscaling Topography Improvement



**Improvements in polish enabled dramatic improvements in topography variation**

# Technology Trend Systematic Gate CD Lithography Variation



Critical to management of variation is the ability to deliver a 0.7X gate CD variation improvement in each generation enabled by continuous process technology improvements

# SRAM Density Scaling



**Improved fidelity / uniformity on 32nm vs 90nm**

K. Zhang, ISSCC, 2009



# Q&A



Blood cell: Elec. Mic. Fac. (NCI-Frederick) 2007

For further information on Intel's silicon technology, please visit our Technology & Research page at [www.intel.com/technology](http://www.intel.com/technology)