

# Yield, Statistical Process Control, & Design of Experiment

ELEN 4944  
Prof. Ioannis (John) Kymissis  
Oliver Durnan

[johnkym@ee.columbia.edu](mailto:johnkym@ee.columbia.edu)  
[oad2114@columbia.edu](mailto:oad2114@columbia.edu)

# Statistical Process Control (SPC)

---

- This is a fancy name for... keeping track of things
- Monitor and control variation of individual fab steps
- Ensure processes stay within control limits → Ensure chips come out the way you want them to
- Tells you when something is wrong

# SPC Example

---



# Design of Experiment (DOE)

---

- This is a fancy name for... trying out new process parameters
- Seeks to identify cause → effect relationships
- Helps determine which parameters matter, what the “process window” is, and how the process can be optimized
- Helps you fix things when they go wrong

# DOE Example

---

| Sample | Process Temp (°C) | SiH4 Flow (sccm) | Leakage (A) |
|--------|-------------------|------------------|-------------|
| 1      | 200              | 20               |             |
| 2      | 250              | 20               |             |
| 3      | 300              | 20               |             |
| 4      | 350              | 20               |             |
| 5      | 200              | 40               |             |
| 6      | 250              | 40               |             |
| 7      | 300              | 40               |             |
| 8      | 350              | 40               |             |
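The run table above is a full factorial over two factors (four temperatures, two flows). A minimal sketch of generating such a run list follows; the function name and dictionary keys are illustrative choices, not taken from any DOE package.

```python
from itertools import product

# Factor levels copied from the run table above.
temps_c = [200, 250, 300, 350]   # process temperature (°C)
flows_sccm = [20, 40]            # SiH4 flow (sccm)

def full_factorial(temps, flows):
    """Return one run per (flow, temperature) combination, numbered from 1."""
    runs = []
    for sample, (flow, temp) in enumerate(product(flows, temps), start=1):
        runs.append({"sample": sample, "temp_c": temp, "flow_sccm": flow})
    return runs

runs = full_factorial(temps_c, flows_sccm)
for run in runs:
    print(run)
```

The leakage column is left out on purpose: it is the measured response, filled in only after the runs are processed.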

# Why?



Intel Careers

## LTD Advanced Device Development Engineer


### Qualifications:


You must possess the minimum qualifications below to be initially considered for this position. Preferred qualifications are in addition to the minimum requirements and are considered a plus factor in identifying top candidates. The experience listed below would be obtained through a combination of your schoolwork/classes/experimental research and/or relevant previous job and/or internship experiences.

### Minimum Qualifications

- Must possess Ph.D. degree in electrical engineering (EE), electrical and computer engineering (ECE), electrical engineering and computer science (EECS) directly related to Semiconductor field.
- 10+ years of experience in advanced node semiconductor industry in one or more of the following:
  - Semiconductor materials, fabrication, and device physics.
  - Electrical characterization of Semiconductor Devices (transistor, diode, etc.)

### Preferred Qualifications

- Advanced Transistor Device Structures and Device Physics.
- Process monitoring Test structures design and layout experience.
- Device and circuit simulation.
- Statistical Process Control (SPC) or Design of Experiments (DOE) principles and engineering analysis tools.
- Expertise in database structures, research methods, machine learning, analytics packages (i.e., JMP, MATLAB, Octave), scripting languages (i.e., Python, JSL, Perl, TCL), or programming languages (i.e., SQL, C/C++)



Careers

## Silicon Photonics Technology Development & Integration Engineer (2026 New College Graduate)



### Required Qualifications:

- Education – Graduating with Bachelors degree in Science, Math, Engineering, Semiconductor Manufacturing or related field from an accredited degree program.
- Must have at least an overall 3.0 GPA and proven good academic standing.
- Language Fluency - English (Written & Verbal)

### Preferred Qualifications:

- Prior related internship or co-op experience.
- Demonstrated prior leadership experience in the workplace, school projects, competitions, etc.
- Project management skills, i.e. the ability to innovate and execute solutions that matter; the ability to navigate ambiguity.
- Strong written and verbal communication skills
- Strong planning & organizational skills
- Excellent structured problem solving and knowledge of Lean Manufacturing principles
- Lab or pre-professional experience in semiconductor processing or in Silicon Photonics
- Understanding and knowledge of Statistical Process Control (SPC) and/or Design of Experiments (DOE)
- Ability to work effectively and efficiently with diverse teams, customers, as well as internal and external partners

#NCGProgramUS

### Expected Salary Range

\$54,200.00 - \$110,300.00



## CVD Process Engineer




### About Samsung Austin Semiconductor

Samsung is a world leader in advanced semiconductor technology, founded on the belief that the pursuit of excellence creates a better world. At SAS, we are Innovating Today to Power the Devices of Tomorrow.

### Come innovate with us!

### Position Summary

Samsung Austin Semiconductor is seeking a process engineer interested in working alongside a talented team of professionals with a key focus on establishing and maintaining world class process/equipment and implementing yield enhancement as a competitive advantage along with ability to identify complex problems and implementing solutions.

### Role and Responsibilities

Here's what you'll be responsible for:

- Maintaining a high standard for safety and quality through 5S, communication, process monitoring, and proactive and continuous improvements.
- Coordinating with equipment engineering and CVD operations to ensure cohesive plans are communicated and executed for non-standard work.
- Collaborating long-down and chronic issue work plans with equipment engineering and vendors.
- Using data analysis, benchmarking, DOE, and vendor guidance to develop process improvements for particle reduction, deposition stability, process efficiency, and cost reduction.
- Meeting project milestones by creating organized project plans, preparing structured, self-evident data packages, clearly and confidently delivering proposals, and pushing approvals through completion.
- Advising and assists other unit parts, process integration, and other support groups on complex problems.
- Coaching and training Jr. engineers.

### Skills and Qualifications

Here's what you'll need:

- BS/MS Engineering - Chem Engr, EE, Material Science, or Mechanical Engineering.
- 3+ years of CVD, fab engineering, or similar industry experience Skills/Abilities.
- Experience with DOE and project management - Process control (SPC, APC, FDC, etc.) and Risk Analysis/FMEA.

# Yield

---

- We'll look at this through the lens of "yield"
- What percentage of chips come out as designed
- Determining meaningful yield metrics can be complex
  - Overall vs. step-wise?
  - Over what area?
  - What are the tolerances?
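To make the "overall vs. step-wise" distinction concrete, here is a small sketch. It assumes that steps fail independently, so the overall yield is the product of the step yields; the step count and per-step yield are hypothetical numbers chosen for illustration.

```python
import math

def overall_yield(step_yields):
    """Overall yield as the product of independent step yields."""
    return math.prod(step_yields)

# Hypothetical line: 500 process steps, each with a 99.9 % step yield.
step_yields = [0.999] * 500
total = overall_yield(step_yields)
print(f"step yield: {step_yields[0]:.1%}, overall yield: {total:.1%}")
```

Even an excellent step-wise number compounds into a much lower overall number, which is one reason a quoted "yield" is hard to interpret without knowing which one is meant.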

# Yield Statistics

---

*“There are three kinds of lies: lies, damned lies, and statistics”*



郭明錤 (Ming-Chi Kuo), @mingchikuo

The first Panther Lake engineering samples, made with Intel/IFS's 18A, are currently being tested by major PC ODM/EMS makers. My early 2025 industry survey showed 18A yields below 20-30%, so there's still a lot of room to step up—which doesn't bode well for Intel's goal of …

10:59 AM · Feb 24, 2025

Do these numbers mean anything?

# Process Ramp Up and Yield

---

As a fab is built and the process is developed, yield “ramps” up



# Why Yield Matters

---

Getting yield over the edge is critical for financial viability

$$\text{Chip cost} = \frac{\text{cost per wafer}}{\text{yield} \times \text{number of dice per wafer}}.$$

Yield is a huge factor in **the bottom line**
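A quick numerical sketch of the chip cost formula above. The wafer cost and dice per wafer are hypothetical round numbers, picked only to show how strongly the cost per good chip depends on yield.

```python
def chip_cost(cost_per_wafer, yield_fraction, dice_per_wafer):
    """Cost per good chip = wafer cost / (yield * dice per wafer)."""
    if not 0 < yield_fraction <= 1:
        raise ValueError("yield_fraction must be in (0, 1]")
    return cost_per_wafer / (yield_fraction * dice_per_wafer)

WAFER_COST = 10_000.0   # hypothetical cost per wafer, in dollars
DICE_PER_WAFER = 500    # hypothetical number of dice per wafer

for y in (0.2, 0.5, 0.9):
    cost = chip_cost(WAFER_COST, y, DICE_PER_WAFER)
    print(f"yield {y:.0%}: ${cost:.2f} per good chip")
```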

# Yield Modeling (Poisson Process)

---

- This assumes a random scattering of the defects

$$Y = \int_0^{\infty} e^{-A_c D} \delta(D - D_0) dD = e^{-A_c D_0}.$$

- $Y \rightarrow$  Chip yield
- $A_c \rightarrow$  Chip area
- $D \rightarrow$  Defect density variable
- $D_0 \rightarrow$  Observed defect density

**Take home message:**  
Increasing chip area and defect density drive yield down!
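The take-home message can be checked numerically with the Poisson formula $Y = e^{-A_c D_0}$. The defect density and die areas below are hypothetical values, not data from any real process.

```python
import math

def poisson_yield(area_cm2, d0_per_cm2):
    """Poisson yield model: Y = exp(-A_c * D_0)."""
    return math.exp(-area_cm2 * d0_per_cm2)

D0 = 0.1  # hypothetical defect density, defects per cm^2

for area in (0.5, 1.0, 4.0, 8.0):  # hypothetical die areas in cm^2
    print(f"A_c = {area:4.1f} cm^2 -> Y = {poisson_yield(area, D0):.1%}")
```

At a fixed defect density, doubling the die area squares the yield fraction, which is why large dice are so much harder to yield than small ones.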

# Why Chip Size?

Consider Apple M5 CPU vs. NVIDIA RTX 5090 GPU



# The Yield Curve – Display Angle

---

- Low (zero?) tolerance for defects
- As the number of elements increases there is more opportunity for defects
- 4K displays have  $\approx$ 8 million pixels
- Where is that on this plot?
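As a rough feel for the numbers, the sketch below computes the chance of a panel with zero dead pixels, assuming every pixel fails independently with the same probability. The per-pixel defect probabilities are hypothetical, and 3840 × 2160 is assumed as the 4K format.

```python
PIXELS_4K = 3840 * 2160  # assumes the common UHD format, about 8.3 million pixels

def defect_free_probability(n_elements, p_defect):
    """Probability that none of n independent elements is defective."""
    return (1.0 - p_defect) ** n_elements

for p in (1e-8, 1e-7, 1e-6):  # hypothetical per-pixel defect probabilities
    y = defect_free_probability(PIXELS_4K, p)
    print(f"p = {p:.0e}: P(no dead pixels) = {y:.4f}")
```

Each pixel is usually built from several subpixels and transistors, so the true element count is higher still than the pixel count used here.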



# Dead Pixels

---



Almost statistically  
impossible to avoid this

# Other Yield Models

---

- Poisson assumes that defects are randomly distributed in space
- There are other yield models that assume defects are clustered
- Some also account for the distribution of defect sizes
- Murphy, Negative Binomial, etc., ...there are many
- These are a bit more realistic → Better fit to actual fab data
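For comparison, here is a sketch of the two named models next to Poisson, using their usual textbook forms: Murphy's model $Y = \left[(1 - e^{-A_c D_0})/(A_c D_0)\right]^2$ and the negative binomial model $Y = (1 + A_c D_0/\alpha)^{-\alpha}$. The clustering parameter $\alpha$ and the defect counts are assumed values for illustration.

```python
import math

def poisson_yield(ad):
    """Poisson model; ad is the mean defects per die, A_c * D_0."""
    return math.exp(-ad)

def murphy_yield(ad):
    """Murphy's model in its common triangular-distribution form."""
    if ad == 0:
        return 1.0
    return ((1.0 - math.exp(-ad)) / ad) ** 2

def negative_binomial_yield(ad, alpha):
    """Negative binomial model; smaller alpha means stronger clustering."""
    return (1.0 + ad / alpha) ** (-alpha)

ALPHA = 2.0  # hypothetical clustering parameter
for ad in (0.5, 1.0, 2.0):
    print(f"A*D0 = {ad}: Poisson {poisson_yield(ad):.3f}, "
          f"Murphy {murphy_yield(ad):.3f}, "
          f"NegBin {negative_binomial_yield(ad, ALPHA):.3f}")
```

For the same mean defect count, the clustered models predict higher yield than Poisson, because clustered defects pile up on fewer dice.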

# Aside: Binning

---

- Scenario: Fabless Design Company Inc.<sup>TM</sup> designs a series of CPUs
  - *Ultra tier* → 16 cores
  - *Plus tier* → 8 cores
  - *Value tier* → 4 cores
- Designing 3 separate chips costs a lot of money
- If yield isn't perfect, you don't get many Ultras
- Solution: “Bin” the chips after production based on number of defects



Good die → *Ultra*



Med die → *Plus*



Bad die → *Value*
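A toy model of this binning scheme is sketched below. It assumes one 16-core design, independent core failures with a hypothetical per-core yield, and tier thresholds of 16, 8, and 4 working cores; real binning rules are more involved.

```python
from math import comb

N_CORES = 16  # cores on the single physical design

def bin_probabilities(p_core_good):
    """Probability that a die lands in each tier, assuming independent core failures."""
    p_exact = [
        comb(N_CORES, k) * p_core_good**k * (1 - p_core_good) ** (N_CORES - k)
        for k in range(N_CORES + 1)
    ]
    return {
        "Ultra": p_exact[16],        # all 16 cores work
        "Plus": sum(p_exact[8:16]),  # 8 to 15 working cores
        "Value": sum(p_exact[4:8]),  # 4 to 7 working cores
        "Scrap": sum(p_exact[:4]),   # fewer than 4 working cores
    }

bins = bin_probabilities(p_core_good=0.95)  # 0.95 is a hypothetical per-core yield
for tier, prob in bins.items():
    print(f"{tier:>5}: {prob:.4f}")
```

Without binning, only the "Ultra" fraction would be sellable; with it, nearly every die ends up in some product tier.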

# Aside: Binning

---

- Binning doesn't only apply to defective cores
- Certain chips win (or lose) the “silicon lottery” and are extra fast (or slow) compared to others
- These also get binned based on performance
- Interesting implication: Chips of one class are not necessarily the exact same. Some iPhone 17s are faster than others. Just comes down to luck.

# Fatal vs Non-fatal Defects

---

- Depends on the feature
- Transistors are very small → Most front-end (FEOL) errors are fatal
- Back-end (BEOL) depends on the feature – there is a “critical area” which is different for each layer



# How Does Repair Help?

---

- Some processes allow for repair (especially displays)
- This changes the calculus a bit: instead of a single defect being fatal, it now takes a larger number of defects (or a different kind of defect) to kill the device
- Fabricate → Inspect → Repair
- The spec for yield on these devices can be complicated
  - “No more than X defects”
  - “No more than Y defects adjacent to each other”
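A sketch of how a "no more than X defects" spec changes yield, assuming the defect count per device is Poisson distributed and that any defect can be repaired as long as the total stays at or below the limit. The adjacency rule is ignored here, and the mean defect count is a hypothetical value.

```python
import math

def yield_with_repair(mean_defects, max_repairable):
    """P(defect count <= max_repairable) for a Poisson-distributed defect count."""
    return sum(
        math.exp(-mean_defects) * mean_defects**k / math.factorial(k)
        for k in range(max_repairable + 1)
    )

MEAN_DEFECTS = 2.0  # hypothetical average defects per panel

for limit in (0, 1, 3, 5):
    y = yield_with_repair(MEAN_DEFECTS, limit)
    print(f"up to {limit} repairable defects -> yield {y:.1%}")
```

With a limit of zero this reduces to the plain Poisson yield, so the gain from repair can be read off directly against that baseline.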

# Aside: Across Wafer Gradients

---

- Failure may be more subtle than short/open circuits
- Mismatch can be especially deleterious to analog chips
- Digital chips are more robust against this, but timing issues can still arise



# Aside: Across Wafer Gradients

---

- Variation is often caused by difference in layer thickness or difference in lithography exposure focus across the wafer
- These differences are usually assumed to be linear over a reasonably small area of the wafer...
- i.e. oxide thickness  $t_{ox}(x) \approx t_0 + m \cdot x$
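The linear model $t_{ox}(x) \approx t_0 + m \cdot x$ can be fitted to a handful of thickness measurements with ordinary least squares. The positions and thicknesses below are made-up example data, and the 0.1 mm device spacing is likewise an assumption.

```python
def fit_linear_gradient(xs, ts):
    """Least-squares fit of t(x) = t0 + m * x; returns (t0, m)."""
    n = len(xs)
    x_mean = sum(xs) / n
    t_mean = sum(ts) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxt = sum((x - x_mean) * (t - t_mean) for x, t in zip(xs, ts))
    m = sxt / sxx
    t0 = t_mean - m * x_mean
    return t0, m

# Hypothetical oxide thickness measurements across a wafer.
positions_mm = [-40.0, -20.0, 0.0, 20.0, 40.0]
thickness_nm = [98.0, 99.1, 100.0, 100.9, 102.0]

t0, m = fit_linear_gradient(positions_mm, thickness_nm)
print(f"t0 = {t0:.2f} nm, gradient m = {m:.4f} nm/mm")

# Thickness mismatch between two nominally identical devices 0.1 mm apart.
spacing_mm = 0.1
print(f"mismatch over {spacing_mm} mm: {m * spacing_mm:.4f} nm")
```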



# Aside: Across Wafer Gradients

Example: non-uniform oxidation



Credit: GanWafer.com via Pam-Xiamen

| Film Thickness (Å) | Color of Film (those shown are only indicative)                              |
|--------------------|------------------------------------------------------------------------------|
| 500                | tan                                                                          |
| 700                | brown                                                                        |
| 1000               | dark violet to red violet                                                    |
| 1200               | royal blue                                                                   |
| 1500               | light blue to metallic blue                                                  |
| 1700               | metallic to very light yellow-green                                          |
| 2000               | light gold or yellow - slightly metallic                                     |
| 2200               | gold with slight yellow-orange                                               |
| 2500               | orange to melon                                                              |
| 2700               | red-violet                                                                   |
| 3000               | blue to violet/blue                                                          |
| 3100               | blue                                                                         |
| 3200               | blue to blue-green                                                           |
| 3400               | light green                                                                  |
| 3500               | green to yellow-green                                                        |
| 3600               | yellow-green                                                                 |
| 3700               | green-yellow                                                                 |
| 3900               | yellow                                                                       |
| 4100               | light orange                                                                 |
| 4200               | carnation pink                                                               |
| 4400               | violet-red                                                                   |
| 4600               | red-violet                                                                   |
| 4700               | violet                                                                       |
| 4800               | blue-violet                                                                  |
| 4900               | blue                                                                         |
| 5000               | blue-green                                                                   |
| 5200               | green                                                                        |
| 5400               | yellow-green                                                                 |
| 5600               | green-yellow                                                                 |
| 5700               | yellow to "yellowish" (at times appears light gray or metallic)              |
| 5800               | light orange or yellow to pink                                               |
| 6000               | carnation pink                                                               |
| 6300               | violet red                                                                   |
| 6800               | "bluish" (appears between violet-red and blue-green - overall looks grayish) |
| 7200               | blue-green to green                                                          |
| 7700               | "yellowish"                                                                  |
| 8000               | orange                                                                       |
| 8200               | salmon                                                                       |
| 8500               | dull light red-violet                                                        |
| 8600               | violet                                                                       |
| 8700               | blue-violet                                                                  |
| 8900               | blue                                                                         |
| 9200               | blue-green                                                                   |
| 9500               | dull yellow-green                                                            |
| 9700               | yellow to "yellowish"                                                        |
| 9900               | orange                                                                       |

Ultimately, this leads to chips failing to meet quality standards → discarded or binned lower

# Aside: Across Wafer Gradients

---



Intel Pentium CPU - Credit: Wikipedia



ST TL072 Op-Amp - Credit: Zeptobars

# Statistical Process Control

---

- Again, this is a fancy name for keeping track of things
- Consists of measurements:
  - In-line → Measurements taken during fab process, integrated into production line
  - Off-line → Measurements taken after fab process, may be destructive
- Tabulate measurements over time to spot drift or confirm stability



# In-line vs. Off-line Measurements

- In-line:
  - Catch defects when they happen
  - Ability to pause further processing and prevent more out-of-spec wafers
  - Typically material measurements (e.g. layer thickness, etch rate)
- Off-line:
  - More complex analysis possible
  - Can be material or electrical measurements (e.g. layer thickness, etch rate, or IV curve)



Credit: SemiLab

# Test Structures

---



- Sometimes the device isn't convenient to measure
- It's typical to include additional "test structures" or "test element groups" (TEGs) on the chip
- Examples:
  - Via chains
  - Capacitor series for gate oxide
  - TLM series for contact resistance

# Wafer Acceptance Test (WAT)

- Final test done on wafer → Tells us which dies are worth packaging



Wafer probing - Credit: wevolver.com



KGD Map - Credit: ipTEST

# Control charts

---

- A graph that tracks parameters over time and specifies what is OK and what is not
- This also tells us the trends over time
- The trick is to pick which parameters to track → Can't track everything!
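A bare-bones version of the idea is sketched here, assuming the common convention of setting control limits at the baseline mean ± 3 standard deviations. The slide does not specify how the limits are set, and production SPC software typically applies additional run rules. All measurements are hypothetical.

```python
import statistics

def control_limits(baseline):
    """Return (LCL, center line, UCL) as mean -/+ 3 sample standard deviations."""
    center = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    return center - 3 * sigma, center, center + 3 * sigma

def out_of_control(points, lcl, ucl):
    """Indices of points that fall outside the control limits."""
    return [i for i, x in enumerate(points) if x < lcl or x > ucl]

# Hypothetical film thickness readings (nm) from a period known to be stable.
baseline = [100.2, 99.8, 100.1, 99.9, 100.0, 100.3, 99.7, 100.0]
lcl, center, ucl = control_limits(baseline)

# Hypothetical new readings to be checked against those limits.
new_points = [100.1, 99.9, 100.4, 100.9, 100.2]
flagged = out_of_control(new_points, lcl, ucl)
print(f"LCL = {lcl:.2f}, center = {center:.2f}, UCL = {ucl:.2f}, flagged: {flagged}")
```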



# Process Capability

- These metrics describe how well a process meets a specification
  - Cp: “Potential capability” → How wide is process spread compared to tolerance?
  - Cpk: “Actual capability” → Is the spread acceptable given the mean?
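- $C_p = \frac{USL - LSL}{6\sigma}$ and $C_{pk} = \min\left[\frac{\mu - LSL}{3\sigma}, \frac{USL - \mu}{3\sigma}\right]$, where $USL$ and $LSL$ are the upper and lower specification limits, $\mu$ is the process mean, and $\sigma$ is its standard deviation → Typically require both $> 1.33$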
- Customer requires oxide with  $75 \text{ nm} < t < 125 \text{ nm}$ . Is our process capable?
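The capability question can be answered by plugging numbers into the two formulas. The specification limits are the 75 nm and 125 nm from the slide; the process mean and standard deviation are hypothetical, since the measured distribution is not given in the text.

```python
def cp(usl, lsl, sigma):
    """Potential capability: tolerance width over the 6-sigma process spread."""
    return (usl - lsl) / (6 * sigma)

def cpk(usl, lsl, mu, sigma):
    """Actual capability: distance from the mean to the nearer limit, in 3-sigma units."""
    return min((mu - lsl) / (3 * sigma), (usl - mu) / (3 * sigma))

LSL_NM, USL_NM = 75.0, 125.0   # customer specification from the slide
MU_NM, SIGMA_NM = 110.0, 5.0   # hypothetical process mean and standard deviation

cp_value = cp(USL_NM, LSL_NM, SIGMA_NM)
cpk_value = cpk(USL_NM, LSL_NM, MU_NM, SIGMA_NM)
capable = cp_value > 1.33 and cpk_value > 1.33
print(f"Cp = {cp_value:.2f}, Cpk = {cpk_value:.2f}, capable: {capable}")
```

With these assumed numbers the spread is narrow enough (good Cp), but the mean sits too close to the upper limit (poor Cpk), so the process would not pass.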



# Analysis of Variance

- Another piece of jargon: “ANalysis Of VAriance” (ANOVA)
- Basically, answers the question: “Are the differences between these groups significant, or just random?”
- Compares within group variance to between group variance using a statistical test called an F-test



- K: number of groups
- n: observations per group (let's say 5)
- N: total observations
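- $\mu_i$, $\sigma_i^2$: mean and variance of group $i$
- $\mu_t$: grand mean of all observations (102 in this example)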
- $F_{\text{crit}} = 5.32$  (This just comes from a table... depends on N, K, and significance level)

$$F = \frac{n \cdot \sum_i (\mu_i - \mu_t)^2}{(n - 1) \cdot \sum_i \sigma_i^2} \cdot \frac{N - K}{K - 1}$$
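The $F$ formula above can be evaluated directly. The sketch below assumes equal group sizes, as the formula does, and uses two hypothetical groups of five measurements chosen so that the grand mean is 102; they are not the data from the slide's figure.

```python
import statistics

def anova_f(groups):
    """One-way ANOVA F statistic for equal-sized groups, following the formula above."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal group sizes")
    total_n = k * n
    means = [statistics.mean(g) for g in groups]
    variances = [statistics.variance(g) for g in groups]  # sample variances
    grand_mean = statistics.mean(means)  # valid because group sizes are equal
    between = n * sum((mu_i - grand_mean) ** 2 for mu_i in means)
    within = (n - 1) * sum(variances)
    return (between / within) * (total_n - k) / (k - 1)

# Hypothetical measurements for two process conditions (K = 2, n = 5, N = 10).
group_a = [99.0, 100.0, 101.0, 100.0, 100.0]
group_b = [103.0, 104.0, 105.0, 104.0, 104.0]

F_CRIT = 5.32  # critical value quoted on the slide
f_value = anova_f([group_a, group_b])
print(f"F = {f_value:.1f}, significant: {f_value > F_CRIT}")
```

An $F$ value well above the critical value says the difference between group means is large compared with the scatter inside each group.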

# BIR – Build In Reliability

---

- Most semiconductor foundries follow a philosophy of “BIR”
- This is a statement that yield cannot be measured by the final product
- Basically, highlights the importance of measuring reliability in-line alongside other SPC metrics

# Design of Experiment (DOE)

---

- Again, a fancy way of following the scientific method
- Change parameters (try to do as few as possible at a time...)
- Find a way to depict the results across the two directions
- Infer conclusions about the sensitivity of the process to certain parameters



# Example DOE Process



- Example deposition + etch sequence
- There is an optimum thickness; finding it requires some work



# Sensitivity Is Important

---

- For any control parameter, you only know you have hit the optimum if your sweep passes over it and the response turns back
- The slope in the parameter chart is the ‘sensitivity’

**Figure 1:**  
**John's Experiment (Time = 170 Minutes)**



**Figure 2:**  
**Mary's Experiment (Temp = 229°C)**



# Seek Opportunities To Settle Two Dimensions

- Multifactorial → Basically just doing a 2D “grid” of experiments
- Multifactorial DOE can reduce the amount of time searching

-BUT-

- You have to understand the sensitivity and cross-sensitivity
- This can be an iterative process (2 is common, more are possible)

**Figure 4:**  
**Yield Contours (God's Equation)**



# Experimental design

---

What do you want to know?

| Number of Factors | Comparative Objective                 | Screening Objective                     | Response Surface Objective               |
|-------------------|---------------------------------------|-----------------------------------------|------------------------------------------|
| 1                 | 1-factor completely randomized design | -                                       | -                                        |
| 2 - 4             | Randomized block design               | Full or fractional factorial            | Central composite or Box-Behnken         |
| 5 or more         | Randomized block design               | Fractional factorial or Plackett-Burman | Screen first to reduce number of factors |

**Takeaway:** if you have more factors, you need to be more strategic

# What Is a Block?

---

- Ideally, you would be able to control everything that you don't care about
- Real life is not always that simple (limited lot size, etc.)
- Best – run all at once
- Next best – make “blocks” and run them randomly in groups
- Worst – run them all randomly
- *“Block what you can, randomize what you cannot.”*

# Block Example

---

## Example:

- You want to run an experiment on your oxide growth conditions and select 4 treatments of interest: A, B, C, and D
- You also know that the position in the furnace and the lot the wafer comes from will affect your results

|             |    | Furnace Position P |    |    |    |
|-------------|----|--------------------|----|----|----|
|             |    | P1                 | P2 | P3 | P4 |
| Wafer lot L | L1 | A                  | B  | C  | D  |
|             | L2 | D                  | A  | B  | C  |
|             | L3 | C                  | D  | A  | B  |
|             | L4 | B                  | C  | D  | A  |

“Good” blocking

|             |    | Furnace Position P |    |    |    |
|-------------|----|--------------------|----|----|----|
|             |    | P1                 | P2 | P3 | P4 |
| Wafer lot L | L1 | A                  | A  | A  | A  |
|             | L2 | B                  | B  | B  | B  |
|             | L3 | C                  | C  | C  | C  |
|             | L4 | D                  | D  | D  | D  |

“Bad” blocking

# Schemes for Block Design

---

- There are many schemes for block design
  - Latin Square
  - Greco-Latin Square (also just called “Greco”)
  - Many others...

|    | P1 | P2 | P3 | P4 |
|----|----|----|----|----|
| L1 | A  | B  | C  | D  |
| L2 | D  | A  | B  | C  |
| L3 | C  | D  | A  | B  |
| L4 | B  | C  | D  | A  |

Latin

|    | P1        | P2        | P3        | P4        |
|----|-----------|-----------|-----------|-----------|
| L1 | $A\alpha$ | $B\gamma$ | $C\delta$ | $D\beta$  |
| L2 | $B\beta$  | $A\delta$ | $D\gamma$ | $C\alpha$ |
| L3 | $C\gamma$ | $D\alpha$ | $A\beta$  | $B\delta$ |
| L4 | $D\delta$ | $C\beta$  | $B\alpha$ | $A\gamma$ |

Greco

- The big ideas here are: be efficient, balance the blocking factors, and make the signal from the experiment easier to detect
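The defining properties of these squares are easy to check in code. The sketch below builds the cyclic Latin square shown above and verifies that the Greco-Latin square pairs every Latin letter with every Greek letter exactly once. The Greek letters are written as `a`, `b`, `g`, `d` for convenience, and the helper names are illustrative.

```python
def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one step per row."""
    n = len(treatments)
    return [[treatments[(col - row) % n] for col in range(n)] for row in range(n)]

def is_latin_square(square):
    """True if every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

def is_orthogonal(first, second):
    """True if superimposing the two squares gives every symbol pair exactly once."""
    n = len(first)
    pairs = {(first[r][c], second[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n

latin = cyclic_latin_square(list("ABCD"))

# The Greco-Latin square from the table above, split into its two layers.
greco_latin_part = [list("ABCD"), list("BADC"), list("CDAB"), list("DCBA")]
greco_greek_part = [list("agdb"), list("bdga"), list("gabd"), list("dbag")]

print(is_latin_square(latin))
print(is_latin_square(greco_latin_part) and is_latin_square(greco_greek_part))
print(is_orthogonal(greco_latin_part, greco_greek_part))
```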

# Semiconductor Reliability

---

- Failures may not necessarily be immediate
- Infamous “bathtub curve” for reliability (REL)



# Generalized Reliability

Death rate (log scale)



Mortality by age in Sweden - Credit: ourworldindata.org

- The same is true for people...
- And basically any other complex system

# Burn-in and Testing

- Accelerated aging is done to predict REL: typically increased voltage and temperature
- Many standards exist for these procedures



| Qualification Test | JEDEC Reference | Applied Stress/Accelerant           |
|--------------------|-----------------|-------------------------------------|
| HTOL               | JESD22-A108     | Temperature and voltage             |
| Temperature cycle  | JESD22-A104     | Temperature and rate of temp change |
| Temp humidity bias | JESD22-A110     | Temperature, voltage, and moisture  |
| uHAST              | JESD22-A118     | Temperature and moisture            |
| Storage bake       | JESD22-A103     | Temperature                         |

# Accelerated Aging

---

- Devices wear out over time → Some stresses increase the rate of wear
- Idea: test device under elevated stress to predict how it would decay in the far future
- Power law is common:  $y(t) = A \cdot t^b$
- Concept of “takt time” i.e. how long it takes to process a chip → shorter is better
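One way to use the power law $y(t) = A \cdot t^b$ is to fit it to short stress-test data on a log-log scale and extrapolate to a failure threshold. The sketch below does this with ordinary least squares; the drift data and the 5 % failure criterion are hypothetical.

```python
import math

def fit_power_law(times, values):
    """Least-squares fit of log(y) = log(A) + b * log(t); returns (A, b)."""
    log_t = [math.log(t) for t in times]
    log_y = [math.log(y) for y in values]
    n = len(log_t)
    mean_t = sum(log_t) / n
    mean_y = sum(log_y) / n
    b = (sum((x - mean_t) * (y - mean_y) for x, y in zip(log_t, log_y))
         / sum((x - mean_t) ** 2 for x in log_t))
    a = math.exp(mean_y - b * mean_t)
    return a, b

def time_to_reach(threshold, a, b):
    """Invert y = A * t**b to find when the degradation reaches the threshold."""
    return (threshold / a) ** (1.0 / b)

# Hypothetical parameter drift (%) measured during a stress test.
stress_hours = [1.0, 16.0, 81.0, 256.0]
drift_percent = [0.5, 1.0, 1.5, 2.0]

a, b = fit_power_law(stress_hours, drift_percent)
t_fail = time_to_reach(5.0, a, b)  # 5 % drift is an assumed failure criterion
print(f"A = {a:.3f}, b = {b:.3f}, predicted time to 5 % drift: {t_fail:.0f} h")
```

The extrapolation only describes life under the stress condition; mapping it back to normal operating conditions needs a separate acceleration model.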

Example: stress at elevated temperature to determine device lifetime



# High Temperature Operating Life

---

- The so-called “85/85” test (85 °C, 85% relative humidity) is common → Catches failures that would take years to occur under normal conditions
- Low temperature tests are also sometimes useful
  - Silicon chips perform better at low temperature, but lithium batteries don’t
  - Thermal cycling may damage components
  - Some failure mechanisms only occur at low T
- Can be performed biased or unbiased
  - Simplicity vs. realism tradeoff

# Summary

---

- SPC tells you when something is wrong, DOE helps you fix it
- Yielding in a large and complicated process is ... well, complicated → But also important!
- When doing experiments, you typically want to minimize number of variables changing at one time → But often you have to change many
- There is a framework to detect the most sensitive parameters and to optimize each design
- A lot of jargon in this area, but mostly boils down to just common sense

---

Questions?