

# Wafer Defect Classifier

Naga Chandrasekaran, Daniel Chow, Lea Cleary, Scott Gatzemeier, and Erik Zinn

# Chip manufacturing is **complex**

Many opportunities for a defect to occur



**> 1000**  
Processing steps to  
finish a chip

**> 10**  
Weeks of  
processing time






**> 40,000**  
wafers started every week

**100s**  
of pieces of equipment  
used in manufacturing



# Yield maximization is critical for **cost reduction**

Defects are the primary factor impacting the number of good working chips on a wafer



**↓ Defects → ↑ Yield → ↓ Cost (\$)**



Wafer defect analysis is mainly performed manually by expert engineers

## Challenges

- Time-consuming
- Varying expertise
- Strict IP policies

# Problem Statement

## Automated Wafer Defect Classifier

Use a neural network to **identify** wafers with single-defect patterns and **classify** them into defect groups with >90% accuracy



# Where does our data come from?

WM-811K is the largest open-source dataset from a real production process



# What does our data **look** like?

None vs Defect Distribution



Defect Distribution



**Doubly Imbalanced**  
*Undersampling, data augmentation*
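
The slide names undersampling and data augmentation but not the exact recipe, so the following is only a minimal sketch under stated assumptions: majority classes are capped at a fixed count per class, and augmentation uses 90-degree rotations and horizontal flips. The function names, the `cap` parameter, and the choice of transforms are illustrative, not taken from the project.

```python
import random


def undersample(samples, labels, cap, seed=0):
    """Keep at most `cap` randomly chosen samples per class (assumed strategy)."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    out_samples, out_labels = [], []
    for label, items in by_class.items():
        rng.shuffle(items)
        for item in items[:cap]:
            out_samples.append(item)
            out_labels.append(label)
    return out_samples, out_labels


def rotate90(wafer_map):
    """Rotate a 2-D wafer map (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*wafer_map[::-1])]


def flip_horizontal(wafer_map):
    """Mirror a 2-D wafer map left to right."""
    return [row[::-1] for row in wafer_map]


def augment(wafer_map):
    """Return the 8 rotation/flip variants of a wafer map, original included."""
    variants = []
    current = wafer_map
    for _ in range(4):
        variants.append(current)
        variants.append(flip_horizontal(current))
        current = rotate90(current)
    return variants
```

Undersampling would typically target the dominant "none" class, while augmentation would be applied to the rare defect classes to narrow the second imbalance.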



**Single-defect multi-class**  
*9 classes total, including none*



**Variable quality and resolution**  
*Resizing and pre-processing*
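
Because wafer maps arrive at different resolutions, they need a common input size before training. The sketch below assumes nearest-neighbor resampling, which keeps die values categorical instead of blending them; the interpolation method and the function name are assumptions, since the slide only says "resizing".

```python
def resize_nearest(wafer_map, out_h, out_w):
    """Resize a 2-D wafer map to out_h x out_w using nearest-neighbor sampling."""
    in_h, in_w = len(wafer_map), len(wafer_map[0])
    return [
        # Map each output cell back to the source cell it falls inside.
        [wafer_map[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]
```

For example, `resize_nearest(wafer_map, 64, 64)` would bring every map to a hypothetical 64 x 64 input grid regardless of its original shape.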

# Which pre-processing performs best?

Accuracy of the Tandem CNN model from the Yu et al. paper



| Pre-processing               | Accuracy (%) |
|------------------------------|--------------|
| No pre-processing            | 96.63        |
| Median Filter (7x7)          | 97.11        |
| Morphological Thinning (n=2) | 97.24        |
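
To make the median-filter option concrete, here is a dependency-free sketch of a 7x7 median filter over a wafer map. Edge handling (replicating border cells) is an assumption, and the function name is illustrative; in practice a library routine would likely be used instead.

```python
def median_filter(wafer_map, size=7):
    """Apply a size x size median filter (size must be odd) to a 2-D wafer map."""
    pad = size // 2
    h, w = len(wafer_map), len(wafer_map[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            # Collect the neighborhood, clamping indices so borders are replicated.
            window = sorted(
                wafer_map[min(max(rr, 0), h - 1)][min(max(cc, 0), w - 1)]
                for rr in range(r - pad, r + pad + 1)
                for cc in range(c - pad, c + pad + 1)
            )
            row.append(window[len(window) // 2])
        out.append(row)
    return out
```

The filter removes isolated failing dies (random noise) while leaving spatially coherent patterns largely intact, which is one plausible reason it helps the classifier.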

# Model framework



# Model **architecture** exploration



| Architecture       | Test Set Accuracy (%) | MixedWM38 Accuracy (%) |
|--------------------|-----------------------|------------------------|
| 13-Layer CNN       | **97.56**             | **76.02**              |
| Tandem CNN         | 97.24                 | 70.39                  |
| Deformable CNN     | 96.37                 | 57.40                  |
| GoogLeNet          | 97.32                 | 66.65                  |
| ResNet (transfer)  | 92.17                 | 52.24                  |
| ViT L32 (transfer) | 94.44                 | 49.48                  |
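
One way to read the table is by the drop from test-set accuracy to MixedWM38 accuracy. The short sketch below computes that drop from the figures in the table; the "gap" metric and the function name are illustrative and not part of the original analysis.

```python
# (test set accuracy %, MixedWM38 accuracy %) copied from the table above.
results = {
    "13-Layer CNN": (97.56, 76.02),
    "Tandem CNN": (97.24, 70.39),
    "Deformable CNN": (96.37, 57.40),
    "GoogLeNet": (97.32, 66.65),
    "ResNet (transfer)": (92.17, 52.24),
    "ViT L32 (transfer)": (94.44, 49.48),
}


def generalization_gaps(results):
    """Return (architecture, accuracy drop) pairs, smallest drop first."""
    gaps = [(name, round(test - mixed, 2)) for name, (test, mixed) in results.items()]
    return sorted(gaps, key=lambda pair: pair[1])


for name, gap in generalization_gaps(results):
    print(f"{name}: {gap:.2f} points")
```

By this measure the 13-Layer CNN loses the fewest points on MixedWM38 and the transfer-learned ViT loses the most.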

# What does our model **see**?

Grad-CAM visualizes the class activation maps for each defect class
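
For readers unfamiliar with Grad-CAM, the core computation can be sketched in a few lines: average each feature map's gradients to get a channel weight, take the weighted sum of the feature maps, and apply ReLU. The sketch assumes the activations and gradients of the last convolutional layer have already been extracted from the model as nested lists; the function name and the final normalization step are illustrative.

```python
def grad_cam(activations, gradients):
    """Compute a Grad-CAM heatmap.

    activations: K feature maps (each H x W) from the last conv layer.
    gradients:   d(class score) / d(activations), same shape.
    """
    h, w = len(activations[0]), len(activations[0][0])
    # Channel weights: global-average-pool the gradients of each feature map.
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    # Weighted sum of feature maps, followed by ReLU.
    cam = [
        [
            max(0.0, sum(wk * a[r][c] for wk, a in zip(weights, activations)))
            for c in range(w)
        ]
        for r in range(h)
    ]
    # Scale to [0, 1] for display (skipped if the map is all zeros).
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam
```

The resulting low-resolution map is usually upsampled to the wafer map size and overlaid on it to show which regions drove the predicted class.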



# How well does our model **generalize**?

Most misclassified (MixedWM38 inference)



Recalculated Confusion Matrix

Edge-Loc + Edge-Ring,  
Loc + Center,  
Random + Near-full



Recalculated Accuracy = 90.62%
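
The slide does not spell out how the accuracy was recalculated. One plausible reading, sketched here as an assumption, is that confusions within each of the three listed pairs are counted as correct by merging the pair into one group. The labels in the usage example are made up for illustration and are not project results.

```python
# Pairs listed on the slide; treating each pair as one group is an assumption.
MERGED_PAIRS = [("Edge-Loc", "Edge-Ring"), ("Loc", "Center"), ("Random", "Near-full")]


def group_of(label, pairs=MERGED_PAIRS):
    """Map a label to its merged group name, or return it unchanged."""
    for pair in pairs:
        if label in pair:
            return "+".join(pair)
    return label


def accuracy(y_true, y_pred, merge=False):
    """Fraction of matching labels, optionally after merging the paired classes."""
    if merge:
        y_true = [group_of(y) for y in y_true]
        y_pred = [group_of(y) for y in y_pred]
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels, for illustration only.
y_true = ["Edge-Loc", "Loc", "Random", "Scratch"]
y_pred = ["Edge-Ring", "Center", "Random", "Donut"]
print(accuracy(y_true, y_pred), accuracy(y_true, y_pred, merge=True))
```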

# How did we fare with manual labeling?

Overall inference accuracy on manually labeled data = 53.31%



# Production model **pipeline**

Customized data pipeline that meets business needs with real-time inference




# Wafer Defect Classifier

---

Learn how we can automate your defect classification


Consistent



Time Efficient



Customizable



Cost Effective



Wafer Map Classifier for Early and Fast **Defect Detection** and **Defect Source** Identification

The model can be tuned to custom **Defect Patterns** for fast defect sourcing

# Acknowledgements

---

- **Professors:** Fred Nugen and Alberto Todeschini
- **TAs:** Andy Reagan, Kevin Hartman, and Danie Theron
- **Peers:** W210 Spring 2022 Section 6, special thanks to Seyfullah Oguz



Thanks for Watching  
Any Questions?