

# **Reliability study of Nanoscale Transistors**

A Seminar Report (Course Code: EE539) submitted in partial fulfillment of  
the requirements for the degree of  
  
Master of Technology

by

**Shamini P R**  
**(Entry No. 2023EEM1029)**

Under the guidance of  
**Dr. Pradeep Duhan**



DEPARTMENT OF ELECTRICAL ENGINEERING  
INDIAN INSTITUTE OF TECHNOLOGY ROPAR

November 10, 2023

# **Declaration**

I declare that this written submission represents my ideas in my own words and where others' ideas or words have been included, I have adequately cited and referenced the original sources. I also declare that I have adhered to all principles of academic honesty and integrity and have not misrepresented or fabricated or falsified any idea/data/fact/source in my submission. I understand that any violation of the above will be cause for disciplinary action by the Institute and can also evoke penal action from the sources which have thus not been properly cited or from whom proper permission has not been taken when needed.

Shamini P R

Entry No. 2023EEM1029

Date: November 10, 2023

# **Abstract**

Since the early days of electro-mechanical switches, designing reliable circuits from unreliable components has been a challenge, traditionally addressed through coding and redundancy. In modern CMOS technology, device geometries have been scaled down more aggressively than supply voltages, which has increased local electrical stress. This stress gives rise to time-dependent degradation mechanisms that degrade performance and significantly shorten the lifespan of integrated circuits. Combined with process variations, these aging phenomena have made reliability a major design objective. Various design strategies have been proposed for analog circuits to address the performance challenges associated with device reliability, and robust design approaches demand a sound understanding of both variability and reliability. This report discusses the mechanisms behind device aging and gives an overview of reliability-aware design approaches along with their advantages and drawbacks.

# Contents

- **Abstract**
- **List of Figures**
- **1 Introduction**
- **2 Literature survey**
  - 2.1 Parametric variation
  - 2.2 Parametric degradation
    - 2.2.1 Hot Carrier Injection
    - 2.2.2 Bias Temperature Instability (BTI)
    - 2.2.3 Time Dependent Dielectric Breakdown (TDDB)
    - 2.2.4 Electromigration
  - 2.3 Transient Faults and Soft errors
  - 2.4 Reliability-aware analog circuit design approaches
    - 2.4.1 Precautionary Approaches
    - 2.4.2 Sense and React Approaches
  - 2.5 Monitoring Techniques of Reliability Degradations
    - 2.5.1 Direct Sensing
    - 2.5.2 Indirect Sensing
- **3 Conclusion**

# List of Figures

| Figure | Caption |
|--------|---------|
| 2.1 | Faults present in scaled device [1] |
| 2.2 | Illustration of the differences between a nominal design and over-design (P1 and P2 represent performance metrics.) [1] |
| 2.3 | Illustration of the differences between a nominal design and robust-design [1] |
| 2.4 | Illustration of self-healing concept [2] |
| 2.5 | Circuit implementation of self-biasing structure [2] |

# Chapter 1

## Introduction

Designing with worst-case guard bands is common practice in VLSI design, but it comes at a cost. Process variations, time-dependent degradation, and radiation-induced soft and hard errors all force a trade-off between design performance and reliability, which complicates the task of delivering products that meet performance, reliability, and variability criteria. Modern technology requires transistors to operate under high electric fields and increased power density. Higher electric fields accelerate device degradation mechanisms such as trap generation and lattice interaction. Consequently, transistor reliability diminishes significantly, shortening the lifespan of high-performance circuits. Accurate models and effective design methodologies are therefore needed to describe how device variability and reliability affect electrical behavior. To address these challenges, design techniques have been introduced in both the analog and digital circuit domains. These methods aim to balance design reliability against performance while keeping the overall design cost below a specified limit. The following sections briefly summarize transistor reliability physics and discuss different reliability-aware design approaches along with their advantages and drawbacks.

# **Chapter 2**

## **Literature survey**

Process and reliability considerations in the electronics industry are complex and extensive. Although they span a wide range of factors, they can be grouped into four categories:

- Parametric Variation.
- Parametric Degradation.
- Transient Faults.
- Permanent Faults.

## **2.1 Parametric variation**

Process-induced variation introduces parametric variation among transistors. Even nominally identical transistors differ, because variations introduced early in the manufacturing process at the submicron scale lead to differences in threshold voltage, gate leakage, series resistance, and other parameters. This is also known as 'time-zero' variation.

Process-related parametric variation normally arises from:

- Random dopant fluctuations.
- Fluctuation of oxide thickness.
- Statistically distributed channel length due to line edge roughness.

Irrespective of the transistor type, continued technology scaling ensures that transistors remain subject to variations in processing conditions. In brief, all domains of modern electronics, including microelectronics (such as logic and memory components), macroelectronics, and bioelectronics, are affected by the random behavior of design parameters. To design an integrated circuit from transistors with random parameters, the first step is to quantify this randomness at the lowest level of abstraction, namely the process level. Its impact on quantities such as threshold voltage and leakage current can then be calculated through numerical methods. Finally, these effective electrical parameters are propagated to higher levels of abstraction, such as circuits and systems, using numerical or analytical techniques.
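The bottom-up propagation described above can be illustrated with a small Monte Carlo sketch. It assumes, purely for illustration, that threshold-voltage mismatch is Gaussian with a standard deviation that scales as $A_{VT}/\sqrt{WL}$ (Pelgrom's area-scaling rule) and that subthreshold leakage depends exponentially on $V_{th}$. All numerical values (`a_vt_mv_um`, `vth_nom`, `n_slope`) are hypothetical and are not taken from the cited works.

```python
import math
import random
import statistics

def sample_vth(n, w_um, l_um, vth_nom=0.35, a_vt_mv_um=2.0, seed=0):
    """Draw n threshold voltages (V) for nominally identical transistors.

    Assumes Gaussian mismatch with sigma = A_VT / sqrt(W * L).
    vth_nom and a_vt_mv_um are illustrative values, not measured data.
    """
    rng = random.Random(seed)
    sigma_v = a_vt_mv_um * 1e-3 / math.sqrt(w_um * l_um)
    return [rng.gauss(vth_nom, sigma_v) for _ in range(n)]

def relative_leakage(vth, vth_nom=0.35, n_slope=1.3, v_thermal=0.0259):
    """Subthreshold leakage relative to the nominal device (exponential in V_th)."""
    return math.exp(-(vth - vth_nom) / (n_slope * v_thermal))

# Process level -> device level: V_th spread for a small and a large device.
small = sample_vth(5000, w_um=0.1, l_um=0.03)
large = sample_vth(5000, w_um=1.0, l_um=0.3)
sigma_small = statistics.stdev(small)
sigma_large = statistics.stdev(large)

# Device level -> electrical quantity: leakage spread of the small devices.
leak_spread = [relative_leakage(v) for v in small]

print(f"sigma(Vth): small = {sigma_small * 1e3:.1f} mV, large = {sigma_large * 1e3:.1f} mV")
print(f"leakage max/min for small devices: {max(leak_spread) / min(leak_spread):.1f}x")
```

Under these assumptions the smaller device shows a much wider $V_{th}$ spread, and the exponential dependence turns that spread into a far larger leakage spread, which is why the variation has to be carried up to circuit-level quantities rather than judged at the device level alone.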

## 2.2 Parametric degradation

As transistors age, the performance of the analog circuits that employ them degrades over time, eventually leading to circuit malfunction.

In contrast to the "time-zero" variation mentioned earlier, this type of variation can occur in two transistors that have undergone identical processing and possess the same initial characteristics. These changes in parametric values are typically permanent and cannot be reversed by simply turning off the integrated circuit (IC).

There are four major underlying time-based degradation mechanisms in CMOS circuits.

- Hot Carrier Injection (HCI)
- Bias Temperature Instability (BTI)
- Time Dependent Dielectric Breakdown (TDDB)
- Electromigration (EM)

### 2.2.1 Hot Carrier Injection

HCI typically occurs in n-channel devices when carriers, accelerated by the lateral electric field along the channel, acquire enough kinetic energy to become hot carriers. These hot carriers undergo impact ionization near the drain, generating electron-hole pairs. Electrons may then attempt to tunnel through the oxide layer, while holes are drawn toward the bulk of the device. Electrons that lack the energy required for tunneling become trapped in the oxide and form interface states. The accumulation of trapped electrons raises the transistor's threshold voltage $V_{th}$. In addition, interface traps increase scattering, which reduces carrier mobility. In p-channel devices HCI is less likely because holes have lower mobility, which prevents them from attaining the minimum energy required for impact ionization.

### **2.2.2 Bias Temperature Instability (BTI)**

The primary factor contributing to BTI is charge trapping. Defects generated during and after gate oxide formation act as interface states, which can capture carriers and hold them for some time before releasing them back into the channel. Filled interface states increase the threshold voltage ($V_{th}$) and decrease carrier mobility. When the electrical stress is removed, some of the interface traps recover, leading to partial relaxation.
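The threshold-voltage shift under BTI stress is often approximated by an empirical power law in stress time. The following is a minimal sketch of that approximation, assuming $\Delta V_{th} = A\,t^{n}$ under continuous stress and ignoring the partial recovery described above. The prefactor `a` and exponent `n` are illustrative assumptions, not values from the cited works.

```python
def bti_delta_vth(t_seconds, a=3.0e-3, n=0.17):
    """Threshold-voltage shift (V) after t_seconds of continuous stress.

    Empirical power law dVth = a * t**n; a and n are illustrative assumptions.
    Recovery after stress removal is not modeled.
    """
    return a * t_seconds ** n

def time_to_limit(dvth_limit, a=3.0e-3, n=0.17):
    """Stress time (s) at which the shift reaches dvth_limit (inverse of the power law)."""
    return (dvth_limit / a) ** (1.0 / n)

TEN_YEARS_S = 10 * 365 * 24 * 3600
shift_10y = bti_delta_vth(TEN_YEARS_S)
t_50mv = time_to_limit(0.050)

print(f"Shift after 10 years of stress: {shift_10y * 1e3:.1f} mV")
print(f"Time to reach a 50 mV shift: {t_50mv / 3600:.0f} hours")
```

Because the exponent is well below one, most of the shift accumulates early in life, which is one reason short accelerated stress tests are extrapolated to lifetime conditions.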

### **2.2.3 Time Dependent Dielectric Breakdown (TDDB)**

Time-dependent dielectric breakdown occurs when the gate oxide breaks down over time due to prolonged exposure to high vertical electric fields. It proceeds in two stages: soft breakdown and hard breakdown. Degradation begins when traps start to form within the gate oxide. Initially the number of traps is very small and their impact on the device is negligible; this is the soft breakdown stage. As the number of traps increases, they form conductive paths within the oxide. Repeated conduction generates excessive heat along such a path, leading to thermal runaway. Within the breakdown region the gate dielectric melts, releasing oxygen and forming a conductive path that spans the entire oxide layer. This latter phase is termed hard breakdown.
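Because trap formation is random, oxide breakdown times are usually treated statistically, and a Weibull distribution is a common choice for this. The sketch below assumes that choice; the scale `eta` and shape `beta` are hypothetical values selected only to show the calculation.

```python
import math

def weibull_failure_fraction(t, eta, beta):
    """Cumulative fraction of oxides broken down by time t (same unit as eta)."""
    return 1.0 - math.exp(-((t / eta) ** beta))

def time_to_failure_fraction(p, eta, beta):
    """Time at which a fraction p (0 < p < 1) of devices has broken down."""
    return eta * (-math.log(1.0 - p)) ** (1.0 / beta)

# Illustrative (hypothetical) parameters: eta in years, beta dimensionless.
ETA_YEARS, BETA = 50.0, 1.5
t_01_percent = time_to_failure_fraction(0.001, ETA_YEARS, BETA)
print(f"0.1% of devices are expected to fail within {t_01_percent:.2f} years")
```

The example shows why a characteristic lifetime far beyond the product lifetime can still be insufficient: the reliability target is normally set on a small failure fraction, not on the scale parameter itself.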

### **2.2.4 Electromigration**

Electromigration arises from the diffusion of metal atoms along a conductor, driven by momentum transfer from electrons to metal atoms. The diffusing atoms tend to fill metal-ion vacancies at crystal defects while leaving a vacancy at the location they came from. Differences in atomic migration rates between segments of the conductor produce imbalances in mass distribution, resulting in the formation of voids or hillocks. What distinguishes electromigration from the other aging mechanisms is that it is an interconnect-related issue and its occurrence cannot be predicted in advance. Electromigration must therefore be considered during the physical design of integrated circuits to ensure their reliability.
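Although an individual electromigration failure cannot be predicted, physical design rules are commonly derived from a statistical lifetime estimate. A widely used form is Black's equation, $\mathrm{MTTF} = A\,J^{-n}\exp(E_a / kT)$, where $J$ is the current density and $T$ the temperature. The sketch below uses it to relate an accelerated stress test to use conditions; the current exponent and activation energy are illustrative assumptions that depend on the metal and the process.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K

def em_lifetime_ratio(j_use, j_stress, temp_use_k, temp_stress_k,
                      current_exponent=2.0, activation_energy_ev=0.9):
    """Return MTTF(use) / MTTF(stress) according to Black's equation.

    MTTF = A * J**(-n) * exp(Ea / (k * T)); the prefactor A cancels in the ratio.
    current_exponent (n) and activation_energy_ev (Ea) are illustrative assumptions.
    """
    current_term = (j_stress / j_use) ** current_exponent
    thermal_term = math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K
        * (1.0 / temp_use_k - 1.0 / temp_stress_k)
    )
    return current_term * thermal_term

# Stress test at 3x the use current density and 300 C versus use at 105 C.
ratio = em_lifetime_ratio(j_use=1.0, j_stress=3.0,
                          temp_use_k=378.0, temp_stress_k=573.0)
print(f"Use-condition lifetime is about {ratio:.3g}x the stress-test lifetime")
```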

## 2.3 Transient Faults and Soft errors

With the miniaturization of transistors, transient effects associated with radiation-induced strikes and latch-up have raised serious reliability concerns. Transient soft errors, which are computational errors occurring at runtime, do not cause permanent parameter degradation, nor are they related to process variations. However, the unpredictable nature of these transient upsets makes detecting and correcting them challenging.

The modeling/characterization of soft errors involves a three-step process:

- **Understanding the Radiation Environment**

The first step is to understand the radiation environment surrounding the electronic components. This includes studying how the atmosphere interacts with radiation fluxes and exploring the physics of nuclear fission and related phenomena.

- **Particle Interaction**

In the second step, the probability of particles, relative to their fluxes, interacting with a specific active volume of a transistor or similar electronic component is assessed. This evaluation helps in calculating the probability of radiation-induced incidents affecting the device.

- **Critical Charge Calculation**

The third step involves determining the critical charge required to disrupt a particular type of electronic device. This critical charge value is essential for assessing the device's vulnerability to soft errors caused by radiation-induced incidents.
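The three steps above can be tied together with a first-order sketch. It assumes a commonly used empirical form in which the soft error rate scales as flux × sensitive area × $\exp(-Q_{crit}/Q_s)$, and it approximates the critical charge as node capacitance times supply voltage. Both are simplifications, and every numerical value below is hypothetical.

```python
import math

def critical_charge_fc(c_node_ff, vdd_v):
    """Step 3: first-order critical charge, Qcrit ~ C_node * Vdd (fF * V = fC)."""
    return c_node_ff * vdd_v

def relative_ser(flux, area, q_crit_fc, q_collect_fc):
    """Relative soft error rate in arbitrary units.

    flux         -- step 1: particle flux of the radiation environment
    area         -- step 2: sensitive area that particles can interact with
    q_collect_fc -- charge collection efficiency parameter Qs (assumed value)
    """
    return flux * area * math.exp(-q_crit_fc / q_collect_fc)

# Same node capacitance and environment, two supply voltages.
ser_high_vdd = relative_ser(1.0, 1.0, critical_charge_fc(2.0, 1.0), q_collect_fc=1.0)
ser_low_vdd = relative_ser(1.0, 1.0, critical_charge_fc(2.0, 0.7), q_collect_fc=1.0)
print(f"SER increase when Vdd drops from 1.0 V to 0.7 V: {ser_low_vdd / ser_high_vdd:.2f}x")
```

Under these assumptions, lowering the supply voltage or the node capacitance reduces the critical charge and raises the error rate exponentially, which is consistent with the growing concern about soft errors in scaled devices.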

Although recent advances in semiconductor processing have significantly improved process reliability, physical defects such as shorts, opens, and resistive bridges can still occur, as shown in Fig. 2.1. These defects can be resolved through process debugging or detected at test time using standard test generation techniques.



Figure 2.1: Faults present in scaled device [1]

The usage of transistors for a prolonged period of time can result in 'stuck-at-zero' type faults, generated due to hard dielectric breakdown at the gate of NMOS transistors.

As oxide thickness decreases, energetic particles are increasingly likely to punch through the dielectric film, causing Single Event Gate Rupture (SEGR) or Single Event Burnout (SEB) and permanently damaging the transistor. SEB and SEGR are radiation-induced effects observed in power semiconductor devices, and they can destroy the device by driving it into a high-current state. In brief, a single high-energy particle striking a sensitive region of the device, such as the gate oxide, can trigger unintended current flow that short-circuits the collector and emitter terminals of a power transistor, causing it to fail by conducting excessive current.

## 2.4 Reliability-aware analog circuit design approaches

In analog circuit design, reliability is a primary concern because many analog circuits fail to meet their specifications over their prescribed lifespan, mainly due to degradation of the threshold voltage ($V_{th}$) and the device transconductance ($g_m$).

Reliability-aware analog circuit design approaches fall into two categories: precautionary approaches and sense-and-react approaches.

Table 2.1: Reliability-aware analog circuit design approaches

| Precautionary Approaches | Sense and React Approaches                 |
|--------------------------|--------------------------------------------|
| Over-Design              | Adaptive Biasing                           |
| Robust Design            | Reconfigurable Design (Digitally Assisted) |
| Topology Selection       |                                            |

### 2.4.1 Precautionary Approaches

Precautionary methods are based on designing the circuits with a consideration of how time-based degradation mechanisms can impact circuit performance, and they incorporate proactive measures to secure the intended product lifespan.

#### Overdesign

Over-design (Fig. 2.2) improves the robustness of analog circuits against aging-induced degradation by including extra margins, or guard bands, in the circuit specifications. In essence, the design requirements are met with a large margin, so that the circuit's performance remains within the original specifications even if transistor parameters change.

Without an accurate degradation model, however, the guard bands may be larger than necessary and penalize performance. Over-design also increases power dissipation and chip area, and therefore the overall production cost. It should thus be treated as a last resort for addressing time-based degradation.



Figure 2.2: Illustration of the differences between a nominal design and over-design (P1 and P2 represent performance metrics.) [1]
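The guard-band idea can be expressed as a simple calculation: the time-zero target is set so that the metric still meets the specification after the expected end-of-life loss. The sketch below assumes a single performance metric with a known fractional loss; the function name, the transconductance example, and all numbers are hypothetical.

```python
def overdesign_target(spec_min, eol_loss, extra_margin=0.0):
    """Time-zero design target so that the metric still meets spec_min at end of life.

    spec_min     -- minimum acceptable value of the metric
    eol_loss     -- assumed fractional loss over the lifetime, in [0, 1)
    extra_margin -- additional fractional guard band for model uncertainty
    """
    if not 0.0 <= eol_loss < 1.0:
        raise ValueError("eol_loss must be in [0, 1)")
    return spec_min * (1.0 + extra_margin) / (1.0 - eol_loss)

gm_spec_ms = 2.0  # required transconductance in mS (illustrative)
gm_target_ms = overdesign_target(gm_spec_ms, eol_loss=0.10, extra_margin=0.05)
gm_end_of_life_ms = gm_target_ms * (1.0 - 0.10)
overdesign_percent = 100.0 * (gm_target_ms / gm_spec_ms - 1.0)

print(f"Design for {gm_target_ms:.2f} mS at time zero "
      f"({overdesign_percent:.1f}% above spec); "
      f"end-of-life value is {gm_end_of_life_ms:.2f} mS")
```

The `extra_margin` term stands for the additional guard band a designer adds when the degradation model is uncertain, and it is this term that grows, along with power and area, when no accurate model is available.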

#### **Robust design**

The robust design approach (Fig. 2.3) identifies a robust design point within the design space, so that the solution meets both performance and reliability objectives.

The main drawback of this method is its high computational cost. Finding a solution that meets both performance and reliability criteria is a highly complex task that requires multiple design iterations. Because designer effort and design time are the most costly parts of the overall expense, the approach becomes economically inefficient when carried out manually. Automatic sizing algorithms are used to address this issue.



Figure 2.3: Illustration of the differences between a nominal design and robust-design [1]

#### **Topology Selection**

This alternative approach to reliability-aware analog circuit design operates at the system level. After aging analysis and sensitivity analysis with respect to process, voltage, and temperature (PVT) variations have been carried out for every candidate circuit topology, a total reliability space is constructed, consisting of three primary groups.

- **Group 1:** Solutions that satisfy both the aging and the PVT sensitivity analyses. In terms of total reliability, this is the most robust group.

- **Group 2:** Solutions that satisfy only one of the two analyses. It has two subgroups:
  - The first subgroup contains solutions that satisfy the aging analysis but are sensitive to PVT changes.
  - The second subgroup contains solutions that satisfy the PVT sensitivity analysis but are heavily affected by aging.

- **Group 3:** Solutions that are both sensitive to PVT changes and heavily affected by aging.

The designer's primary goal should be to select a topology from the most robust set of solutions, which here is the first group. It is quite possible, however, that no solution is found within this set; in that case, solutions from the two subgroups of the second group should be prioritized. When choosing a topology, designers may have to trade computational cost and reliability against circuit performance. This drawback could be addressed by analog intellectual property (IP) blocks, which offer many solutions for various circuit topologies together with their experimentally validated specifications. If the reliability data for these analog IP blocks is included in the reported properties list, the topology can be chosen by a design automation system, which reduces design time.
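The grouping and selection rule described above is simple enough to automate. The following sketch assumes each candidate topology has already been labeled with pass/fail results from the aging and PVT sensitivity analyses. The class, function names, and example topologies are hypothetical, and the preference for the aging-robust subgroup over the PVT-robust one is an assumption made here, since the text does not rank the two subgroups.

```python
from dataclasses import dataclass

@dataclass
class Topology:
    name: str
    passes_aging: bool  # result of the aging analysis
    passes_pvt: bool    # result of the PVT sensitivity analysis

def reliability_group(t):
    """Map a topology to its group in the total reliability space."""
    if t.passes_aging and t.passes_pvt:
        return "1"
    if t.passes_aging:
        return "2a"  # aging-robust but PVT-sensitive
    if t.passes_pvt:
        return "2b"  # PVT-robust but affected by aging
    return "3"

def select_topology(candidates):
    """Pick from Group 1 if possible, otherwise from Group 2; never from Group 3."""
    preference = ["1", "2a", "2b"]  # ordering of 2a before 2b is an assumption
    eligible = [t for t in candidates if reliability_group(t) in preference]
    if not eligible:
        return None
    return min(eligible, key=lambda t: preference.index(reliability_group(t)))

candidates = [
    Topology("telescopic", passes_aging=False, passes_pvt=False),
    Topology("folded_cascode", passes_aging=False, passes_pvt=True),
    Topology("two_stage", passes_aging=True, passes_pvt=False),
]
chosen = select_topology(candidates)
print(chosen.name if chosen else "no acceptable topology")
```

With reliability data attached to analog IP blocks, the `passes_aging` and `passes_pvt` fields could be read from the reported properties list rather than recomputed for each design.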

### **2.4.2 Sense and React Approaches**

The idea behind the Sense and React (SR) technique is to detect a degradation in circuit performance and then trigger a healing mechanism that restores it. This idea is illustrated in Fig. 2.4.



Figure 2.4: Illustration of self-healing concept [2]

There are two types of Sense and React approaches:

- Adaptive Biasing Solutions.
- Re-configurable / Digitally-Assisted Analog Circuit Design.

#### **Adaptive Biasing Techniques**

Degradation over time primarily impacts biasing voltages and currents, resulting in changes to small-signal parameters. The implementation of adaptive biasing solutions can reduce performance degradation resulting from variations in biasing conditions. Some techniques are discussed below:

- **Use of Body effect of transistors**

Variations in $V_{th}$ can be reduced by exploiting the body bias of transistors. This involves applying a positive voltage to the bulk terminal of an NMOS transistor so that $V_{BS} > 0$ (or a negative voltage to the bulk of a PMOS transistor so that $V_{BS} < 0$). The body effect then reduces the magnitude of $V_{th}$. If this bias is adjusted adaptively throughout the lifespan of the transistor, the threshold voltage can be held constant, compensating for the degradation. Although the approach is sound in theory, it raises several practical difficulties. First, only triple-well processes permit the required bulk voltage variation, whereas twin-well processes impose a single common body-bias voltage. Second, the on-chip degradation of each individual device is very difficult to detect and adjust for. As a result, adaptive body biasing can be applied to only a small number of critical transistors.

- **Use of a Self-biasing Structure**

An alternative, non-invasive way to implement adaptive biasing is to use a self-biasing structure. Consider a basic differential amplifier comprising an NMOS current mirror, a PMOS differential pair, and a PMOS tail transistor, as illustrated in Fig. 2.5. As a result of time-based degradation, the drain-source voltage ($V_{DS}$) across the current mirror drops over time. If this voltage is connected to the gate terminal of the PMOS tail transistor, the magnitude of the tail transistor's gate-source voltage ($V_{GS}$) increases linearly in response to the degradation. This makes the circuit robust to aging, but it requires a very rigid biasing scheme, so much of the design flexibility is lost.



Figure 2.5: Circuit implementation of self-biasing structure [2]
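Returning to the body-bias technique described earlier, the required compensation voltage can be estimated from the standard long-channel body-effect expression. This is a simplifying assumption, since short-channel devices deviate from it. For an NMOS transistor with zero-bias threshold $V_{th0}$, body-effect coefficient $\gamma$, and Fermi potential $\phi_F$:

$$
V_{th} = V_{th0} + \gamma\left(\sqrt{2\phi_F - V_{BS}} - \sqrt{2\phi_F}\right)
$$

Requiring the forward body bias to cancel an aging-induced shift $\Delta V_{age}$ gives

$$
\gamma\left(\sqrt{2\phi_F} - \sqrt{2\phi_F - V_{BS}}\right) = \Delta V_{age}
\quad\Longrightarrow\quad
V_{BS} = 2\phi_F - \left(\sqrt{2\phi_F} - \frac{\Delta V_{age}}{\gamma}\right)^{2}
$$

which is valid for $\Delta V_{age} \le \gamma\sqrt{2\phi_F}$. As an illustration with assumed values $\gamma = 0.4\,\mathrm{V^{1/2}}$, $2\phi_F = 0.8\,\mathrm{V}$, and $\Delta V_{age} = 50\,\mathrm{mV}$, the required bias is $V_{BS} \approx 0.21\,\mathrm{V}$. In practice the usable forward bias is further limited, because the source-bulk junction must not be driven into conduction.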

#### **Re-configurable / Digitally-Assisted Analog Circuit Design**

Designing analog circuits in short-channel technologies is challenging, and meeting all specifications is difficult. This motivates the use of digital circuits to control analog structures, which can also ease the design limitations associated with time-based degradation.

#### **Replacement Technique**

This technique uses an additional transistor that is co-designed with the transistor under test and acts as a backup whenever the latter degrades beyond a specified threshold. Its drawback is that it is very difficult to determine whether the threshold has been exceeded, because the threshold varies with the process fluctuations experienced by the actual transistor. In addition, the redundant transistor consumes chip area and adds loading to the node it is connected to.

#### **Redundancy Technique**

The redundancy technique addresses bias current degradation. For the tail transistor of an analog circuit, a set of current sources is added in parallel so that, when the bias current diminishes due to degradation of the tail transistor, an additional current source is activated alongside the existing one and restores the initial current level. Multiple copies of such current mirrors can be used when significant degradation of the bias current is expected. The drawback is that the additional transistors decrease the output resistance of the overall current source and increase the total node capacitance, so the CMRR eventually drops. The approach also occupies considerable chip area and introduces loading effects. It is therefore expensive and recommended only for highly sensitive applications in which even a minor decline in performance is unacceptable.
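The digital decision behind the redundancy technique can be sketched as a small control function: given a sensed bias current, it returns how many spare parallel sources should be switched on. This is a minimal illustration and not the implementation used in the cited works. The function name, the tolerance, and the current values are hypothetical, and the sketch ignores the aging of the spare sources themselves.

```python
def sources_to_enable(i_measured, i_target, i_unit, n_spare, tolerance=0.02):
    """Number of spare parallel current sources to switch on.

    i_measured -- sensed bias current of the degraded tail source
    i_target   -- initial (time-zero) bias current
    i_unit     -- current added by each spare source
    n_spare    -- number of spare sources available
    tolerance  -- fractional deficit that is accepted without reacting
    All currents must use the same unit.
    """
    deficit = i_target - i_measured
    if deficit <= tolerance * i_target:
        return 0
    needed = int(deficit / i_unit + 0.5)  # nearest whole number of unit sources
    return min(needed, n_spare)

# Illustrative values in microamperes: 100 uA target, 5 uA per spare source.
for measured in (99.0, 91.0, 50.0):
    n = sources_to_enable(measured, i_target=100.0, i_unit=5.0, n_spare=4)
    print(f"measured {measured:5.1f} uA -> enable {n} spare source(s)")
```

The last case shows the limit of the technique: once the deficit exceeds what the spare sources can supply, the bias current can no longer be fully restored, which is why several copies are needed when large degradation is expected.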

## **2.5 Monitoring Techniques of Reliability Degradations**

Adaptive and digitally assisted techniques for mitigating time-based degradation require monitoring. This can be achieved in two ways:

### **2.5.1 Direct Sensing**

Direct sensing measures the desired parameters directly, in the form of voltage, current, or frequency. However, it often interferes with normal circuit operation, resulting in performance losses. One way to mitigate this is periodic sensing, in which measurements are taken at regular intervals. To estimate how much degradation has occurred, the measurements are compared with a reference design.

### **2.5.2 Indirect Sensing**

Indirect sensing is an alternative monitoring technique that observes a different part of the circuit instead of the transistor under test. It can be used when the device under test is very sensitive and the required parameters cannot feasibly be sensed directly. The differential pair of an operational amplifier is an example. In certain cases, the output of the transistor under test is correlated with the output of another transistor located in a different part of the circuit, and the monitoring system can be designed around the output of that transistor. Indirect sensing can be more difficult, however, because this correlation must remain constant across a range of process, voltage, and temperature conditions.
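Whether a proxy transistor is suitable for indirect sensing can be checked at characterization time by testing how well its shift tracks that of the device under test across PVT corners. The sketch below computes the Pearson correlation and a least-squares line over hypothetical corner data. The data values, the acceptance threshold, and the function names are assumptions made only for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def fit_line(xs, ys):
    """Least-squares slope and intercept of ys as a function of xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical V_th shifts (mV) at five PVT corners.
proxy_shift_mv = [2.0, 4.1, 6.0, 8.2, 9.9]   # transistor that is monitored
dut_shift_mv = [4.1, 8.0, 12.2, 16.1, 20.0]  # transistor under test

r = pearson_r(proxy_shift_mv, dut_shift_mv)
slope, intercept = fit_line(proxy_shift_mv, dut_shift_mv)
usable = abs(r) >= 0.95  # acceptance threshold is an assumption

print(f"correlation r = {r:.4f}, usable as proxy: {usable}")
print(f"estimated DUT shift for a 5.0 mV proxy shift: {slope * 5.0 + intercept:.1f} mV")
```

If the correlation falls below the chosen threshold at some corners, the proxy reading no longer identifies the state of the device under test, which is the difficulty with indirect sensing noted above.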

# **Chapter 3**

## **Conclusion**

The aging of transistors gradually degrades the performance of analog circuits over time and eventually results in circuit malfunction. Device variability and reliability therefore play a crucial role in optimizing the lifespan, performance, power, and area of nanoscale CMOS circuits. This report briefly reviewed different approaches within the analog circuit domain that address the degradation of circuit reliability over time. The solutions presented focus on changes at the circuit and system levels that keep the product's performance above the specified limits throughout its lifespan. Of these, the Sense and React approach appears to be the more reliable, and indirect sensing is preferred when loading effects are considered. It is also crucial to consider the influence of process variations, not only on the circuits under test but also on the calibration units. Computer-aided design tools are recommended to reduce the computational cost of reliability simulations.

# References

- [1] M. A. Alam, K. Roy, and C. Augustine, “Reliability- and process-variation aware design of integrated circuits — a broader perspective,” in *2011 International Reliability Physics Symposium*, 2011, pp. 4A.1.1–4A.1.11.
- [2] M. B. Yelten, P. D. Franzon, and M. B. Steer, “Surrogate-model-based analysis of analog circuits—part ii: Reliability analysis,” *IEEE Transactions on Device and Materials Reliability*, vol. 11, no. 3, pp. 466–473, 2011.
- [3] E. Afacan, M. Berke Yelten, and G. Dündar, “Review: Analog design methodologies for reliability in nanoscale CMOS circuits,” in *2017 14th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD)*, 2017, pp. 1–4.
