

# A DNN-Based Compact Modeling Technique for GAA Si NS FETs and Its Application in CMOS Circuit Simulation\*

Rajat Butola<sup>1,2</sup>, Sekhar Reddy Kola<sup>1,2</sup>, and Yiming Li<sup>1-7,\*</sup>

<sup>1</sup>Parallel and Scientific Computing Laboratory; <sup>2</sup>Electrical Engineering and Computer Science International Graduate Program; <sup>3</sup>Institute of Communications Engineering; <sup>4</sup>Institute of Biomedical Engineering; <sup>5</sup>Institute of Pioneer Semiconductor Innovation; <sup>6</sup>Institute of Artificial Intelligence Innovation; <sup>7</sup>Department of Electronics and Electrical Engineering.

National Yang Ming Chiao Tung University, Hsinchu 300093, Taiwan

\*Tel: +886-3-5712121 ext. 52974; Fax: +886-3-5726639; \*E-mail: yml@nycu.edu.tw

## Abstract

This paper presents a novel machine learning (ML) based compact modeling framework using a dynamic neural network (DNN). The framework analyzes the effects of process variation (PV) in gate-all-around (GAA) silicon (Si) nanosheet (NS) transistors. The DNN model is trained on a composite dataset from TCAD and SPICE simulations, combining the strengths of both techniques to yield a fast and accurate compact model (CM). The model is then implemented in SPICE, and its accuracy is evaluated through simulations of an inverter and logic gates. Errors below 1% indicate the strong potential of the proposed model for emerging devices.

## 1. Introduction

Owing to favorable electrical properties such as improved electrostatics and excellent short-channel control, NS devices have been extensively used in IC fabrication [1]. However, highly scaled devices involve complex underlying physics [2] [3]. Traditional CM development therefore becomes a complicated, time-consuming task that requires specialized device expertise. Moreover, for such nanoscale devices PV becomes more significant, and conventional CMs lack robustness to PV [4] [5]. To overcome these challenges, we propose a DNN-based CM trained on a hybrid dataset from TCAD and SPICE simulations, for two main reasons: (1) TCAD provides reliable device physics, but its data generation is time-consuming; in contrast, (2) SPICE offers rapid simulation at a slight cost in accuracy.

## 2. Data Generation and DNN Framework

The 3D schematic view of the GAA Si NS device structure is shown in Fig. 1(a). Before running simulations to generate the dataset for the ML model, the TCAD model is finely calibrated against experimental data for both nFET and pFET NS devices, as shown in Fig. 1(b). Similarly, we develop nominal SPICE device models by calibrating their design parameters against the accurate TCAD model, as shown in Figs. 1(c) and (d). The nominal device parameters are listed in Table I. The variability sources, i.e., gate length ( $L_G$ ), NS width ( $W_{NS}$ ), and NS thickness ( $T_{NS}$ ), are varied by  $\pm 10\%$  around their nominal values to obtain fluctuated devices. The terminal biases  $V_G$  and  $V_D$  are swept from 0 to 0.7 V in steps of 0.02 V, and a dataset of approximately 2500 devices is generated.
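The sampling plan described above can be sketched in a few lines of Python. This is only an illustrative sketch: the nominal values come from Table I, while the uniform sampling within $\pm 10\%$, the function names `bias_points` and `sample_devices`, and the random seed are assumptions, since the text does not state how the fluctuated devices are drawn.

```python
import random

# Nominal geometry from Table I (nm).
NOMINAL = {"L_G": 16.0, "W_NS": 25.0, "T_NS": 5.0}
VARIATION = 0.10            # +/-10% around nominal, as stated in the text
V_MAX, V_STEP = 0.7, 0.02   # bias sweep range and step (V)


def bias_points(v_max=V_MAX, step=V_STEP):
    """Return the bias values 0, step, ..., v_max (inclusive)."""
    n_steps = int(round(v_max / step))
    return [round(i * step, 10) for i in range(n_steps + 1)]


def sample_devices(n_devices, seed=0):
    """Draw fluctuated geometries; uniform sampling is an assumption."""
    rng = random.Random(seed)
    return [
        {
            name: value * (1.0 + rng.uniform(-VARIATION, VARIATION))
            for name, value in NOMINAL.items()
        }
        for _ in range(n_devices)
    ]


biases = bias_points()
devices = sample_devices(2500)
print(len(biases), "bias points per terminal;", len(devices), "devices")
```

Each sampled geometry would then be simulated over the $V_G$ and $V_D$ sweep in TCAD and SPICE to produce the training currents.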

The proposed DNN model is illustrated in Fig. 2, and the overall workflow is described in Fig. 3. The hyperparameters of the DNN model are listed in Table II. The DNN model consists of three ANNs and works in two phases. The first is the general learning phase, in which ANN<sub>1</sub> is trained on SPICE data to produce initial weights. However, these weights lack accurate representation power because of the inaccuracy of the SPICE data. Therefore, in the second phase, i.e., the adaptive weighting phase, the TCAD data are included to enhance the accuracy. In this phase, new weights are generated from three different drain currents ( $I_D$ ), as shown in Fig. 2. The new weights are then learned as functions of the input parameters by two further ANNs, i.e., ANN<sub>2</sub> and ANN<sub>3</sub>. The final output of the DNN model is the sum of the values predicted by ANN<sub>2</sub> and ANN<sub>3</sub>, each multiplied by its corresponding  $I_D$  value, calculated as follows:

$$I_{D,DNN} = I_{D,ANN} \times W_{ANN,Pred} + I_{D,TCAD} \times W_{TCAD,Pred}.$$
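As a minimal sketch, the weighted sum above reduces to a single expression once the two currents and the two predicted weights are available. The function name `combine_currents` and the numerical values are hypothetical and serve only to illustrate the arithmetic.

```python
def combine_currents(i_d_ann, i_d_tcad, w_ann_pred, w_tcad_pred):
    """Weighted sum I_D,DNN = I_D,ANN * W_ANN,Pred + I_D,TCAD * W_TCAD,Pred."""
    return i_d_ann * w_ann_pred + i_d_tcad * w_tcad_pred


# Illustrative values only (not taken from the paper), currents in amperes.
i_d_dnn = combine_currents(
    i_d_ann=1.0e-5, i_d_tcad=1.2e-5, w_ann_pred=0.25, w_tcad_pred=0.75
)
print(i_d_dnn)
```

When the two weights sum to one, the output is an interpolation between the two current estimates; the source does not state whether such a constraint is imposed.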

Convergence is a critical issue in circuit simulation and must be addressed. The following conversion function is therefore applied:

$$I_{D,DNN} = V_D \times \exp(I_{D,DNN}).$$
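One possible reading of this conversion function is that the raw model output is passed through an exponential and scaled by $V_D$. Under that assumption, the current is exactly zero at $V_D = 0$ and keeps the sign of $V_D$, which are properties a circuit simulator typically relies on. The sketch below illustrates this reading only; the name `to_drain_current` and the interpretation of the right-hand-side term as a raw (pre-conversion) output are assumptions.

```python
import math


def to_drain_current(y_raw, v_d):
    """Map a raw model output y_raw to a current as V_D * exp(y_raw).

    Assumption: y_raw is the unconverted output of the DNN model.
    """
    return v_d * math.exp(y_raw)


# The current vanishes at zero drain bias for any raw output.
print(to_drain_current(-10.0, 0.0))
# For V_D > 0 the current is strictly positive.
print(to_drain_current(-10.0, 0.7) > 0.0)
```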

## 3. Results and Discussion

The data are first split into training and testing sets. After training, the DNN model is evaluated on the TCAD-based test dataset, and its predictions are verified against the simulated values. The DNN-predicted and TCAD-simulated transfer characteristics ( $I_D-V_G$ ) of the NS nFET and pFET devices for different PV sources are shown in Figs. 4(a) and (b) in both linear and logarithmic scales. The proposed model successfully captures the non-linear impact of process variability. Fitting accuracy is quantified by the RMS error, which is approximately 0.1% for both nFET and pFET devices. Similarly, the output characteristics ( $I_D-V_D$ ) are estimated using the DNN model, and the results are presented in Figs. 5(a) and (b). The predictions agree well with the simulated curves, with an RMS error below 0.4%.
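The text does not give the exact definition of the RMS error. A common choice for a percentage figure is the relative RMS error, sketched below as an assumption; the function name `relative_rms_error_percent` and the sample values are hypothetical.

```python
import math


def relative_rms_error_percent(predicted, reference):
    """Relative RMS error in percent between two equal-length sequences.

    Reference values must be nonzero (e.g. skip bias points where I_D = 0).
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [((p - r) / r) ** 2 for p, r in zip(predicted, reference)]
    return 100.0 * math.sqrt(sum(squared) / len(squared))


# Illustrative values only: every prediction is 1% above its reference.
error = relative_rms_error_percent([1.01e-5, 2.02e-5], [1.0e-5, 2.0e-5])
print(f"{error:.3f} %")
```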

We further explore the potential of the DNN model for circuit analysis by porting it into the SPICE simulator. To develop the neural compact model, the weights and biases of the trained DNN model are extracted and translated into Verilog-A code. The DNN-CM is verified on several digital circuits, namely an inverter and logic gates (OR, NOR, AND, and NAND), by comparing its results against SPICE-generated samples. Fig. 6(a) shows the NS-based inverter circuit, and Fig. 6(b) compares the voltage transfer characteristics (VTC) obtained from SPICE and from the DNN model. Similarly, Figs. 7(a)-(f) compare SPICE and DNN model results for the logic gates; the model shows good overall agreement with the SPICE samples.
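The export step can be pictured as a small script that turns the trained weights and biases of a one-hidden-layer tanh network into Verilog-A-style assignment statements. This is a hedged sketch, not the authors' tool: the function `emit_verilog_a`, the variable names `h0`, `h1`, `y`, and the example weights are hypothetical, and the surrounding Verilog-A module and variable declarations are omitted.

```python
def emit_verilog_a(w_hidden, b_hidden, w_out, b_out, inputs):
    """Return Verilog-A-style assignments for a one-hidden-layer tanh network.

    w_hidden[j][i] is the weight from input i to hidden neuron j.
    """
    lines = []
    for j, (row, bias) in enumerate(zip(w_hidden, b_hidden)):
        terms = " + ".join(f"({w:.6e})*{x}" for w, x in zip(row, inputs))
        lines.append(f"h{j} = tanh({terms} + ({bias:.6e}));")
    out_terms = " + ".join(f"({w:.6e})*h{j}" for j, w in enumerate(w_out))
    lines.append(f"y = {out_terms} + ({b_out:.6e});")
    return lines


# Illustrative two-input, two-neuron example (weights are made up).
lines = emit_verilog_a(
    w_hidden=[[0.5, -1.0], [2.0, 0.25]],
    b_hidden=[0.1, -0.2],
    w_out=[1.5, -0.5],
    b_out=0.0,
    inputs=["V_G", "V_D"],
)
print("\n".join(lines))
```

Generating the statements from the stored weights, rather than typing them by hand, keeps the compact model in step with the trained network whenever it is retrained.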

## 4. Conclusions

A novel ML-based DNN model is proposed to comprehensively analyze the process variability sources of NS devices and circuits. The model combines highly accurate TCAD and time-efficient SPICE techniques to obtain a fast and accurate CM with an RMS error below 1%.

## Acknowledgments

This work was supported in part by the National Science and Technology Council (NSTC), Taiwan, under Grant MOST 111-2221-E-A49-181 and in part by the “2022 Qualcomm Taiwan Research Program (NYCU)” under Grant NAT-487835 SOW.

## References

- [1] C.-Y. Huang et al., IEDM, 2020, pp. 20.6.1-20.6.4.
- [2] H. Xu et al., IEEE TED, 2022, pp. 3568-3574.
- [3] S. Sato et al., ECS Trans. 2015, 66, 171.
- [4] X. Feng et al., Jpn. J. Appl. Phys. 2013, 52, 04CC17.
- [5] J. Wang et al., IEEE TED, 2021, pp. 1318-1325.



**Fig. 1** (a) Schematic of the 3D device structure of the gate-all-around (GAA) silicon (Si) nanosheet (NS) field effect transistor (FET). (b) Simulated and experimental characteristic curves used for calibration of both GAA Si NS nFET and pFET devices. Solid lines represent the simulation, whereas symbols represent the experimental data. (c) and (d) Nominal SPICE device models obtained by calibrating design parameters against the accurate TCAD model for NS nFET and pFET devices, respectively.

**Table I**  
List of gate-all-around (GAA) silicon (Si) nanosheet (NS) field effect transistor (FET) nominal device parameters corresponding to sub-3-nm technology node and crucial process variability sources.

| Parameters                                  | Value              |
|---------------------------------------------|--------------------|
| Gate length ( $L_G$ ) (nm)                  | 16                 |
| Channel doping ( $\text{cm}^{-3}$ )         | $6 \times 10^{17}$ |
| NS width ( $W_{NS}$ ) (nm)                  | 25                 |
| NS thickness ( $T_{NS}$ ) (nm)              | 5                  |
| $S_{\text{ext}}/D_{\text{ext}}$ Length (nm) | 5                  |
| EOT (nm)                                    | 0.66               |



**Fig. 2** Illustration of the proposed DNN model architecture. It consists of three ANN models. First, ANN<sub>1</sub> is trained on SPICE samples. The predicted  $I_D$  along with the TCAD- and SPICE-simulated  $I_D$  are used to generate new weights. These new weights are then learned by the ANN<sub>2</sub> and ANN<sub>3</sub> models. Finally, the weighted sum is calculated, which gives the final output.

**Table II**  
List of hyperparameters utilized for the training of DNN model.

| Hyperparameters              | DNN                                                               |
|------------------------------|-------------------------------------------------------------------|
| (I/P, O/P) Dimension         | (5, 1)                                                            |
| No. of Hidden Layers         | 1                                                                 |
| No. of Neurons in each layer | ANN <sub>1</sub> = 15, ANN <sub>2</sub> = 5, ANN <sub>3</sub> = 5 |
| Activation Function          | Tanh                                                              |
| Learning Rate                | 0.001                                                             |
| Optimizer                    | Adam                                                              |
| Batch Size                   | 25                                                                |
| Epochs (Fixed)               | 3000                                                              |
| Early stopping               | 648                                                               |
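To give a sense of model size, the settings in Table II can be collected in a small configuration object and used to count trainable parameters. This sketch assumes that all three ANNs share the (5, 1) input/output dimension, have one hidden layer with bias terms, and differ only in hidden width; the class `AnnConfig` and the dictionary `TRAINING` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnConfig:
    name: str
    n_hidden: int
    n_inputs: int = 5    # Table II input dimension
    n_outputs: int = 1   # Table II output dimension

    def parameter_count(self):
        """Weights plus biases of a single-hidden-layer network."""
        hidden = self.n_inputs * self.n_hidden + self.n_hidden
        output = self.n_hidden * self.n_outputs + self.n_outputs
        return hidden + output


# Training settings as listed in Table II.
TRAINING = {
    "activation": "tanh",
    "learning_rate": 1e-3,
    "optimizer": "Adam",
    "batch_size": 25,
    "max_epochs": 3000,
}

configs = [AnnConfig("ANN1", 15), AnnConfig("ANN2", 5), AnnConfig("ANN3", 5)]
for cfg in configs:
    print(cfg.name, cfg.parameter_count())
```

Under these assumptions the three networks together hold fewer than 200 parameters, which keeps the exported compact model small.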



**Fig. 3** Schematic flowchart of the proposed methodology. The process starts with calibration of the TCAD and SPICE models before data generation; next, the DNN model is trained; finally, the weights of the DNN are extracted and replicated in the CM.



**Fig. 4** Comparison of  $I_D$ - $V_G$  characteristics between TCAD-simulated and DNN-predicted curves for various process variability sources. (a) and (b) show the  $I_D$ - $V_G$  curves for the pFET and nFET, respectively, in linear and log scales. The DNN achieves an RMSE below 0.2%.

