

# Noise and Process Variation Tolerant, Low-Power, High-Speed, and Low-Energy Full Adders in CNFET Technology

Yavar Safaei Mehrabani and Mohammad Eshghi, *Member, IEEE*

**Abstract**—In this paper, a number of novel 1-bit full adder cells using carbon nanotube field-effect transistor devices are presented. First, several two-input XOR/XNOR circuits are proposed, and they are then employed to form 1-bit full adders. In total, five full adders with driving power and one without driving power are proposed, each of which has its own merits. Simulations under supply voltage scaling and different load conditions confirm the superiority of the proposed cells over previously reported ones in terms of power, delay, power-delay product (PDP), and energy-delay product (EDP). Embedding the proposed full adders in larger circuits with a wide word length, such as a ripple carry adder (RCA), also shows that they have better power, speed, and PDP than their counterparts. Furthermore, the susceptibility of the full adders to both input noise and process variations (diameter deviations of carbon nanotubes) is studied. In terms of noise, the proposed cells are closely competitive with their counterparts, and they are robust against high-amplitude noise. In terms of process variation, the proposed cells with driving power are the most robust compared with their counterparts.

**Index Terms**—Carbon nanotube field-effect transistor (CNFET), full adder, low-energy, noise, process variation.

## I. INTRODUCTION

Explosive growth in the use of battery-powered portable electronic systems, such as laptops, notebooks, tablets, personal communication systems, and personal digital assistants, demands circuits that combine low power consumption with high operating speed, so that energy is saved while the input data are processed quickly. Power consumption plays a pervasive role in VLSI systems. Higher power dissipation generates more heat that must be removed from the system. Thus, it not only shortens the battery life but also calls for a stronger fan to cool the system. Therefore, power consumption directly affects the cost, weight, size, and battery life of a system. On the other hand, operation speed is vital in today's high-speed electronic applications. Power consumption is a limiting factor in increasing the clock rate and density of circuits. A quantitative metric captures the tradeoff between the power dissipation and the speed of a circuit: the power-delay product (PDP) is widely used as a figure of merit to evaluate the overall performance of a designed circuit.

Manuscript received October 15, 2015; revised January 9, 2016; accepted February 15, 2016.

Y. Safaei Mehrabani is an Independent Researcher (e-mail: advancedcomparch@gmail.com).

M. Eshghi is with the Faculty of Electrical and Computer Engineering, Shahid Beheshti University, Tehran 198396-3113, Iran (e-mail: m-eshghi@sbu.ac.ir).

Color versions of one or more of the figures in this paper are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/TVLSI.2016.2540071

Many VLSI applications, such as digital signal processing, image and video processing, microprocessors, and microcontrollers, use arithmetic operations extensively. Among the various arithmetic operations, addition, subtraction, multiplication, and multiply and accumulate are the most widely used. All of these operations can be realized using the addition function. Therefore, a 1-bit full adder cell still plays an important role in determining the performance metrics of an entire system. Furthermore, a 1-bit full adder cell is used in the arithmetic logic unit of a processor, in the floating-point unit, and in address generation for cache or memory access. To date, many full adder cells with different logic styles have been reported in the literature using the metal–oxide–semiconductor FET (MOSFET) technology, each having its own merits and demerits [1]–[11].

In 1965, Moore [12] predicted that the number of transistors per square inch of an integrated circuit doubles nearly every 18 months. This law implies continuous downscaling of silicon bulk transistors (i.e., MOSFETs). Historically, transistor dimensions have been scaled down to the point that they have now entered the nanometer regime. Several issues limit further shrinking of MOSFETs. For instance, nanometer silicon bulk transistors experience new problems, such as short channel effects, drain-induced barrier lowering (DIBL), decreased gate controllability, and the hot electron effect [13]. As a result, there has been strong motivation to develop new devices, such as the carbon nanotube field-effect transistor (CNFET), the single electron transistor, and the FinFET. It is worthwhile to note that the CNFET is one of the promising successors to the conventional MOSFET due to its unique electrical characteristics, such as ballistic transport and low OFF-current, which enable high-speed and low-power circuit designs [14]–[16].

Carbon nanotubes (CNTs) are one-atom-thick sheets of graphite, called graphene, that are rolled into tubes. They are classified into two groups called single-wall CNT (SWCNT) and multiwall CNT (MWCNT). SWCNTs consist of one cylinder, whereas MWCNTs consist of more than one cylinder. An SWCNT can behave as a semiconductor or a conductor depending on its chirality vector. The chirality vector determines the angle of the arrangement of carbon atoms along the CNT and is denoted by the integer pair $(n_1, n_2)$ [16]. If $|n_1 - n_2| = 3K$ ($K$ is an integer) or $n_1 = n_2$, then the CNT is metallic; otherwise, it is semiconducting [16]. The CNT diameter, in nanometers, is obtained using the following [16]:

$$D_{\text{CNT}} = 0.0783 \times \sqrt{n_1^2 + n_2^2 + n_1 n_2}. \quad (1)$$

CNFETs utilize SWCNTs as transistor channels on the substrate. The CNT channel region is undoped, while the other regions (i.e., the source and drain regions) are heavily doped. As in MOSFETs, the drain current ($I_D$) is controlled by the voltage applied to the gate terminal of the CNFET. The threshold voltage of a MOSFET-like CNFET can be calculated using (2) [17], which shows that the threshold voltage is inversely proportional to the tube diameter

$$V_{TH} = \frac{0.43}{D_{\text{CNT}} (\text{nm})} (\text{V}). \quad (2)$$
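As a numerical illustration of (1) and (2), the following minimal Python sketch evaluates the diameter, the metallic/semiconducting condition, and the threshold voltage for a given chirality pair. The function names and the example chirality (19, 0) are illustrative assumptions and are not taken from the paper.

```python
import math


def cnt_diameter_nm(n1: int, n2: int) -> float:
    """CNT diameter in nanometers for chirality (n1, n2), following (1)."""
    return 0.0783 * math.sqrt(n1**2 + n2**2 + n1 * n2)


def is_metallic(n1: int, n2: int) -> bool:
    """Metallic when |n1 - n2| is a multiple of 3 (this also covers n1 == n2)."""
    return abs(n1 - n2) % 3 == 0


def threshold_voltage_v(n1: int, n2: int) -> float:
    """Threshold voltage in volts of a MOSFET-like CNFET, following (2)."""
    return 0.43 / cnt_diameter_nm(n1, n2)


if __name__ == "__main__":
    # (19, 0) is an assumed example chirality, chosen only for illustration.
    n1, n2 = 19, 0
    print(f"D_CNT = {cnt_diameter_nm(n1, n2):.4f} nm")
    print(f"metallic = {is_metallic(n1, n2)}")
    print(f"V_TH = {threshold_voltage_v(n1, n2):.3f} V")
```

For the assumed (19, 0) tube, the sketch gives a diameter of about 1.49 nm and a threshold voltage of about 0.29 V, and the tube is semiconducting because 19 is not a multiple of 3.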

In this paper, we use MOSFET-like CNFETs to develop novel 1-bit full adder cells. First, some novel two-input XOR/XNOR circuits are presented, and their performance is compared with that of other classical and state-of-the-art circuits. Then, using them, different novel structures are presented to construct 1-bit full adder cells, each having its own merits. Comprehensive computer simulations are conducted to scrutinize the performance of the proposed full adders in various situations. Simulation results confirm the superiority of the proposed cells over other designs. To make the comparisons fair, all MOSFET-oriented full adder cells have been simulated in the CNFET technology. We provide a detailed performance analysis for full adders with different logic styles. Because of the dissimilarities between the two technologies, performance rankings in CNFET technology can be the opposite of those in MOSFET technology. For instance, consider two full adders, FA1 and FA2, for which simulations give different results in the two technologies: in MOSFET technology, FA1 performs better, while in CNFET technology, FA2 performs better. Therefore, we aim to re-evaluate a set of full adders with different logic styles in CNFET technology.

The remainder of this paper is organized as follows. In Section II, different logic styles of full adder cells in the literature are reviewed in detail, and the benefits and bottlenecks of the cells are considered. In Section III, first, the structure of the novel two-input XOR/XNOR circuits is presented. Then, novel circuits are presented for a 1-bit full adder cell using these XOR/XNOR circuits. The simulation environment is described in Section IV. Simulation results and performance analysis are shown in Section V. Finally, Section VI provides the conclusion.

## II. ANALYSIS OF PREVIOUS FULL ADDERS

A full adder circuit provides a Sum and a Carry output ($C_{\text{out}}$) as the result of adding three input binary digits, named A, B, and $C_{\text{in}}$. The $C_{\text{out}}$ signal is passed to the next more significant position, if one exists. Since a full adder plays a key role in determining the performance parameters of an entire digital system, various designs have been reported in the literature over the years. Each of these designs has its own merits and demerits in terms of power consumption and speed.

In general, a full adder belongs to either a dynamic or a static style. Although a dynamic style full adder has a small on-chip area and high-speed operation compared with a static style full adder, it suffers from some inherent handicaps, such as susceptibility to leakage, charge sharing, high clock load, and low noise immunity [9]. Therefore, we do not include dynamic full adders in our discussions in this paper. Some recently published static style full adder cells are reviewed in detail in the following.

The full adder based on the classical standard complementary metal-oxide-semiconductor (CMOS) logic design style is reported in [1]. This cell is called C-CMOS and consists of 28 transistors. The C-CMOS design does not use the complements of the input signals, and therefore, the short-circuit current is reduced. Another benefit of C-CMOS is that it produces full voltage swing outputs. The critical path of the C-CMOS circuit contains five transistors, which results in a long propagation delay. In MOSFET technology, the ratio of the size of pMOS to nMOS should be almost three to have equal switching speed between them and a good noise margin for the circuit. This causes a large input capacitance in MOSFET technology and consequently more delay and power dissipation. On the other hand, since in CNFET technology both p-type (pCNFET) and n-type (nCNFET) transistors have equal current driving capability with the same transistor dimension [18], their sizes are set to be equal. Consequently, in the CNFET technology, the input capacitance of the C-CMOS cell will be small. Another complementary design, namely Mirror, is obtained by a clever change in the structure of C-CMOS. The Mirror full adder cell is reported in [2]. The Mirror uses the same number of transistors as the C-CMOS and dissipates almost equal power. The main difference is that the maximum propagation path of the Mirror consists of four transistors, which is shorter than that of the C-CMOS. Therefore, the Mirror design is expected to be faster than the C-CMOS design.

The transmission function full adder (TFA) is reported in [3]. The TFA circuit is based on transmission function theory. The maximum propagation path of the TFA contains four transistors. The TFA is constructed using 16 transistors. The power consumption of the TFA is expected to be lower than that of the C-CMOS and Mirror cells due to its lower transistor count and lower input capacitances. It is worth pointing out that the TFA does not have driving outputs, and its performance will significantly degrade either in the presence of large fan-outs or in a cascaded configuration. This is because the inputs of the circuit are coupled to its outputs.

The design of a transmission gate full adder (TGA), which is formed using TGs, is reported in [1]. The TGA comprises 20 transistors, and its critical path has four transistors. Like the TFA, the TGA cell lacks driving power because its inputs are coupled to its outputs. This lack of driving capability drastically degrades the performance of the TGA when it is employed in a cascaded structure or when there are large fan-outs. One solution to alleviate the performance degradation is to put buffers at the output nodes between consecutive cascaded stages. However, this approach results in transistor count overhead and removes the advantage of a small transistor count in large cascaded circuits. The TGA cell performs better than the TFA because it employs TGs at the output nodes instead of pass transistors. In fact, its delay and power dissipation are less than those of the TFA, especially in larger circuits.

The structure of a complementary pass-transistor logic TG (CPL-TG) full adder is reported in [4]. The CPL-TG not only is dual rail but also provides driving capability, at the expense of a large transistor count. It is therefore advantageous in cascaded circuits. To generate the XOR/XNOR functions, it uses CPL. Thus, the input capacitance is reduced because of the small number of transistors. On the other hand, it uses TGs to drive the static inverters at the output nodes. The leakage power of these inverters is therefore removed, because full voltage swing signals are applied to drive them. When the CPL-TG is used on its own, it contains 36 transistors; but when it is used in a cascaded structure, the transistor count decreases because the static inverters that provide complementary signals are eliminated. The critical path of the CPL-TG as a standalone unit consists of four transistors, while in a cascaded configuration it consists of three transistors. Therefore, the CPL-TG is expected to show good performance in larger structures.

The structure of the static energy recovery full adder (SERF) is reported in [5]. The SERF full adder contains only ten transistors (10T), three of which are along the critical path from the input to the output. To produce the Sum signal, it employs two cascaded XNOR circuits. It is worth noting that this XNOR circuit has a threshold-loss problem. In fact, the logic value high is equal to $V_{dd} - V_{tn}$ ($V_{tn}$ denotes the threshold voltage of the n-type transistor). Therefore, both the internal signals and the output signals (i.e., Sum and $C_{out}$) are nonfull swing. As a result, the SERF design is poorly suited to cascaded configurations. Due to the nonfull-voltage-swing signals of SERF, both the power dissipation and the delay are increased. The advantage of SERF is that it has the least number of transistors.

The structure of a 13A full adder cell is reported in [6]. Like the SERF, 13A also uses 10T in its structure. Both SERF and 13A cells use two cascaded two-input XNORs to provide the Sum output. The circuit for the first XNOR is the same in both SERF and 13A full adder cells. They use two different designs for the XNOR circuit at the second stage. The SERF uses the CMOS style to realize a two-input XNOR circuit, while the 13A uses a combination of CMOS (to realize a NOT gate) and pass-transistor logic (PTL) (to realize a 2-to-1 multiplexer) styles. It is apparent that the PTL-based XNOR circuit embedded in a 13A full adder does not have enough driving power. The presence of an inverter circuit in the structure of 13A results in short-circuit power dissipation. Therefore, the power consumption of 13A is expected to be more than that of the SERF.

The structure of a complementary and level restoring carry logic (CLRCL) full adder cell is reported in [7]. The CLRCL is another 10T cell, which is based on PTL. Theoretical dc analysis demonstrates that the minimum power supply ($V_{dd}$) for both the SERF and 13A cells is equal to $2V_{tn} + |V_{tp}|$ (note that $V_{tn}$ and $V_{tp}$ stand for the threshold voltages of n- and p-type transistors, respectively) [7]. This is because incomplete signals are connected to the select control inputs of the PTL-based multiplexers. The CLRCL design eliminates this problem by applying full voltage swing signals to the select control of the multiplexers. The minimum $V_{dd}$ at which the CLRCL can function is equal to $V_{tn} + |V_{tp}|$. Consequently, the delay of the CLRCL is less than that of the SERF and 13A full adders. With regard to power consumption, SERF consumes the least power compared with the 13A and CLRCL 10T full adders, since it does not use any inverter, and consequently, short-circuit power is removed. On the other hand, the CLRCL design, because it uses more inverters, consumes more power than the 13A and SERF 10T cells.
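The minimum supply voltages quoted above can be compared numerically with a short Python sketch. The function names and the threshold values ($V_{tn} = |V_{tp}| = 0.29$ V) are assumptions chosen only for illustration; they are not values reported in this section.

```python
def min_vdd_serf_13a(vtn: float, vtp: float) -> float:
    """Minimum V_dd of the SERF and 13A cells: 2*V_tn + |V_tp| [7]."""
    return 2 * vtn + abs(vtp)


def min_vdd_clrcl(vtn: float, vtp: float) -> float:
    """Minimum V_dd of the CLRCL cell: V_tn + |V_tp| [7]."""
    return vtn + abs(vtp)


if __name__ == "__main__":
    # Assumed thresholds (volts), for illustration only.
    vtn, vtp = 0.29, -0.29
    print(f"SERF/13A: {min_vdd_serf_13a(vtn, vtp):.2f} V")
    print(f"CLRCL:    {min_vdd_clrcl(vtn, vtp):.2f} V")
```

Under these assumed thresholds, the CLRCL limit is lower than the SERF/13A limit by exactly one $V_{tn}$, which is the margin gained by driving the multiplexer select inputs with full-swing signals.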

The structure of the hybrid pass-transistor logic with static CMOS output drive full adder (HPSC) is reported in [8]. With 26 transistors, this design is placed among the high-gate-count full adders. It exploits different logic styles, such as TG logic (TGL), PTL, and static CMOS, to obtain the best performance. It uses four transistors to generate the two-input XOR/XNOR functions. Furthermore, to overcome the threshold-loss problem at the outputs of the XOR/XNOR circuits, it uses feedback loop transistors and pass transistors to strengthen the outputs of the XOR and XNOR circuits. The HPSC design has driving power because it employs inverters at the output nodes, so it functions properly in cascaded configurations. The critical path of HPSC belongs to $C_{out}$ and contains four transistors, which makes for a long propagation delay.

The structure of the new hybrid pass-transistor logic with a static CMOS output drive full adder (NEW-HPSC) is reported in [9]. It consists of 24 transistors, and its maximum critical path contains four transistors. Despite the smaller transistor count of the NEW-HPSC compared with HPSC, its power consumption is higher than that of HPSC. This is because it uses one more inverter than HPSC, which results in a larger short-circuit current. Furthermore, the delay of NEW-HPSC is expected to be higher than that of the HPSC, even though the critical path of both the NEW-HPSC and HPSC circuits is equal to four transistors. If we take a look at the XOR/XNOR circuits of both cells, it is clear that HPSC benefits from stronger outputs at the expense of two more transistors. Employing feedback transistors along with pass transistors restores full-swing output signals of the XOR/XNOR circuits with less delay compared with NEW-HPSC. The NEW-HPSC employs only cross-coupled pCNFETs to restore the outputs of the XOR/XNOR circuits. Note that Goel *et al.* [9] reported that employing nMOS transistors at the first stage of an XOR/XNOR circuit leads to an increase in speed compared with the circuit used in HPSC. They state that nMOS transistors have higher mobility than pMOS transistors. Thus, the XOR/XNOR circuit in an HPSC full adder is slower than the one in the NEW-HPSC, because pMOS transistors have lower mobility than nMOS transistors. It is worth noting that this statement does not hold in CNFET technology. In fact, in CNFET technology, both the p- and n-type transistors have equal mobility. In conclusion, of these two full adders, the NEW-HPSC is dominant in MOSFET technology, while the HPSC has better results in CNFET technology.

TABLE I  
CHARACTERISTICS OF DIFFERENT FULL ADDER CELLS

| Full Adder | Transistor Count | Critical Path | Dual Rail | Driving Capability | Threshold-Loss Problem |
|------------|------------------|---------------|-----------|--------------------|------------------------|
| C-CMOS     | 28               | 5             | No        | Yes                | No                     |
| Mirror     | 28               | 4             | No        | Yes                | No                     |
| TFA        | 16               | 4             | No        | No                 | No                     |
| TGA        | 20               | 4             | No        | No                 | No                     |
| CPL-TG     | 36               | 4             | Yes       | Yes                | No                     |
| SERF       | 10               | 3             | No        | No                 | Yes                    |
| 13A        | 10               | 3             | No        | No                 | Yes                    |
| CLRCL      | 10               | 5             | No        | No                 | Yes                    |
| HPSC       | 26               | 4             | No        | Yes                | No                     |
| NEW-HPSC   | 24               | 4             | No        | Yes                | No                     |
| Ours1      | 28               | 3             | No        | No                 | No                     |
| HCTG       | 16               | 4             | No        | No                 | No                     |

The schematic of the Ours1 full adder cell is reported in [10]. The Ours1 cell uses double pass-transistor logic (DPL) to realize a full adder cell. It contains 28 transistors and is placed among the high-gate-count full adders. Although the Ours1 has a large number of transistors, its power dissipation is very low because of the DPL style. Furthermore, its maximum propagation path consists of three transistors, which results in high-speed operation. To produce the Sum and $C_{out}$ output signals, it employs TGs at the last stage of the circuit. TGs produce full voltage swing outputs, but they do not have driving capability. In conclusion, the Ours1 will perform well with small fan-outs. Its performance will decrease drastically when it is used in a cascaded configuration because it lacks driving power.

The structure of the hybrid CMOS logic with TG logic (HCTG) full adder cell is reported in [11]. In [11], a modified design for an XNOR circuit is presented. This XNOR circuit consists of six transistors and offers low power consumption and high-speed operation. Level restoring transistors MP3 and MN3 guarantee a full voltage swing XNOR signal. Then, using an inverter gate, the XOR signal is generated. To generate the $C_{out}$ and Sum outputs, TGL and the CMOS logic style are employed, respectively. The HCTG has 16 transistors, which results in a reduction of power consumption. One essential drawback of the HCTG is that it lacks driving power because its inputs are coupled to its outputs. Therefore, its performance will degrade in the presence of large fan-outs or when it is embedded in a wide cascaded configuration without buffers inserted between cascaded stages.

Table I summarizes the characteristics of these full adder cells. Among the different cells, only CPL-TG is dual rail, and the others are single rail. Driving capability does not exist when either the inputs are coupled to the output signals or TGs are used at the output nodes. According to Table I, only C-CMOS, Mirror, CPL-TG, HPSC, and NEW-HPSC have driving capability. A threshold-loss problem means that at least one of the output signals, Sum or $C_{out}$, is not full voltage swing. Table I shows that all 10T cells have the threshold-loss problem.



Fig. 1. Block diagram of the proposed full adder cell without driving capability.



Fig. 2. Block diagram of the proposed full adder cell with driving capability.

## III. THE PROPOSED FULL ADDER CELLS

Our proposed full adder cell is based on the following logic equations. We first generate the two-input XOR/XNOR functions and then use 2-to-1 multiplexers to produce the Sum and $C_{out}$ outputs. Fig. 1 shows the block diagram of the proposed 1-bit full adder cell, based on

$$\text{Sum} = (\overline{B \oplus C}) \cdot A + (B \oplus C) \cdot A' \quad (3)$$

$$C_{out} = (\overline{B \oplus C}) \cdot B + (B \oplus C) \cdot A. \quad (4)$$

Using multiplexers at the output nodes results in a low driving power for the design shown in Fig. 1. To enhance the driving capability, first, we complement (3) and (4) to obtain logic relations (5) and (6), respectively. Then, using NOT gates at the output nodes, the desired signals are produced

$$\overline{\text{Sum}} = (\overline{B \oplus C}) \cdot A' + (B \oplus C) \cdot A \quad (5)$$

$$\overline{C_{out}} = (\overline{B \oplus C}) \cdot B' + (B \oplus C) \cdot A'. \quad (6)$$

The block diagram of the proposed full adder cell, which is based on relations (5) and (6), is shown in Fig. 2. The presence of inverters at the output nodes decouples inputs and outputs and provides a sufficient driving power. Therefore, this structure can be efficiently used in any circuit configuration.
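Relations (3)–(6) can be checked exhaustively with a small behavioral model. The following Python sketch is only a logic-level illustration, not a transistor-level description of the proposed cells; the helper names (`mux2`, `full_adder_fig1`, `full_adder_fig2`, `reference`) are hypothetical and introduced here for the check.

```python
from itertools import product


def mux2(sel: int, in0: int, in1: int) -> int:
    """Behavioral 2-to-1 multiplexer: passes in1 when sel is 1, else in0."""
    return in1 if sel else in0


def full_adder_fig1(a: int, b: int, c: int) -> tuple:
    """Structure of Fig. 1: multiplexers produce Sum and C_out, per (3) and (4)."""
    xor = b ^ c
    s = mux2(xor, a, 1 - a)      # XNOR selects A, XOR selects A'
    cout = mux2(xor, b, a)       # XNOR selects B, XOR selects A
    return s, cout


def full_adder_fig2(a: int, b: int, c: int) -> tuple:
    """Structure of Fig. 2: multiplexers produce the complements, per (5) and (6),
    and output NOT gates restore Sum and C_out."""
    xor = b ^ c
    s_bar = mux2(xor, 1 - a, a)          # XNOR selects A', XOR selects A
    cout_bar = mux2(xor, 1 - b, 1 - a)   # XNOR selects B', XOR selects A'
    return 1 - s_bar, 1 - cout_bar


def reference(a: int, b: int, c: int) -> tuple:
    """Arithmetic definition of a full adder: (Sum, C_out) of a + b + c."""
    total = a + b + c
    return total & 1, total >> 1


def check_all() -> bool:
    """True when both structures match the reference for all eight input vectors."""
    return all(
        full_adder_fig1(*v) == reference(*v) == full_adder_fig2(*v)
        for v in product((0, 1), repeat=3)
    )


if __name__ == "__main__":
    print("relations (3)-(6) consistent:", check_all())
```

Both structures compute the same function; the only logical difference is that the Fig. 2 structure selects complemented inputs and inverts the multiplexer outputs.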

Following the method proposed in [9] and considering Figs. 1 and 2, we separately present and analyze the corresponding circuits for the XOR/XNOR and multiplexer modules that form the novel full adder cell. In Section III-A, we present the transistor level implementation of the novel XOR/XNOR functions. In Section III-B, the schematic of a 2-to-1 multiplexer is presented. Finally, in Section III-C, the transistor level implementations of the proposed full adder cells are presented.

### A. Proposed XOR/XNOR Circuits

A novel design for a two-input XOR/XNOR circuit using PTL is shown in Fig. 3. This structure is called X-Design1 and consists of 12 transistors. The absence of a direct path between the power supply ($V_{dd}$) and ground (Gnd) in X-Design1 reduces the short-circuit power dissipation. Another remarkable feature of X-Design1 is that it generates its output signals simultaneously. This eliminates glitches (unwanted switching of signals) and, in turn, the short-circuit power dissipation they cause. It is worth mentioning that the short-circuit power contributes $\sim 5\%-20\%$ to the dynamic power [19]. The critical path of X-Design1 comprises two transistors.

Fig. 3. Proposed structure of the XOR/XNOR circuit (X-Design1).

TABLE II  
OPERATION OF X-DESIGN1

| B | C | XOR      | XNOR     |
|---|---|----------|----------|
| 0 | 0 | Weak 0   | Strong 1 |
| 0 | 1 | Strong 1 | Strong 0 |
| 1 | 0 | Strong 1 | Strong 0 |
| 1 | 1 | Strong 0 | Weak 1   |

Table II clarifies the operation of the proposed cell. According to Table II, both the XOR and XNOR circuits suffer from a threshold-loss problem. When an input transition leads to the input vector $BC = 00$, the XOR circuit generates a weak low signal. In fact, the voltage level of the low signal is above Gnd by $|V_{tp}|$, the threshold voltage of a p-type transistor. In the same manner, when the input vector $BC = 11$ is applied to the XNOR circuit, it generates a weak high signal, $V_{dd} - V_{tn}$. Since these circuits (i.e., XOR and XNOR) are building blocks of larger circuits, such as full adders, the threshold-voltage drop is undesirable, because it negatively affects the power dissipation and switching delay of larger circuits. In conclusion, it is vital to improve the performance parameters of an XOR/XNOR circuit before embedding it in the structure of a full adder cell.
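The weak and strong levels listed in Table II can be written out as voltages with a simple behavioral model. The sketch below is only an idealized dc illustration of the table, not a circuit simulation; the function name and the example values ($V_{dd} = 0.9$ V, $V_{tn} = |V_{tp}| = 0.29$ V) are assumptions made for the example.

```python
def x_design1_levels(b: int, c: int, vdd: float, vtn: float, vtp: float) -> tuple:
    """Idealized (V_XOR, V_XNOR) output voltages of X-Design1, following Table II.

    A weak 0 sits |V_tp| above Gnd; a weak 1 sits V_tn below V_dd.
    """
    weak_low = abs(vtp)
    weak_high = vdd - vtn
    if (b, c) == (0, 0):
        return weak_low, vdd      # XOR: weak 0, XNOR: strong 1
    if (b, c) == (1, 1):
        return 0.0, weak_high     # XOR: strong 0, XNOR: weak 1
    return vdd, 0.0               # BC = 01 or 10: strong 1, strong 0


if __name__ == "__main__":
    # Assumed supply and threshold voltages (volts), for illustration only.
    vdd, vtn, vtp = 0.9, 0.29, -0.29
    for b in (0, 1):
        for c in (0, 1):
            v_xor, v_xnor = x_design1_levels(b, c, vdd, vtn, vtp)
            print(f"BC={b}{c}: XOR={v_xor:.2f} V, XNOR={v_xnor:.2f} V")
```

With these assumed values, the degraded levels are roughly a third of the supply away from the rails, which illustrates why the weak outputs are restored before the circuit is used inside a full adder.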

As shown in Fig. 4, to cope with the threshold-loss problem of the proposed XOR/XNOR circuit, using feedback loop [8], [20], [21] and TG techniques, we present four alternative circuits.

Fig. 4(a) shows the structure of the X-Design2 cell, which uses feedback loop transistors to restore the weak outputs and make them full swing. When the input vector is either $AB = 00$ or $AB = 11$, the feedback loop transistors switch ON and pull the weak output signals down to Gnd or up to $V_{dd}$, respectively, in order to provide enough driving power to the following circuits. In these two input cases, weak signals are applied to the transistor gates of the feedback circuit, which causes the feedback transistors to switch ON with a delay. However, using this technique, full-swing outputs are realized. By decreasing the switching delay of the outputs, the X-Design2 reduces the short-circuit power and, finally, the overall power consumption compared with the X-Design1 cell, at the expense of two more transistors.

Fig. 4. Alternative circuits for XOR/XNOR with full-swing outputs. (a) X-Design2. (b) X-Design3. (c) X-Design4. (d) X-Design5.

Another approach is to use a TG method at the expense of four more transistors compared with the X-Design1. Fig. 4(b) shows the structure of the X-Design3 cell, which uses this technique. In the case of $AB = 00$, transistors T1 and T2 switch ON and provide a full-swing voltage level (good low signal) at the XOR output. On the other hand, in the case of $AB = 11$, transistors T3 and T4 switch ON and consequently provide a perfect high voltage level at the XNOR output. In these cases (i.e., $AB = 00$ or $AB = 11$), a double driving path exists, because two transistors are ON at the same time. As a result, the driving current increases, which reduces the delay compared with the X-Design1 and X-Design2 cells. Furthermore, TG logic is inherently low power, and therefore, the power dissipation of X-Design3 is lower than that of the X-Design1 and X-Design2 circuits.

Looking at the X-Design3 circuit, it is apparent that either transistors T1 and T3 or transistors T2 and T4 are sufficient to obtain good outputs. Therefore, by removing transistors T1 and T3 of X-Design3, we introduce another structure called X-Design4 [Fig. 4(c)]. Due to the smaller transistor count of X-Design4 compared with X-Design3, it is expected to have lower power consumption than the X-Design3. On the other hand, by turning the two double driving paths into two single ones (by removing transistors T1 and T3), the driving power of the XOR/XNOR circuit decreases. As a result, the delay and the power consumption of X-Design4 with large fan-outs are considerably increased compared with X-Design3.

The last design for the XOR/XNOR circuit, called X-Design5, is shown in Fig. 4(d). The X-Design5 is obtained by removing two transistors from the structure of the X-Design4. It consists of 12 transistors, two fewer than the X-Design4. The presence of TGs in both the XOR and XNOR circuits guarantees full-swing voltages at the output nodes. X-Design5 is expected to have lower power consumption than the X-Design4 due to its smaller number of transistors.



Fig. 5. Structure of the TG-based multiplexer.

Five novel structures using different techniques were proposed in this section, each of which has its own merits and demerits for different fan-outs. Since XOR/XNOR circuits are the building blocks of 1-bit full adder cells, we study their performance thoroughly under different simulation conditions in Section V.

### B. Multiplexer Circuit

In order to implement expressions (5) and (6), we need a 2-to-1 multiplexer with XOR and XNOR as the select lines. To realize the multiplexer, TGL is used. The TG-based multiplexer structure shown in Fig. 5 is the most common implementation, and it inherently consumes low power. However, it lacks the driving power needed to drive cascaded full adders. It is worth noting that, as shown in Fig. 2, the TG multiplexers are followed by inverters, which guarantee high driving capability. One advantage of this kind of multiplexer is that it uses only four transistors. For these reasons, a TG multiplexer is used inside the proposed full adder.

### C. Proposed Full Adders

Fig. 6 shows six different designs of our proposed full adder cells, which employ a hybrid logic style. The cells in Fig. 6(a)–(e) are produced by replacing the XOR/XNOR module in Fig. 2 with the proposed XOR/XNOR cells shown in Figs. 3 and 4. All these full adder cells provide good driving capability by decoupling inputs and outputs. Furthermore, Fig. 6(f) shows the full adder without driving outputs, which we intentionally include in our simulations to show the effect of output driving power on the performance measures of full adders. This design is realized by replacing the XOR/XNOR module in Fig. 1 with the XOR/XNOR circuit shown in Fig. 4(c). Later, in Section V, we clarify the advantage of full adders constructed from the block diagram shown in Fig. 2 over those constructed from the block diagram shown in Fig. 1.

Fig. 6(a) shows the schematic of the first proposed full adder cell, which employs PTL to provide the XOR/XNOR functions. The new pass-transistor full adder (NEW-PT-FA) consists of 26 transistors. The critical path of the NEW-PT-FA consists of four transistors. This structure provides good driving capability by using NOT gates at the output nodes. In fact, this guarantees proper functionality when the cell is embedded inside large circuits in a cascaded form. The nonfull-swing outputs of the XOR/XNOR circuit embedded in the NEW-PT-FA cause leakage current and consequently large power consumption.

Fig. 6. Proposed hybrid-CNFET full adders. (a) NEW-PT-FA. (b) NEW-FL-FA. (c) NEW-DD-FA. (d) NEW-SD-FA. (e) NEW-RSD-FA. (f) NEW-ND-FA.

Feedback loop transistors in the XOR/XNOR circuit of the NEW-Feedback Loop-Full Adder (NEW-FL-FA), shown in Fig. 6(b), restore the nonfull-swing outputs of the XOR/XNOR circuit. The NEW-FL-FA contains 28 transistors, two more than the NEW-PT-FA. The critical path of the NEW-FL-FA consists of four transistors. As in the NEW-PT-FA, inverters at the output nodes of the NEW-FL-FA provide the desired driving power for the following cells. The feedback loop transistors decrease power consumption by generating a full output voltage swing and removing static power dissipation, and the feedback loop decreases the delay of the circuit as well.

The third proposed full adder, which contains a double driving path for the XOR/XNOR circuit and is called the NEW-Double Driving-Full Adder (NEW-DD-FA), is shown in Fig. 6(c). The NEW-DD-FA employs TGs along with PTL to produce full-swing outputs for the XOR/XNOR circuit. It contains 30 transistors, and its critical path consists of four transistors. Since the TG is inherently a low-power circuit, the power consumption of the NEW-DD-FA is less than that of the NEW-PT-FA and NEW-FL-FA circuits.

The fourth proposed full adder, shown in Fig. 6(d), contains a single driving path for the XOR/XNOR circuit; it is called the NEW-Single Driving-Full Adder (NEW-SD-FA) and has 28 transistors. It uses TG and PTL to produce the desired outputs, the same as the NEW-DD-FA. The difference between the NEW-DD-FA and the NEW-SD-FA is that the NEW-SD-FA cell employs only two TGs to generate full voltage swing intermediate outputs (i.e., the outputs of the XOR/XNOR circuit). Thus, it has two fewer transistors than the NEW-DD-FA and consumes slightly lower power.

The design shown in Fig. 6(e), which removes two pass transistors from the single driving path of the XOR/XNOR circuit, is called the NEW-Removed Single Driving-Full Adder (NEW-RSD-FA). It does so while maintaining rail-to-rail outputs for the XOR/XNOR circuit. Like the other proposed full adders, it benefits from good driving capability. The NEW-RSD-FA has 26 transistors, fewer than the NEW-FL-FA, NEW-DD-FA, and NEW-SD-FA. Therefore, it is expected to have the least power dissipation among them.

Fig. 6(f) embeds X-Design4 in the XOR/XNOR module of Fig. 1 to construct a full adder cell without driving power, called the NEW-Non Driving-Full Adder (NEW-ND-FA). The NEW-ND-FA produces full-swing outputs because it applies TGs in its structure. Unlike the previous designs, the NEW-ND-FA lacks driving power because its inputs are coupled to its outputs. On the other hand, a remarkable advantage of the NEW-ND-FA is that it is faster than the other proposed full adders, since only three transistors lie in its critical path. It also has the minimum transistor count, which is 24. Note that the NEW-ND-FA performs well when used alone with small fan-outs; however, when it is used with larger fan-outs or in cascaded structures, its performance decreases dramatically. We demonstrate this with simulation results in Section V.

## IV. SIMULATION ENVIRONMENT

In this section, the simulation environments are first described. Then, comprehensive simulations are performed to scrutinize the performance of each full adder cell. We simulate the different cells with various loads and power supplies. Then, their efficiency is studied in a large test bed. Furthermore, their susceptibility to noise and manufacturing process variations is considered.

### A. Simulation Setup

In order to perform computer simulations, we use a well-known compact SPICE CNFET model proposed in [22] and [23]. This model is developed for MOSFET-like single-wall CNFETs. It considers nonidealities, such as the screening effect by the parallel CNTs for CNFET with multiple CNTs, Schottky-barrier resistance, parasitic gate capacitances, the elastic scattering in the channel region, the resistive source/drain (S/D), and so on. Table III shows some important parameters of the SPICE model with their values and brief descriptions.

We have used the chirality vector of (19, 0) [22] and three tubes as the channel of each transistor [23]. With this selection, the CNT diameter and the absolute value of the transistor threshold voltage are 1.487 nm and 0.28 V, respectively. Simulations are performed using the Synopsys HSPICE tool for transistors with 32-nm channel length.
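To make the link between chirality, diameter, and threshold voltage concrete, the sketch below evaluates the relations commonly used in the CNFET literature: $D_{CNT} = a\sqrt{n_1^2 + n_1 n_2 + n_2^2}/\pi$ and $|V_{th}| \approx (\sqrt{3}/3)\, a V_\pi / (e D_{CNT})$. The constants $a = 0.246$ nm (graphene lattice constant) and $V_\pi = 3.033$ eV (carbon π-π bond energy) are assumptions not stated in this section, and the function names are hypothetical.

```python
import math

A_NM = 0.246      # assumed graphene lattice constant, in nm
V_PI_EV = 3.033   # assumed carbon pi-pi bond energy, in eV


def cnt_diameter_nm(n1, n2):
    """CNT diameter in nm for chirality vector (n1, n2)."""
    return A_NM * math.sqrt(n1 * n1 + n1 * n2 + n2 * n2) / math.pi


def threshold_voltage_v(n1, n2):
    """First-order |Vth| in volts, taken as half the bandgap divided by e."""
    return (math.sqrt(3) / 3) * A_NM * V_PI_EV / cnt_diameter_nm(n1, n2)


d = cnt_diameter_nm(19, 0)
vth = threshold_voltage_v(19, 0)
print(f"D = {d:.3f} nm, |Vth| = {vth:.3f} V")
```

Under these assumed constants, the (19, 0) chirality gives a diameter of about 1.488 nm and a threshold voltage of about 0.29 V, close to the 1.487 nm and 0.28 V quoted in the text.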

TABLE III  
SOME IMPORTANT PARAMETERS OF SPICE MODEL

| Parameter  | Value  | Description                                                           |
|------------|--------|-----------------------------------------------------------------------|
| $L_{ch}$   | 32 nm   | Physical channel length                                               |
| $L_{geff}$ | 100 nm  | The mean free path in the intrinsic CNT channel                       |
| $L_{ss}$   | 32 nm   | The length of doped CNT source-side extension region                  |
| $L_{dd}$   | 32 nm   | The length of doped CNT drain-side extension region                   |
| $K_{gate}$ | 16      | The dielectric constant of high-K top gate dielectric material        |
| $T_{ox}$   | 4 nm    | The thickness of high-K top gate dielectric material                  |
| $C_{sub}$  | 40 pF/m | The coupling capacitance between the channel region and the substrate |
| $E_{fi}$   | 0.6 eV  | The Fermi level of the doped S/D tube                                 |



Fig. 7. Simulation test bed for XOR/XNOR.

Taking a look at the designs of the different full adders, it is clear that they all employ two-input XOR/XNOR submodules in their structure, except for the SERF design [5]. As a matter of fact, these submodules act as the heart of the full adders and therefore directly affect the performance of the 1-bit full adders. Hence, before studying the performance of the full adders, it is worth considering the performance of the XOR/XNOR circuits. The SERF full adder uses two cascaded XNOR circuits to produce the desired outputs. Since it does not contain an XOR submodule, we do not include it in this simulation. Instead, we add another XOR/XNOR circuit proposed in [24] to have a more comprehensive comparison with previously reported XOR/XNOR circuits.

For this simulation, we use the test bed shown in Fig. 7. The input buffers (two cascaded inverters) provide more realistic input signals. The output signals of the XOR/XNOR module have to drive the inputs of other modules in the full adder cell. This load often consists of the gates, sources, and drains of transistors, and it varies for each full adder design. The average load is three transistor gates and one S/D [25]. Accordingly, we use an output load of two inverters (FO2) as the standard load. To evaluate performance, we apply all 16 possible combinations of input patterns to the two-input XOR/XNOR circuits. For precise comparison, simulations are performed with a range of output fan-outs (from FO2 to FO16) at a power supply of 0.9 V, 100-MHz operating frequency, and room temperature (27 °C). It is worth noting that, as reported in the International Technology Roadmap for Semiconductors (ITRS), the nominal supply voltage for the 32-nm CNFET technology is 0.9 V [26]. This simulation helps in understanding the performance of the primary target cell, i.e., the 1-bit full adder cell.



Fig. 8. Simulation test bed for full adder.

Fig. 8 shows the simulation test bed [9] for a full adder cell. It applies input buffers to provide more realistic input signals and uses an output load of four inverters (FO4) as the standard fan-out at the output nodes [9]. All 56 transitions from one input combination to another are fed to the circuit under test (CUT) to measure the performance of the full adders, including the maximum delay, power dissipation, and PDP [27]. This set excludes transitions in which the input combination does not change.
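The count of 56 follows from the eight input combinations of (A, B, $C_{in}$): each combination can change to any of the seven others, giving 8 × 7 ordered pairs. A minimal Python sketch of this enumeration is given below; the variable names are illustrative only and do not correspond to any stimulus file used in the paper.

```python
from itertools import permutations, product

# The eight input vectors (A, B, Cin) of a 1-bit full adder.
vectors = list(product((0, 1), repeat=3))

# Ordered pairs of distinct vectors: every "from -> to" transition
# in which at least one input changes.
transitions = list(permutations(vectors, 2))

print(len(vectors), len(transitions))
```

Using `permutations` rather than `product(vectors, repeat=2)` is what drops the eight unchanged pairs, leaving 56 of the 64 ordered pairs.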

The propagation delay is measured from the moment that the input signal reaches one half of power supply voltage level (50%  $V_{dd}$ ) until the output signal reaches the same voltage level. The corresponding delay for each transition is considered for both Sum and  $C_{out}$  outputs. Then, the greater one is reported as the delay of CUT. Furthermore, power consumption is the average power, which is measured during all transitions of inputs. Finally, PDP is considered as a compromise between delay and average power consumption.
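The measurement procedure above can be sketched in a few lines of Python. This is only an illustration on sampled waveforms, assuming piecewise-linear interpolation between samples; the function names and the synthetic ramps are hypothetical, and in practice such measurements are taken inside the circuit simulator.

```python
def crossing_time(t, v, level):
    """First time the sampled waveform v(t) crosses `level`,
    found by linear interpolation between adjacent samples."""
    for i in range(1, len(t)):
        lo, hi = v[i - 1], v[i]
        if lo != hi and (lo - level) * (hi - level) <= 0:
            return t[i - 1] + (level - lo) * (t[i] - t[i - 1]) / (hi - lo)
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(t, v_in, v_out, vdd):
    """Delay between the 50% Vdd crossings of input and output."""
    half = vdd / 2
    return crossing_time(t, v_out, half) - crossing_time(t, v_in, half)


def cell_delay(sum_delays, cout_delays):
    """Reported delay of the CUT: the largest delay over both outputs."""
    return max(max(sum_delays), max(cout_delays))


def pdp(average_power, delay):
    """Power-delay product from average power and the reported delay."""
    return average_power * delay


# Synthetic example (times in ps, voltages in V): a rising input ramp
# and a falling output ramp at Vdd = 0.9 V.
t = [0.0, 10.0, 20.0, 30.0, 40.0]
v_in = [0.0, 0.9, 0.9, 0.9, 0.9]
v_out = [0.9, 0.9, 0.9, 0.0, 0.0]
delay_ps = propagation_delay(t, v_in, v_out, vdd=0.9)
```

For the synthetic ramps in the example, the input crosses 0.45 V at about 5 ps and the output at about 25 ps, so the measured delay is about 20 ps.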

Two different simulations are performed using this generic test bed, shown in Fig. 8, for the full adder cells. First, their performance is evaluated by varying the output load from FO4 to FO64 at the same frequency and temperature conditions. Second, the performance of all full adder cells is evaluated for supply voltages ranging from 0.8 to 1 V at 100-MHz operating frequency and room temperature.

### B. Noise Immunity Test Setup

Digital circuits are inherently noise tolerant, and they are only affected by noise pulses with high amplitude and large width. In fact, they are not vulnerable to weak input noise pulses and prevent them from causing unnecessary switching and malfunction [9]. A noise immunity curve (NIC) is used to determine how tolerant a digital circuit is against input noise. The horizontal and vertical axes of the NIC correspond to noise width ($T_{noise}$) and noise amplitude ($V_{noise}$), respectively. Each point on the NIC is a pair ($T_{noise}$, $V_{noise}$), which indicates that a noise pulse of that width with an amplitude equal to or higher than the amplitude of this point results in a glitch (spurious switching) at the output. The region above the curve is the unsafe zone, whereas the region below the curve is safe against noise pulses. Thereby, a circuit with a higher NIC is more noise tolerant. A quantitative measure called the average noise threshold energy (ANTE) is computed from the NIC. It is equal to $E(V_{noise}^2 T_{noise})$, where $E()$ denotes the expectation operator. The higher the ANTE, the greater the immunity to input noise pulses.
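The ANTE definition can be illustrated with a short Python sketch. It assumes that the expectation is approximated by the arithmetic mean over the sampled NIC points, which is an assumption about how the metric is evaluated numerically; the function name and the two sets of NIC points are invented for illustration and are not results from the paper.

```python
def ante(nic_points):
    """Average noise threshold energy of a noise immunity curve.

    nic_points: iterable of (t_noise, v_noise) pairs on the NIC.
    Returns the mean of v_noise**2 * t_noise over the points.
    """
    energies = [v * v * t for t, v in nic_points]
    return sum(energies) / len(energies)


# Hypothetical NIC samples (width in ps, amplitude in V); made-up values.
nic_cell_a = [(100, 0.80), (200, 0.70), (300, 0.65), (400, 0.62)]
nic_cell_b = [(100, 0.70), (200, 0.60), (300, 0.55), (400, 0.52)]

# The curve of cell A lies above that of cell B at every width,
# so its ANTE is larger and it tolerates stronger noise pulses.
print(ante(nic_cell_a) > ante(nic_cell_b))
```

Because every term is weighted by the pulse width, points at large widths dominate the average, so two cells should be compared over the same set of widths.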



Fig. 9. Noise injection circuit.

Fig. 10.  $N$ -bit RCA circuit.

Fig. 9 shows a circuit that produces a noise pulse with tunable amplitude and width [9]. The noise amplitude and width are controlled by the $V_{dd,n}$ and $V_c$ voltages, respectively. These parameters are independent; in other words, the amplitude of a noise pulse stays constant as its width is varied. To plot the NIC, we have to find the points that cause a glitch at the output of the CUT. It is worth noting that the noise should be strong enough to cause a glitch at the output of the full adder circuit following the CUT.

### C. Real Environment Test Setup

To evaluate the performance of the full adders in a real environment, we embed them in a ripple carry adder (RCA) circuit. The word length of the RCA varies from 4 to 32 bits. The circuit structure is shown in Fig. 10 [25]. The delay is measured from the moment that the input signals are fed through the buffers to the moment that the output signals are produced by the last-stage full adder cell. Power consumption is the average power consumption of the whole circuit, including input and output buffers. To evaluate the performance metrics, we apply all 56 input transitions. Simulation is performed at 0.9 V power supply, 100-MHz frequency, and room temperature.
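For readers less familiar with the RCA structure, the following behavioral Python sketch chains 1-bit full adders so that the carry output of each stage feeds the carry input of the next. It models only the logic function of an $N$-bit RCA, not the transistor-level cells, buffers, or timing of the test bed; the function names are hypothetical.

```python
def full_adder(a, b, cin):
    """Boolean 1-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(x, y, n_bits, cin=0):
    """Add two n_bits-wide unsigned integers stage by stage.

    The carry ripples from bit 0 to bit n_bits - 1, which is why the
    last stage produces its outputs latest in the real circuit.
    Returns (n_bits-wide sum, final carry_out).
    """
    carry = cin
    total = 0
    for i in range(n_bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


# Word lengths considered in the text: 4, 8, 16, and 32 bits.
for n in (4, 8, 16, 32):
    mask = (1 << n) - 1
    total, carry = ripple_carry_add(mask, 1, n)  # all-ones plus one
    assert (total, carry) == (0, 1)
```

The all-ones-plus-one case exercises the longest carry chain, since the carry generated at bit 0 must propagate through every stage.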

### D. Monte Carlo Test Setup

CNT diameter variation is one of several CNFET imperfections caused by nonidealities in the CNT synthesis process, and it negatively affects the performance of CNFET circuits [28]. In fact, varying CNT diameters result in different current flow, large bandgap variability, and a strong influence on the CNT carrier transport properties [29]. Therefore, we study the robustness of the different full adder cells in the presence of varying CNT diameters. For this purpose, we use Monte Carlo (MC) transient analysis. The distribution of CNT diameters is assumed to be Gaussian, and a standard deviation (STD) from the mean CNT diameter in the range of 0.04–0.2 nm is considered [29]. To obtain precise results, we perform $N = 1000$ simulations for each STD and consider the PDP of the cells.
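The sampling step of such an analysis can be sketched as follows. The sketch draws $N = 1000$ Gaussian diameters around the nominal 1.487 nm for several STD values and reports the resulting spread of a first-order threshold voltage estimate. The specific STD steps, the fixed seed, and the relation $|V_{th}| \approx 0.43 / D_{CNT}$ (with $D_{CNT}$ in nm) are assumptions made for illustration; the actual study evaluates the PDP of each cell with transient simulation, which is not reproduced here.

```python
import random
import statistics

MEAN_D_NM = 1.487   # nominal diameter for the (19, 0) chirality
N_RUNS = 1000       # Monte Carlo runs per STD value


def sample_diameters(std_nm, n=N_RUNS, seed=0):
    """Draw n CNT diameters (nm) from a Gaussian around the nominal value."""
    rng = random.Random(seed)
    return [rng.gauss(MEAN_D_NM, std_nm) for _ in range(n)]


def vth_estimate_v(d_nm):
    """Assumed first-order threshold voltage (V) for diameter d_nm (nm)."""
    return 0.43 / d_nm


# Assumed sweep of STD values inside the 0.04-0.2 nm range from the text.
for std in (0.04, 0.08, 0.12, 0.16, 0.20):
    vths = [vth_estimate_v(d) for d in sample_diameters(std)]
    print(f"STD = {std:.2f} nm: mean |Vth| = {statistics.mean(vths):.3f} V, "
          f"spread = {statistics.stdev(vths):.3f} V")
```

Each sampled diameter would then parameterize one transient run, and the PDP values collected over the 1000 runs give the distribution used to judge robustness.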

## V. SIMULATIONS AND PERFORMANCE ANALYSIS

In this section, we discuss the simulation results and compare the efficiency of the different designs in separate subsections (V-A, V-B, V-C, V-D, V-E, V-F). First of all, we study the performance of the XOR/XNOR circuits, since the XOR/XNOR circuit is the heart of a full adder cell and directly affects its performance. Second, we focus on the performance of full adders with different logic styles. For this purpose, extensive simulations with regard to varying power supply, output load, input noise pulse, and process variation are considered. Furthermore, all full adders are studied in a large structure, such as a wide word length RCA.

TABLE IV  
AVERAGE POWER CONSUMPTION (E-6 W), DELAY (E-11 S), PDP (E-17 J), NUMBER OF TRANSISTORS,  
AND EDP (E-28 J.S) RESULTS FOR DIFFERENT LOADS

| Load      |            | FO2     |        |         |         |         | FO16    |        |        |         |         |
|-----------|------------|---------|--------|---------|---------|---------|---------|--------|--------|---------|---------|
| Design    | Reference  | Power   | Delay  | PDP     | No. Tr. | EDP     | Power   | Delay  | PDP    | No. Tr. | EDP     |
| X-Design1 | [Proposed] | 0.57673 | 1.4380 | 0.82933 | 12      | 1.19257 | 17.533  | 11.096 | 194.54 | 12      | 2158.61 |
| X-Design2 | [Proposed] | 0.11960 | 1.2398 | 0.14828 | 14      | 0.18383 | 1.4075  | 6.4962 | 9.1431 | 14      | 59.3954 |
| X-Design3 | [Proposed] | 0.09096 | 0.4269 | 0.03884 | 16      | 0.01658 | 0.8686  | 1.9247 | 1.6718 | 16      | 3.21771 |
| X-Design4 | [Proposed] | 0.08608 | 0.4091 | 0.03522 | 14      | 0.01440 | 0.9817  | 2.0972 | 2.0588 | 14      | 4.31771 |
| X-Design5 | [Proposed] | 0.07624 | 0.4703 | 0.03585 | 12      | 0.01686 | 1.0269  | 2.4801 | 2.5469 | 12      | 6.31656 |
| CPL-TG    | [4]        | 0.12580 | 0.9059 | 0.11397 | 10      | 0.10324 | 1.7706  | 4.9764 | 8.8114 | 10      | 43.8490 |
| HPSC      | [8]        | 0.11573 | 1.7726 | 0.20514 | 10      | 0.36363 | 1.6474  | 8.8313 | 14.549 | 10      | 128.486 |
| NEW-HPSC  | [9]        | 0.12933 | 2.1758 | 0.28139 | 8       | 0.61224 | 2.0043  | 11.020 | 22.088 | 8       | 243.409 |
| Ours1     | [10]       | 0.07227 | 0.4500 | 0.03252 | 12      | 0.01463 | 0.95026 | 2.0005 | 1.9010 | 12      | 3.80295 |
| HCTG      | [11]       | 0.07114 | 0.8194 | 0.05829 | 8       | 0.04776 | 0.88218 | 4.9910 | 4.4029 | 8       | 21.9748 |
| TFA       | [3]        | 0.06900 | 0.8318 | 0.05739 | 8       | 0.04773 | 0.88958 | 4.7267 | 4.2048 | 8       | 19.8748 |
| TGA       | [1]        | 0.70007 | 6.4656 | 4.5264  | 10      | 29.2658 | 12.837  | 42.006 | 539.23 | 10      | 22650.8 |
| 13A       | [6]        | 1.6346  | 2.7016 | 4.4160  | 8       | 11.9302 | 48.783  | 22.546 | 1099.8 | 8       | 24796.0 |
| CLRCL     | [7]        | 1.3623  | 6.5932 | 8.9819  | 6       | 59.2194 | 27.310  | 45.276 | 1236.5 | 6       | 55983.7 |
| Wang      | [24]       | 0.13676 | 1.6974 | 0.23213 | 12      | 0.39401 | 0.90266 | 5.3881 | 4.8636 | 12      | 26.2055 |

### A. XOR/XNOR Performance Analysis

Table IV shows the simulation results of the XOR/XNOR circuits with regard to varying output loads. We report power, delay, PDP, number of transistors (No. Tr.), and energy-delay product (EDP) for the different circuits. It is worth noting that a fan-out of FO2 is enough for studying the performance metrics of an XOR/XNOR circuit that is used in a full adder cell. However, we have extended our simulations to a fan-out of FO16. Shaded boxes show the best results in Table IV.
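The derived columns of Table IV follow directly from the power and delay columns: PDP is power times delay, and EDP is PDP times delay. The Python sketch below reproduces the X-Design3 entries at FO2 from its tabulated power and delay, using the unit scalings given in the table caption. The function and constant names are hypothetical.

```python
POWER_UNIT_W = 1e-6    # Table IV lists power in units of E-6 W
DELAY_UNIT_S = 1e-11   # Table IV lists delay in units of E-11 s


def pdp_joules(power_w, delay_s):
    """Power-delay product in joules."""
    return power_w * delay_s


def edp_joule_seconds(power_w, delay_s):
    """Energy-delay product in joule-seconds (PDP times delay)."""
    return pdp_joules(power_w, delay_s) * delay_s


# X-Design3 at FO2, values taken from Table IV.
power = 0.09096 * POWER_UNIT_W
delay = 0.4269 * DELAY_UNIT_S

pdp = pdp_joules(power, delay)
edp = edp_joule_seconds(power, delay)
print(f"PDP = {pdp / 1e-17:.5f} E-17 J")
print(f"EDP = {edp / 1e-28:.5f} E-28 J.s")
```

The results agree with the tabulated 0.03884 E-17 J and 0.01658 E-28 J.s to within rounding of the listed power and delay.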

When we consider a fan-out of FO2, the proposed X-Design3, X-Design4, and X-Design5 circuits offer a better PDP than the other designs. For instance, at a load of FO2, the X-Design4 saves PDP by up to 69%, 82%, 87%, 39%, 99%, and 74% compared with the XOR/XNOR circuits embedded in the full adder cells CPL-TG, HPSC, NEW-HPSC, HCTG, CLRCL, and the one introduced in [24], respectively. However, there is very close competition with regard to PDP between the X-Design4 and Ours1 cells. Furthermore, the proposed X-Design4 offers the lowest EDP among the XOR/XNOR circuits. In the presence of large fan-outs, such as FO16, the proposed designs, such as the X-Design3, function well. The X-Design3 saves PDP by about 81%, 88%, 92%, 12%, 62%, 60%, and 65% as compared with the CPL-TG, HPSC, NEW-HPSC, Ours1, HCTG, TFA, and the one introduced in [24], respectively. Therefore, it can be inferred that the relative PDP of the proposed circuits improves as the output load increases. The XOR or XNOR circuits employed in the CLRCL, TGA, and 13A do not function well with a load of FO16, since PTL without driving power has been employed in their structure. In addition, there is close competition between the X-Design4 and the Ours1 designs at this load, and the PDP of the X-Design4 is slightly higher than that of the Ours1. This simulation confirms that the proposed circuits have reasonable functionality with low PDP over a wide range of fan-outs. Also, in terms of power, delay, and EDP at FO16, the proposed X-Design3 offers the best results compared with the other circuits.
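The percentage savings quoted above are relative reductions, i.e., $100 \times (1 - \mathrm{PDP}_{proposed} / \mathrm{PDP}_{reference})$. As a hedged illustration, the sketch below recomputes the FO16 savings of the X-Design3 from the PDP column of Table IV; the helper name and the dictionary layout are hypothetical.

```python
def saving_percent(proposed, reference):
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (1.0 - proposed / reference)


# PDP at FO16 in units of E-17 J, taken from Table IV.
x_design3_pdp = 1.6718
reference_pdp = {
    "CPL-TG": 8.8114,
    "HPSC": 14.549,
    "NEW-HPSC": 22.088,
    "Ours1": 1.9010,
    "HCTG": 4.4029,
    "TFA": 4.2048,
    "Wang [24]": 4.8636,
}

savings = {name: saving_percent(x_design3_pdp, ref)
           for name, ref in reference_pdp.items()}
for name, value in savings.items():
    print(f"{name}: {value:.1f}%")
```

Because the units cancel in the ratio, the same helper applies unchanged to power, delay, or EDP comparisons.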

To show the superiority of CNFET technology over CMOS, we have simulated the X-Design3 as an instance in both technologies. The 32-nm predictive technology model [30] is used for the CMOS technology. At a load of FO2, the PDP of the X-Design3 is 5.8414 aJ (attojoule) in CMOS and 0.3884 aJ in CNFET technology. In conclusion, an improvement of about 93% is achieved when we use the CNFET technology.

This simulation was the first step to show the efficiency of the proposed full adder cells. In the next subsections (V-B, V-C, V-D) we evaluate the performance of both 1-bit full adder cells and ripple carry adders.

### B. Performance Analysis Against Load

In this section, the performance metrics of all full adders are examined over a wide range of output loads, from FO4 to FO64, at 100-MHz operating frequency, 0.9 V power supply, and room temperature. Fig. 11 shows the average power consumption measured over a long time while all 56 input patterns are fed to the CUT. For better illustration, the maximum value of power consumption on the vertical axis of Fig. 11 is limited to 30. The SERF, 13A, and CLRCL designs have the highest power consumption, particularly in the presence of large fan-outs. This is because these cells lack driving power. In fact, incomplete Sum and $C_{out}$ signals cause extra current to be drawn from $V_{dd}$ to Gnd in the inverters of the load stage of the CUT while the inputs are stable, i.e., there is static power consumption. Furthermore, the incomplete signals make the pull-up and pull-down transistors of the inverters switch ON or OFF slowly and, consequently, increase the short-circuit current. In conclusion, these full adders are not suitable for low-power applications. The SERF design fails to function at FO32 and FO64, and the 13A cell fails at the FO64 output load. The other designs have almost the same power dissipation. However, the proposed SD-FA full adder has the least power consumption. At a load of FO64, for example, it consumes about 0.008%, 1.3%, 1.05%, 0.9%, 1.2%, 55%, 46%, 47%, 36%, and 39% less power than the C-CMOS, Mirror, CPL-TG, HPSC, NEW-HPSC, Ours1, HCTG, TFA, TGA, and the proposed ND-FA, respectively. The proposed ND-FA cell consumes comparatively high power among the proposed cells, because it lacks driving power. In addition, because they lack driving power, the Ours1, HCTG, TFA, and TGA designs consume very high power, especially when the output load is large (e.g., FO64). The proposed designs, as well as the C-CMOS, Mirror, CPL-TG, HPSC, and NEW-HPSC, consume low power, since they employ inverter gates with high driving power at the output nodes.

Fig. 11. Average power dissipation of full adders against load.

Fig. 12. Delay of full adders against load.

The delay results under different output loads are shown in Fig. 12. Once again, for better illustration, the maximum delay value on the vertical axis of Fig. 12 is limited to 25. The 10T cells, including the SERF, 13A, and CLRCL, are the slowest ones because they lack driving capability. They do not function reasonably even in the presence of small fan-outs. It is worth noting that the efficiency of the 10T cells decreases sharply when the load increases.



Fig. 13. PDP of full adders against load.

At small fan-outs, the delays of the cells under comparison are very close to each other, except for the 10T full adder cells. Considering Fig. 12, it can be inferred that the delay of the HCTG, TFA, Ours1, and the proposed ND-FA cells is higher than that of the others, except for the 10T cells, at the large fan-outs of FO32 and FO64. None of them has driving power at the output nodes to swiftly take the output signals to 50% of their final voltage level. Although the TGA lacks driving power, its delay is less than that of the other nondriving cells. This is because it uses TGs at the output nodes, while the TFA uses PTL at the output node. The TGs in the TGA cell are controlled by full-swing internal signals, which make them switch ON or OFF very fast. Fig. 12 demonstrates that the proposed cells are very fast over a wide range of loads. At a load of FO32, the proposed DD-FA, for example, is faster by about 8%, 5%, 11%, 22%, 31%, 50%, 62%, 53%, and 32% as compared with the C-CMOS, Mirror, CPL-TG, HPSC, NEW-HPSC, Ours1, HCTG, TFA, and TGA, respectively.

As stated before, the PDP is a very important quantitative metric for evaluating the efficiency of circuits, because it is a compromise between power consumption and delay. Fig. 13 shows the PDP results of the full adders against a wide range of loads. Once again, for better illustration, the maximum PDP value on the vertical axis of Fig. 13 is limited to 35. As expected from the power and delay results, the 10T cells do not have low PDP even at small loads, and they all fail to work at large loads. The SERF fails to work at FO32 and FO64, and the 13A fails at the load of FO64. There is close competition between the other full adder cells. At the loads of FO64, FO32, FO16, and FO4, the lowest PDP belongs to the proposed RSD-FA, SD-FA, SD-FA, and RSD-FA, respectively, and at FO8, the C-CMOS has the lowest PDP. The proposed RSD-FA, for example, offers lower PDP by about 44%, 44.6%, 56%, 70%, 76%, 51%, 42%, 32%, and 36% as compared with the C-CMOS, Mirror, CPL-TG, HPSC, NEW-HPSC, Ours1, HCTG, TFA, and TGA, respectively, at the load of FO4. In conclusion, the proposed full adders are superior to the other cells in terms of energy consumption, which is an important concern for portable electronic systems.

### C. Performance Analysis Against $V_{dd}$

The efficiency of the different full adder cells is studied with regard to power supply variation. As reported in ITRS, the nominal $V_{dd}$ for the 32-nm CNFET technology is 0.9 V. In this simulation, we examine the performance metrics of the full adders at 0.8, 0.9, and 1 V with 100-MHz operating frequency and a standard fan-out of FO4. The results are tabulated in Table V. The shaded boxes in Table V indicate the best results for PDP. Simulation results confirm that the proposed designs outperform their counterparts in terms of power, delay, and PDP. Among the proposed designs, the RSD-FA and the SD-FA have the least PDP. For example, at 0.8 V supply, the RSD-FA offers lower PDP by about 77%, 57%, 5%, 23%, and 8% as compared with the PT-FA, FL-FA, DD-FA, SD-FA, and ND-FA, respectively. Among the hybrid cells (i.e., HPSC, NEW-HPSC, and HCTG), the HCTG design has the lowest PDP. At 1 V $V_{dd}$, it has roughly 66% and 48% lower PDP as compared with the HPSC and NEW-HPSC, respectively. At 0.8 and 0.9 V supply, the static C-CMOS offers lower PDP than the static Mirror, but the trend is reversed at 1 V. Considering the results of Table V, there is an improvement in the PDP measure of the static cells, i.e., C-CMOS and Mirror, as compared with the hybrid cells. This is because static full adder cells have smaller input capacitances than hybrid cells in CNFET technology, whereas in MOSFET technology the input capacitances of hybrid cells are smaller than those of static cells. It is worth noting that in MOSFET technology, p-type transistors are sized such that their width is about twice the width of n-type transistors, which results in larger input capacitances. In CNFET technology, by contrast, the p- and n-type transistors are set to equal size because of their equal mobility. Therefore, it is expected that results in CNFET technology differ from those in MOSFET technology.

TABLE V  
AVERAGE POWER CONSUMPTION (E-6 W), DELAY (E-11 S), AND PDP (E-17 J) RESULTS FOR DIFFERENT POWER SUPPLIES

| Designs  |            | $V_{dd}$ = 0.8 V |        |         | $V_{dd}$ = 0.9 V |         |         | $V_{dd}$ = 1 V |        |        |
|----------|------------|------------------|--------|---------|------------------|---------|---------|----------------|--------|--------|
| Name     | Reference  | Power            | Delay  | PDP     | Power            | Delay   | PDP     | Power          | Delay  | PDP    |
| PT-FA    | [Proposed] | 0.14487          | 2.8200 | 4.0854  | 0.18437          | 1.9986  | 3.6849  | 0.28550        | 1.6453 | 4.6972 |
| FL-FA    | [Proposed] | 0.13943          | 1.5419 | 2.1499  | 0.19613          | 1.4632  | 2.8697  | 0.15744        | 1.5557 | 2.4492 |
| DD-FA    | [Proposed] | 0.08856          | 1.0918 | 0.96695 | 0.12837          | 0.9799  | 1.2579  | 0.92701        | 0.2228 | 2.0654 |
| SD-FA    | [Proposed] | 0.11646          | 1.0251 | 1.1939  | 0.15178          | 0.9332  | 1.4164  | 0.13651        | 0.8830 | 1.2054 |
| RSD-FA   | [Proposed] | 0.08799          | 1.0366 | 0.91215 | 0.09134          | 0.94278 | 0.86113 | 0.16702        | 0.8650 | 1.4448 |
| ND-FA    | [Proposed] | 0.10835          | 0.9174 | 0.99405 | 0.14939          | 0.76823 | 1.1477  | 0.23007        | 0.7089 | 1.6312 |
| C-CMOS   | [1]        | 0.08354          | 1.3193 | 1.1022  | 0.12460          | 1.2355  | 1.5395  | 0.21391        | 1.1912 | 2.5481 |
| Mirror   | [2]        | 0.10313          | 1.3076 | 1.3485  | 0.12632          | 1.2321  | 1.5564  | 0.18469        | 1.1347 | 2.0957 |
| CPL-TG   | [4]        | 0.10838          | 1.5927 | 1.7262  | 0.13931          | 1.4260  | 1.9865  | 0.21243        | 1.3414 | 2.8495 |
| HPSC     | [8]        | 0.08821          | 1.8678 | 1.6477  | 0.09536          | 3.0654  | 2.9234  | 0.24879        | 2.5647 | 6.3809 |
| NEW-HPSC | [9]        | 0.08352          | 3.4162 | 2.8533  | 0.12363          | 3.0232  | 3.7374  | 0.26033        | 1.6106 | 4.1928 |
| Ours1    | [10]       | 0.08817          | 1.2049 | 1.0624  | 0.16324          | 1.0866  | 1.7738  | 0.18927        | 0.9831 | 1.8608 |
| HCTG     | [11]       | 0.08667          | 1.3245 | 1.1479  | 0.12443          | 1.2116  | 1.5076  | 0.18633        | 1.1543 | 2.1508 |
| TFA      | [3]        | 0.11328          | 1.3232 | 1.4990  | 0.10938          | 1.1700  | 1.2798  | 0.16521        | 1.1837 | 1.9555 |
| TGA      | [1]        | 0.09291          | 1.1220 | 1.0425  | 0.13519          | 1.0104  | 1.3659  | 0.19402        | 0.9202 | 1.7855 |
| SERF     | [5]        | 2.0206           | 1000.0 | 2020.7  | 3.3263           | 985.27  | 3277.3  | 4.9375         | 823.08 | 4063.9 |
| 13A      | [6]        | 3.3872           | 999.70 | 3386.2  | 5.8195           | 950.78  | 5533.1  | 9.3502         | 738.00 | 6900.4 |
| CLRCL    | [7]        | 3.6135           | 35.654 | 128.84  | 5.9031           | 23.118  | 136.47  | 8.9722         | 16.295 | 146.20 |

All 10T full adders display higher PDP than the others because of the threshold-loss problem at internal nodes and the lack of driving capability at the output nodes. Among the full adder cells that use PTL and TGL, i.e., the CPL-TG, Ours1, TFA, and TGA, only the CPL-TG has driving capability. Among these cells, with an output load of FO4, the TGA, the TFA, and the TGA have the lowest PDP at 0.8, 0.9, and 1 V supplies, respectively.

### D. Real Environment

In this simulation, we embed all the full adders in an RCA with a word length of 4 to 32 bits, without putting any buffers at the intermediate cascaded stages. The RCA test bed simulates a real environment for full adders. It is quite possible for a full adder cell to perform well when used alone but to function poorly in large circuits. Therefore, it is very important to study the efficiency of full adder cells in large circuits. A good full adder cell operates well even in the presence of large fan-outs or a cascaded structure. The results obtained from this simulation are tabulated in Table VI. The symbols $\Psi$, $\Theta$, and $\Phi$ indicate the average power consumption, delay, and PDP metrics, respectively, for an n-bit RCA ($n = 4, 8, 16, 32$). Furthermore, shaded boxes in Table VI display the best results in terms of power consumption, delay, and PDP.

Both 10T full adders, i.e., the SERF and 13A, fail to work for all RCA word lengths. They have neither driving power nor full voltage swing at their outputs. Consequently, they do not deliver the desired output signals to the following full adders in the cascade. Simulation results show that they fail to work even for a narrow RCA word length.

The other designs without driving capability, except for the HCTG, fail to work only when the word length of the RCA is 32 bits. In this case, the output signals become weaker as they propagate along the path of the wide RCA from the inputs to the outputs. This is because the Ours1, TFA, TGA, and CLRCL circuits lack driving power. They do function with smaller RCA widths, but their PDP increases sharply as the width of the RCA increases. The HCTG has reasonable functionality although it lacks driving power. For example, it offers an improvement in PDP of about 54%, 60%, and 99.97% as compared with the Ours1, TFA, and CLRCL, respectively, in the case of the 8-bit RCA. Its PDP is slightly larger than that of the TGA cell in the 8-bit RCA.

TABLE VI  
AVERAGE POWER CONSUMPTION (E-6 W), DELAY (E-11 S), AND PDP (E-17 J) RESULTS FOR DIFFERENT WORD LENGTHS OF RCA (\* MEANS THE PROPOSED DESIGN)

| Design       | 4-bit $\Psi$ | 4-bit $\Theta$ | 4-bit $\Phi$ | 8-bit $\Psi$ | 8-bit $\Theta$ | 8-bit $\Phi$ | 16-bit $\Psi$ | 16-bit $\Theta$ | 16-bit $\Phi$ | 32-bit $\Psi$ | 32-bit $\Theta$ | 32-bit $\Phi$ |
|--------------|---------|--------|---------|---------|--------|--------|---------|--------|--------|--------|--------|--------|
| PT-FA*       | 0.36565 | 3.1801 | 1.1628  | 0.64924 | 5.5700 | 3.6163 | 1.1767  | 10.424 | 12.266 | 2.2160 | 20.185 | 44.731 |
| FL-FA*       | 0.21114 | 2.9644 | 0.62592 | 0.34387 | 5.3048 | 1.8242 | 0.59237 | 9.9290 | 5.8816 | 1.0516 | 19.167 | 20.155 |
| DD-FA*       | 0.20904 | 3.4758 | 0.72659 | 0.36045 | 7.1279 | 2.5693 | 0.66387 | 14.406 | 9.5635 | 1.2738 | 28.975 | 36.908 |
| SD-FA*       | 0.19058 | 3.0672 | 0.58455 | 0.32409 | 6.2542 | 2.0269 | 0.59611 | 12.594 | 7.5072 | 1.1436 | 25.317 | 28.952 |
| RSD-FA*      | 0.17970 | 2.9141 | 0.52368 | 0.30697 | 5.9552 | 1.8280 | 0.54949 | 12.066 | 6.6302 | 1.0406 | 24.401 | 25.392 |
| ND-FA*       | 0.23624 | 4.2858 | 1.0125  | 0.90033 | 14.015 | 12.619 | 5.8077  | 52.536 | 305.11 | 46.321 | 200.01 | 9264.6 |
| C-CMOS [1]   | 0.18473 | 3.3881 | 0.62589 | 0.33045 | 6.7337 | 2.2252 | 0.61977 | 13.442 | 8.3312 | 1.1851 | 26.416 | 30.287 |
| Mirror [2]   | 0.18511 | 3.2498 | 0.6015  | 0.32717 | 6.2766 | 2.0535 | 0.62140 | 12.215 | 7.5905 | 1.2064 | 24.471 | 29.522 |
| CPL-TG [4]   | 0.30906 | 3.9342 | 1.2159  | 0.56457 | 7.7269 | 4.3624 | 1.0830  | 15.343 | 16.616 | 2.1120 | 30.677 | 64.790 |
| HPSC [8]     | 0.28060 | 11.963 | 3.3570  | 0.51341 | 24.192 | 12.421 | 0.97824 | 50.326 | 49.231 | 1.9362 | 107.37 | 207.88 |
| NEW-HPSC [9] | 0.29909 | 13.070 | 3.9092  | 0.57435 | 27.919 | 16.036 | 1.0724  | 56.267 | 60.343 | 2.0931 | 116.01 | 242.82 |
| Ours1 [10]   | 0.24135 | 8.9685 | 2.1646  | 0.59610 | 39.618 | 23.617 | 2.4307  | 223.62 | 543.56 | Failed | Failed | Failed |
| HCTG [11]    | 0.20141 | 5.1467 | 1.0366  | 0.55235 | 19.542 | 10.794 | 2.7199  | 85.031 | 231.27 | 18.570 | 399.06 | 7410.6 |
| TFA [3]      | 0.21576 | 8.5345 | 1.8414  | 0.65132 | 42.454 | 27.651 | 3.7409  | 258.47 | 966.89 | Failed | Failed | Failed |
| TGA [1]      | 0.22418 | 4.6391 | 1.0400  | 0.64551 | 15.823 | 10.214 | 2.9641  | 52.950 | 156.95 | Failed | Failed | Failed |
| CLRCL [7]    | 13.270  | 509.96 | 6767.4  | 39.211  | 1025.1 | 40194  | 134.90  | 2083.4 | 281050 | Failed | Failed | Failed |

Owing to its shorter critical path, the static Mirror design is slightly faster than the C-CMOS. In terms of power consumption, the two are very close. Overall, the PDP of the Mirror design is lower than that of the C-CMOS. Unlike in MOSFET technology, in CNFET technology the Mirror and C-CMOS have lower PDP than the CPL-TG, HPSC, and NEW-HPSC. Because of the small input capacitance of the Mirror and C-CMOS cells in CNFET technology, their delay and power consumption are lower than those of the CPL-TG, HPSC, and NEW-HPSC. Therefore, for the same full adder designs, CNFET technology yields a result contrary to that observed in MOSFET technology. In the 16-bit RCA, the Mirror design improves PDP by about 54%, 84%, and 87% compared with the CPL-TG, HPSC, and NEW-HPSC, respectively.
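The percentages quoted in this section follow directly from the entries of Table VI. As a minimal sketch of that arithmetic, the snippet below recomputes the PDP of the Mirror design in the 16-bit RCA and its relative improvement over the CPL-TG, HPSC, and NEW-HPSC. The helper names `pdp` and `improvement_pct` are illustrative, and the numeric values are copied from the 16-bit columns of Table VI.

```python
def pdp(power_e6_w: float, delay_e11_s: float) -> float:
    """Power-delay product in units of E-17 J (E-6 W times E-11 s)."""
    return power_e6_w * delay_e11_s


def improvement_pct(proposed: float, reference: float) -> float:
    """Relative reduction of `proposed` with respect to `reference`, in percent."""
    return 100.0 * (1.0 - proposed / reference)


# 16-bit RCA entries from Table VI.
mirror_pdp = pdp(0.62140, 12.215)  # close to the tabulated 7.5905
reference_pdp = {"CPL-TG": 16.616, "HPSC": 49.231, "NEW-HPSC": 60.343}

if __name__ == "__main__":
    for name, ref in reference_pdp.items():
        print(f"Mirror vs {name}: {improvement_pct(7.5905, ref):.1f}% lower PDP")
```

The computed values (roughly 54.3%, 84.6%, and 87.4%) agree with the figures quoted in the text, which appear to be truncated to whole percentages.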

Table VI confirms that the proposed full adders achieve excellent average power dissipation, speed, and PDP compared with previous classic and state-of-the-art full adders. The proposed FL-FA and RSD-FA exhibit the best results overall. The worst results among the proposed cells belong to the ND-FA because it lacks driving power. As the RCA word length increases, the influence of driving capability on the performance measures of the circuits becomes more evident.

Regardless of the RCA width, the RSD-FA consumes less power than every other full adder cell. In the 4-bit RCA, the RSD-FA is also the fastest design and consequently has the lowest PDP of all cells. For wider RCA word lengths, however, the FL-FA outruns the RSD-FA in speed and consequently achieves the lowest PDP. For example, in a 16-bit RCA, the proposed FL-FA has a lower PDP by about 29%, 22%, 64%, 88%, 90%, 98%, 97%, 99.39%, 96%, and 99.99% compared with the C-CMOS, Mirror, CPL-TG, HPSC, NEW-HPSC, Ours1, HCTG, TFA, TGA, and CLRCL, respectively. In addition, the proposed FL-FA is superior to the other proposed cells. In the 32-bit RCA, for instance, the FL-FA reduces PDP by about 54%, 45%, 30%, 20%, and 99% compared with the PT-FA, DD-FA, SD-FA, RSD-FA, and ND-FA designs, respectively. Therefore, the proposed designs can be used effectively in larger circuit structures.

Fig. 14. NIC for the full adders.

#### E. Noise Immunity Analysis

The NICs and ANTE results for all the full adders are shown in Figs. 14 and 15, respectively. The CLRCL has the highest ANTE, followed by the TFA, HPSC, Ours1, and NEW-HPSC. The proposed designs have very similar results to one another, with the SD-FA having a slightly larger ANTE than the other proposed cells. In terms of noise immunity, the proposed designs are closely competitive with the C-CMOS, Mirror, CPL-TG, HCTG, TGA, SERF, and 13A full adders. Therefore, from a noise immunity point of view, the proposed cells behave reasonably.

#### F. Process Variation Analysis

Fig. 16 shows the simulation results of the full adders against the diameter variations of CNTs, obtained with MC transient simulation. The vertical axis of Fig. 16 is limited to a maximum value of 1.8. Therefore, the 10T full adders and the proposed PT-FA cell are not shown in Fig. 16.

Fig. 15. ANTE for the full adders.

Fig. 16. PDP variations versus CNT diameter variations.

It is worth noting that the CLRCL design is the full adder most susceptible to process variations of CNTs, followed by the 13A and SERF cells. Therefore, the 10T full adders are not reliable from this point of view. After the 10T full adders, the proposed PT-FA cell and the CPL-TG are the cells most sensitive to diameter variations of CNTs. The static C-CMOS and Mirror designs offer better robustness than the CPL-TG, HPSC, NEW-HPSC, Ours1, HCTG, TFA, and TGA cells. The Ours1, TGA, and TFA designs are in turn more robust than the hybrid cells, i.e., the HPSC, NEW-HPSC, and HCTG.

The proposed designs, except for the PT-FA, are the most robust cells among all the full adders. Among the proposed cells, the DD-FA and SD-FA have the best results, followed by the RSD-FA and FL-FA. As a result, the proposed designs are very reliable over a wide range of standard deviation (STD) of the CNT diameter from its mean value, from 0.04 to 0.2 nm. The ability of a circuit to work with high reliability under process variations is a remarkable feature in the CNFET technology.
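To illustrate what sweeping the STD of the CNT diameter means, the sketch below draws diameters from a normal distribution and maps them to a threshold voltage. It rests on several assumptions that are not stated in this section: a Gaussian diameter distribution, a hypothetical mean diameter of 1.5 nm, and the widely used first-order approximation $V_{th} \approx 0.43 / D_{CNT}$ (in volts, with $D_{CNT}$ in nm). It is not a substitute for the HSPICE MC transient simulation behind Fig. 16 and does not estimate PDP.

```python
import random
import statistics
from typing import List, Tuple

VTH_COEFF_V_NM = 0.43    # assumed first-order model: Vth ~ 0.43 / D_CNT(nm)
MEAN_DIAMETER_NM = 1.5   # hypothetical mean CNT diameter, for illustration only


def sample_vth(std_nm: float, n_samples: int = 10000, seed: int = 0) -> List[float]:
    """Sample threshold voltages for Gaussian-distributed CNT diameters."""
    rng = random.Random(seed)
    vths = []
    for _ in range(n_samples):
        d = rng.gauss(MEAN_DIAMETER_NM, std_nm)
        if d <= 0.0:
            continue  # discard non-physical draws (very unlikely for these STDs)
        vths.append(VTH_COEFF_V_NM / d)
    return vths


def vth_spread(std_nm: float) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the sampled threshold voltage."""
    vths = sample_vth(std_nm)
    return statistics.mean(vths), statistics.stdev(vths)


if __name__ == "__main__":
    # Sweep the diameter STD over the range discussed in the text.
    for std_nm in (0.04, 0.08, 0.12, 0.16, 0.20):
        mean_v, std_v = vth_spread(std_nm)
        print(f"STD = {std_nm:.2f} nm -> Vth mean = {mean_v:.4f} V, STD = {std_v:.4f} V")
```

Under these assumptions the spread of the threshold voltage grows with the diameter STD, which is the kind of device-level shift that a robust cell has to absorb without a large change in PDP.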

## VI. CONCLUSION

As MOSFET technology encounters serious problems in the nanoscale region, new technologies have recently emerged, such as carbon nanotube field-effect transistor (CNFET) devices, which are very likely to replace the conventional bulk MOSFET technology in the future. CNFET devices have exceptional features, such as ballistic transport and low OFF-current, which enable high-speed and low-power circuit designs. We were therefore motivated to study the performance of full adders in the CNFET technology. We observed that in some cases, the results for full adders were contrary to those in MOSFET technology. As an example, in the MOSFET technology, the PDP of the HPSC full adder is lower than that of the C-CMOS style, whereas in CNFET technology the trend is reversed.

Based on a synergistic combination of PTL and TGL, some novel two-input XOR/XNOR circuits were proposed. They were then employed to form novel low-power and high-speed full adder cells. In fact, five full adders with driving capability were proposed, each of which has its own merits.

To evaluate the performance measures of the proposed full adders, comprehensive simulations were performed using the Synopsys HSPICE tool in the 32-nm CNFET technology node. The proposed cells perform well under supply voltage scaling and under different output loads. When they are used in large adders, such as the RCA, they outperform their counterparts, which makes them well suited to large circuits. In addition, the susceptibility of the full adders to both noise and process variations was considered. From a noise point of view, we observed that the proposed designs were closely competitive with the other full adders and, in some cases, outperformed them. Finally, to study the robustness of the full adders against the diameter variations of the CNTs embedded in the channel of CNFET devices, we performed MC transient analysis. Simulation results confirmed that the proposed cells are more robust than the other cells with regard to process variation, making them suitable for implementation in the CNFET technology.

## REFERENCES

- [1] N. H. E. Weste and K. Eshraghian, *Principles of CMOS VLSI Design: A System Perspective*. Reading, MA, USA: Addison-Wesley, 1988.
- [2] R. Zimmermann and W. Fichtner, "Low-power logic styles: CMOS versus pass-transistor logic," *IEEE J. Solid-State Circuits*, vol. 32, no. 7, pp. 1079–1090, Jul. 1997.
- [3] N. Zhuang and H. Wu, "A new design of the CMOS full adder," *IEEE J. Solid-State Circuits*, vol. 27, no. 5, pp. 840–844, May 1992.
- [4] I. S. Abu-Khater, A. Bellaouar, and M. I. Elmasry, "Circuit techniques for CMOS low-power high-performance multipliers," *IEEE J. Solid-State Circuits*, vol. 31, no. 10, pp. 1535–1546, Oct. 1996.
- [5] R. Shalem, E. John, and L. K. John, "A novel low power energy recovery full adder cell," in *Proc. 9th IEEE Great Lakes Symp. VLSI*, Feb. 1999, pp. 380–383.
- [6] H. T. Bui, Y. Wang, and Y. Jiang, "Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 49, no. 1, pp. 25–30, Jan. 2002.
- [7] J.-F. Lin, Y.-T. Hwang, M.-H. Sheu, and C.-C. Ho, "A novel high-speed and energy efficient 10-transistor full adder design," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 5, pp. 1050–1059, May 2007.
- [8] C.-H. Chang, J. Gu, and M. Zhang, "A review of 0.18-μm full adder performances for tree structured arithmetic circuits," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 13, no. 6, pp. 686–695, Jun. 2005.
- [9] S. Goel, A. Kumar, and M. Bayoumi, "Design of robust, energy-efficient full adders for deep-submicrometer design using hybrid-CMOS logic style," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 12, pp. 1309–1321, Dec. 2006.
- [10] M. Aguirre-Hernandez and M. Linares-Aranda, "CMOS full-adders for energy-efficient arithmetic applications," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 4, pp. 718–721, Apr. 2011.

- [11] P. Bhattacharyya, B. Kundu, S. Ghosh, V. Kumar, and A. Dandapat, "Performance analysis of a low-power high-speed hybrid 1-bit full adder circuit," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 23, no. 10, pp. 2001–2008, Oct. 2015.
- [12] G. E. Moore, "Progress in digital integrated electronics," in *IEDM Tech. Dig.*, 1975, pp. 11–13.
- [13] S. K. Sinha, K. Kumar, and S. Chaudhury, "CNTFET: The emerging post-CMOS device," in *Proc. Int. Conf. Signal Process. Commun. (ICSC)*, Dec. 2013, pp. 372–374.
- [14] J. Appenzeller, "Carbon nanotubes for high-performance electronics—Progress and prospect," *Proc. IEEE*, vol. 96, no. 2, pp. 201–211, Feb. 2008.
- [15] A. Rahman, J. Guo, S. Datta, and M. S. Lundstrom, "Theory of ballistic nanotransistors," *IEEE Trans. Electron Devices*, vol. 50, no. 9, pp. 1853–1864, Sep. 2003.
- [16] S. Lin, Y. B. Kim, and F. Lombardi, "CNTFET-based design of ternary logic gates and arithmetic circuits," *IEEE Trans. Nanotechnol.*, vol. 10, no. 2, pp. 217–225, Mar. 2011.
- [17] A. Raychowdhury and K. Roy, "Carbon-nanotube-based voltage-mode multiple-valued logic design," *IEEE Trans. Nanotechnol.*, vol. 4, no. 2, pp. 168–179, Mar. 2005.
- [18] J. Deng and H.-S. P. Wong, "A circuit-compatible SPICE model for enhancement mode carbon nanotube field effect transistors," in *Proc. Int. Conf. Simulation Semiconductor Process. Devices*, Sep. 2006, pp. 166–169.
- [19] H. J. M. Veendrick, "Short-circuit dissipation of static CMOS circuitry and its impact on the design of buffer circuits," *IEEE J. Solid-State Circuits*, vol. 19, no. 4, pp. 468–473, Aug. 1984.
- [20] D. Radhakrishnan, "Low-voltage low-power CMOS full adder," *IEE Proc.-Circuits Devices Syst.*, vol. 148, no. 1, pp. 19–24, Feb. 2001.
- [21] M. Zhang, J. Gu, and C.-H. Chang, "A novel hybrid pass logic with static CMOS output drive full-adder cell," in *Proc. IEEE Int. Symp. Circuits Syst.*, May 2003, pp. V-317–V-320.
- [22] J. Deng and H. S. P. Wong, "A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application—Part I: Model of the intrinsic channel region," *IEEE Trans. Electron Devices*, vol. 54, no. 12, pp. 3186–3194, Dec. 2007.
- [23] J. Deng and H. S. P. Wong, "A compact SPICE model for carbon-nanotube field-effect transistors including nonidealities and its application—Part II: Full device model and circuit performance benchmarking," *IEEE Trans. Electron Devices*, vol. 54, no. 12, pp. 3195–3205, Dec. 2007.
- [24] J.-M. Wang, S.-C. Fang, and W.-S. Feng, "New efficient designs for XOR and XNOR functions on the transistor level," *IEEE J. Solid-State Circuits*, vol. 29, no. 7, pp. 780–786, Jul. 1994.
- [25] A. M. Shams, T. K. Darwish, and M. A. Bayoumi, "Performance analysis of low-power 1-bit CMOS full adder cells," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 10, no. 1, pp. 20–29, Feb. 2002.
- [26] *Emerging Research Devices*, accessed on 2015. [Online]. Available: [http://www.itrs.net/links/2007ITRS/2007_chapters/2007_ERD.pdf](http://www.itrs.net/links/2007ITRS/2007_chapters/2007_ERD.pdf)
- [27] A. Shams and M. Bayoumi, "Performance evaluation of 1-bit CMOS adder cells," in *Proc. IEEE ISCAS*, Jun. 1999, pp. 27–30.
- [28] J. Zhang, N. Patil, A. Hazeghi, H.-S. P. Wong, and S. Mitra, "Characterization and design of logic circuits in the presence of carbon nanotube density variations," *IEEE Trans. Comput.-Aided Design Integr. Circuits Syst.*, vol. 30, no. 8, pp. 1103–1113, Aug. 2011.
- [29] K. El Shabrawy, K. Maharatna, D. Bagnall, and B. M. Al-Hashimi, "Modeling SWCNT bandgap and effective mass variation using a Monte Carlo approach," *IEEE Trans. Nanotechnol.*, vol. 9, no. 2, pp. 184–193, Mar. 2010.
- [30] Arizona State University, Tucson, AZ, USA. *Predictive Technology Models*, accessed on 2015. [Online]. Available: [www.eas.asu.edu/~ptm/](http://www.eas.asu.edu/~ptm/)



**Yavar Safaei Mehrabani** received the B.S. degree in computer hardware engineering from Shomal University, Amol, Iran, in 2007, and the M.S. and Ph.D. degrees in computer architecture from the Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran, in 2011 and 2015, respectively.

His current research interests include nanoelectronics with an emphasis on carbon nanotube field effect transistor, design of ultra low-power and high-performance exact and inexact digital VLSI circuits, computer arithmetic circuits, and application specific instruction set processor design.

Dr. Safaei Mehrabani was a recipient of a number of awards from the Iran Nanotechnology Initiative Council and Iran's National Elites Foundation.



**Mohammad Eshghi** (M'16) was born in Shahroud, Iran. He received the B.S. degree in electrical engineering from Sharif University, Tehran, Iran, the M.S. degree from Ohio University, Athens, OH, USA, and the Ph.D. degree from Ohio State University, Columbus, OH, USA.

He has been with the Electrical and Computer Engineering Faculty, Shahid Beheshti University, Tehran, since 1994. His current research interests include VLSI design, digital signal processing, optical character recognition and digital circuit design and implementation on field-programmable gate array, chaotic systems, quantum computing, and application specific instruction set processor design.