

# High-Speed Hybrid-Logic Full Adder Using High-Performance 10-T XOR–XNOR Cell

Jyoti Kandpal, Abhishek Tomar, *Member, IEEE*, Mayur Agarwal<sup>✉</sup>, *Member, IEEE*, and K. K. Sharma

**Abstract**—Hybrid logic style is widely used to implement full adder (FA) circuits. The performance of a hybrid FA in terms of delay, power, and driving capability depends largely on the performance of its XOR–XNOR circuit. In this article, a high-speed, low-power 10-T XOR–XNOR circuit is proposed, which provides full swing outputs simultaneously with improved delay performance. The performance of the proposed circuit is measured by simulating it in the Cadence Virtuoso environment using 90-nm CMOS technology. The proposed circuit reduces the power delay product (PDP) by at least 7.5% compared with the available XOR–XNOR modules. Four different FA designs are also proposed in this article, utilizing the proposed XOR–XNOR circuit and available sum and carry modules. The proposed FAs provide a 2%–28.13% improvement in PDP over other architectures. To measure the driving capabilities, the proposed FAs are embedded in 2-, 4-, and 8-bit cascaded full adder (CFA) structures. Results show that two of the proposed FAs provide the best performance among all the FAs for a higher number of bits.

**Index Terms**—Cascaded full adder (CFA), full voltage swing, hybrid full adder (FA), XOR–XNOR circuit.

## I. INTRODUCTION

NOWADAYS, portable electronic gadgets, such as cellular phones, personal digital assistants (PDAs), and notebooks, form an integral part of life. To get the best out of these electronic systems, designers strive for small, high-speed, and energy-efficient circuits. These electronic systems mostly comprise arithmetic circuits. An adder is a fundamental component of most arithmetic circuits, such as multipliers [1]. These arithmetic circuits are extensively used in the data paths, which consume almost one-third of the power in high-performance microprocessors [2]. Therefore, enhancing the performance of the adders significantly improves the performance of the whole system [1]–[3].

To realize a full adder (FA) circuit, several static CMOS logic styles have been presented [2]–[4]. These logic styles can be broadly classified into two categories: classical design style and hybrid design style. In classical design style,

Manuscript received July 14, 2019; revised October 31, 2019 and January 30, 2020; accepted March 17, 2020. Date of publication April 15, 2020; date of current version June 1, 2020. The work of Jyoti Kandpal was supported by a scholarship from the Ministry of Human Resource and Development, Government of India, through the project Technical Education Quality Improvement Programme-III (TEQIP-III). (*Corresponding author:* Mayur Agarwal.)

The authors are with the Department of Electronics and Communication Engineering, College of Technology, G. B. Pant University of Agriculture and Technology, Pantnagar 263145, India (e-mail: agrmayur@gmail.com).

Color versions of one or more of the figures in this article are available online at <http://ieeexplore.ieee.org>.

Digital Object Identifier 10.1109/TVLSI.2020.2983850



Fig. 1. Block diagram of hybrid logic FA circuit [6].

the FA is designed in a single module using MOS transistors. The complementary CMOS (C-CMOS) FA [2] is an example of this approach. This design uses 28 transistors to realize the pull-up and pull-down networks of an FA. It provides full swing outputs and robustness against voltage scaling and transistor sizing. The main drawback of this circuit is its high input capacitance, as each of the inputs is connected to the gates of at least one pMOS and one nMOS transistor, which degrades the speed of the adder. Another example of the classical approach is the complementary pass-transistor logic (CPL) FA [2]. This structure provides high speed, full swing output, and good driving capability because of the high-speed differential stage, cross-coupled pMOS structure, and static inverter at the output. However, the power dissipation of this circuit is high due to a large number of internal nodes in the circuit. Also, the layout of this circuit is not symmetrical because of the irregular transistor arrangement.

In the classical approach, FA can also be designed using pass transistors [5]. However, the pass transistors have an inherent threshold voltage drop problem. When logic “1” and logic “0” are passed through nMOS and pMOS, respectively, full swing logic “1” and logic “0” are not obtained at the output. To resolve this issue, a transmission gate (TG)-based approach has been developed. In this style, an nMOS transistor and a pMOS transistor are connected in parallel and controlled by complementary control signals. These pMOS and nMOS are turned on simultaneously and provide paths to both the logic (logic “1” and logic “0”) to provide full swing outputs. TG-based adder consumes low power; however, it has weak driving capability. Performance of this circuit can be enhanced by using the buffer at the output.

In hybrid design style, FA structure is divided into three modules [3], [6], [7] as shown in Fig. 1. Module I generates full swing XOR and XNOR outputs of two input signals (A and B) simultaneously. These XOR–XNOR signals must

have good driving capabilities as these signals have to drive the other two modules. Module II and Module III are the sum and carry circuits, which produce the sum and carry ( $C_{OUT}$ ) outputs, respectively, using the outputs of Module I and the third input signal ( $C_{IN}$ ). The main advantage of hybrid style is that all the modules can be optimized at the individual level, and the number of transistors can be reduced, which reduces the internal power dissipating nodes. Hybrid style FAs perform well as a single unit or in small chains; however, they lack driving capability in higher bit adders implemented through cascading stages [8].

Various researchers have presented FA designs using hybrid logic style which provide optimum performance without degrading the output. Vastrabacka *et al.* [9] presented an XOR–XNOR module using pass transistor logic (PTL) in which the sum and carry modules are realized using 2-to-1 multiplexer circuits. Zhang *et al.* [10] presented a hybrid FA cell in which PTL is used to generate XOR–XNOR outputs simultaneously, and C-CMOS style is used to implement the carry module. In another design, the new low-power and high-speed (LPHS) adder [11], the XOR–XNOR circuit is implemented using feedback transistors, while the sum and carry modules are implemented using PTL and a 2-to-1 multiplexer, respectively.

The performance of the XOR–XNOR circuit plays a vital role in the performance of a hybrid FA design. Various approaches to designing XOR–XNOR circuits have been presented in recent years. These approaches can be broadly classified into two categories. In the first approach, the XOR circuit is synthesized initially, and then the XNOR output is generated using an inverter [e.g., transmission gate adder (TGA)]. This approach has the drawback that the XOR and XNOR outputs are not generated simultaneously, which increases the chance of false switching and glitches in the outputs of modules II and III [12]. In the second approach, the XOR–XNOR circuit is designed such that the XOR and XNOR outputs are generated simultaneously, and the delay difference between the XOR and XNOR signals is kept as small as possible. An XOR–XNOR circuit using CPL is presented in [6] that generates the XOR–XNOR outputs simultaneously. The output voltage levels in this circuit are recovered using feedback transistors. However, the delay and power of this circuit remain higher due to the feedback transistors. Goel *et al.* [3] eliminated the NOT gate from the critical path; however, the circuit delay remains higher due to the cross-coupled structure.

Radhakrishnan [13] presented a design of the XOR–XNOR circuit with only six transistors. In this design, two complementary feedback transistors are used to restore the weak logic at the complementary output nodes (XOR and XNOR) when both the inputs have the same value (either “00” or “11”). This circuit suffers from a high worst case delay for the inputs “11” and “00” because, in these cases, the outputs reach their final voltage levels in two steps. This issue of slow response is resolved by Chang *et al.* [4], who used two additional nMOS and pMOS transistors at the XOR and XNOR output nodes, respectively. This circuit provides good driving capability and full output swing. However, the cross-coupled structure adds an extra parasitic capacitance to the XOR–XNOR output nodes.



Fig. 2. XOR–XNOR circuits given in (a) [13] (b) [15].

The structure of the XOR–XNOR circuit is further improved by Valashani and Mirzakuchaki [14], who used an inverter. This structure provides lower critical path delay; however, power dissipation remains higher.

Naseri and Timarchi [15] presented an improved design of the XOR–XNOR circuit, which is implemented using 12 transistors. This circuit consumes low power and provides better delay performance than those of other circuits. However, this circuit requires an external inverter. The performance of this circuit can be improved further by eliminating this external inverter. In this article, a new XOR–XNOR circuit is proposed, which provides good driving capabilities and full swing XOR–XNOR outputs without using any external inverter. In this design, a feedback circuitry and internal NOT gate help in getting full swing output for all the transitions. By proper sizing of different transistors, not only the delay but also the power consumption of the proposed circuit is reduced. Using the proposed XOR–XNOR circuit, four different designs of FAs are also presented in this article. The proposed FAs show improvement in terms of power delay product (PDP) and driving capability than those of other structures.

The remainder of this article is organized as follows. In Section II, a new two input XOR–XNOR structure is proposed. In Section III, different types of modules—module II and module III—are presented. After that, four different FA designs using the proposed XOR–XNOR circuit are also proposed in the same section. In Section IV, the performance of the proposed XOR–XNOR circuit and FAs in terms of power, speed, and PDP is compared with those of available XOR–XNOR circuits and FAs. Finally, this article is concluded in Section V.

## II. PROPOSED XOR–XNOR CIRCUIT

In recent years, various approaches to designing XOR–XNOR circuits have been presented, as discussed in Section I. An XOR–XNOR circuit [13] which provides full swing outputs using only six transistors is shown in Fig. 2(a). This configuration is realized using CPL and feedback restorer transistors. It provides good delay performance for the inputs AB: “01” and “10.” However, a switching delay arises at the output for the inputs AB: “11” and “00.” The issue of higher delay for these inputs is resolved by Naseri and Timarchi [15], who add two nMOS and two pMOS transistors to the circuit, as shown in Fig. 2(b). The latter design not



Fig. 3. Proposed XOR-XNOR circuit.

TABLE I

CHARGING AND DISCHARGING PATHS FOR THE DIFFERENT INPUT COMBINATIONS IN THE PROPOSED XOR-XNOR CIRCUIT

| Inputs AB | XOR path (full swing) | XOR path (partial swing) | XNOR path (full swing) | XNOR path (partial swing) |
|-----------|-----------------------|--------------------------|------------------------|---------------------------|
| 00        | N3                    | P1, P2                   | P4, P5                 | -                         |
| 01        | P2                    | -                        | N1                     | P4, P3                    |
| 10        | P1                    | N4, N3                   | N2                     | -                         |
| 11        | N4, N5                | -                        | P3                     | N1, N2                    |

only resolves the problem of slow response but also reduces the power consumption of the circuit. However, it uses  $\bar{A}$  as an input which needs an extra inverter. A new XOR-XNOR circuit is proposed in this section, which generates the inverted input internally without using any external inverter. This arrangement not only reduces the number of transistors in the circuit but also the number of internal nodes which reduces the overall delay and power consumption of the XOR-XNOR circuit.

The proposed XOR-XNOR circuit using ten transistors (10-T) is shown in Fig. 3. The proposed XOR-XNOR circuit is based on CPL and a cross-coupled structure. It uses two pMOS (P1 and P2) and three nMOS (N3, N4, and N5) transistors at the XOR output side and two nMOS (N1 and N2) and three pMOS (P3, P4, and P5) transistors at the XNOR output side. At the XOR side, P1 and P2 are connected in parallel as PTL, N4 and N5 act as a restorer to provide a full swing output, and N3 acts as a feedback transistor. Similarly, at the XNOR output side, N1 and N2 are connected in parallel as PTL, P4 and P5 act as a restorer to provide a full swing output, and P3 acts as a feedback transistor. This circuit provides full swing XOR-XNOR outputs simultaneously.

For understanding the operation of the proposed design, charging and discharging paths for XOR and XNOR outputs are shown in Table I. It includes all the paths which provide partial swing and full swing at the output nodes. For the input AB: '01,' the transistors P2, N1, and P4 turn on. Transistors P2 and N1 pass logic '1' and logic '0' at XOR and XNOR outputs, respectively, while transistor P4 turns on the transistor P3 to pass the weak logic '0' (-Vthp) at the XNOR output. Similarly, for the input AB: '10,' transistors P1, N2, and N4 turn on.

Transistors P1 and N2 pass logic '1' and logic '0' at XOR and XNOR outputs, respectively, and transistor N4 turns on the transistor N3 which passes the weak logic '1' (VDD-Vthn) at the XOR output. For these inputs (AB: '01' and '10'), the weak logic outputs will not affect the output swing as paths are available for full swing outputs.

For the input AB: '00,' the transistors P1, P2, P4, and P5 turn on. P1 and P2 pass weak logic '0' (-Vthp) at the XOR output, while P5 and P4 pass full logic '1' at XNOR output and the internal node X. Logic '1' at node X turns on the transistor N3 and a strong logic '0' is passed at the XOR output to make it full swing. Similarly, for input AB: '11,' transistors N1, N2, N4, and N5 turn on. The transistors N1 and N2 pass weak logic '1' (VDD-Vthn) for the XNOR output, while the XOR node discharges completely through N4 and N5. Logic '0' also passes to the internal node X, which causes transistor P3 to be turned on and passes the full logic '1' at the XNOR output.
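The path description above can be cross-checked with a short script. The following is a minimal sketch, not part of the original design flow: it encodes the rows of Table I as a dictionary and checks two properties stated in the text. The first is that every input combination has a full swing path to both outputs. The second is that full swing paths use the device type that passes the required level strongly (pMOS for logic '1', nMOS for logic '0'), while partial swing paths use the opposite type. The names `TABLE_I`, `check_full_swing`, and `path_types_consistent` are illustrative assumptions.

```python
# Conduction paths from Table I, keyed by the input pair (A, B).
# Each entry lists the transistors on the full swing and partial swing paths.
TABLE_I = {
    (0, 0): {"xor_full": ["N3"], "xor_partial": ["P1", "P2"],
             "xnor_full": ["P4", "P5"], "xnor_partial": []},
    (0, 1): {"xor_full": ["P2"], "xor_partial": [],
             "xnor_full": ["N1"], "xnor_partial": ["P4", "P3"]},
    (1, 0): {"xor_full": ["P1"], "xor_partial": ["N4", "N3"],
             "xnor_full": ["N2"], "xnor_partial": []},
    (1, 1): {"xor_full": ["N4", "N5"], "xor_partial": [],
             "xnor_full": ["P3"], "xnor_partial": ["N1", "N2"]},
}


def expected_outputs(a, b):
    """Ideal logic values (XOR, XNOR) for inputs a and b."""
    xor = a ^ b
    return xor, 1 - xor


def check_full_swing(table):
    """True if every input pair has a full swing path to both outputs."""
    return all(row["xor_full"] and row["xnor_full"] for row in table.values())


def path_types_consistent(table):
    """True if full swing paths use the strong device type for the output
    level (pMOS for 1, nMOS for 0) and partial swing paths use the weak one."""
    for (a, b), row in table.items():
        xor, xnor = expected_outputs(a, b)
        for out, level in (("xor", xor), ("xnor", xnor)):
            strong = "P" if level == 1 else "N"
            weak = "N" if level == 1 else "P"
            if any(t[0] != strong for t in row[out + "_full"]):
                return False
            if any(t[0] != weak for t in row[out + "_partial"]):
                return False
    return True


if __name__ == "__main__":
    print("full swing path for all inputs:", check_full_swing(TABLE_I))
    print("device types consistent:", path_types_consistent(TABLE_I))
```

This only checks the bookkeeping of Table I; it does not model voltages or timing.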

### A. Transistor Sizing

The proposed XOR-XNOR gate is symmetrical, and the NOT logic has been incorporated into it without using an external NOT gate. Due to this incorporated NOT gate, the capacitance equivalent to one inverter at the input node is removed, which reduces the overall delay. The output nodes of this circuit are charged or discharged through different full swing and partial swing paths for different input combinations. Therefore, the circuit has a different delay for each input combination. The delay for an input combination can be decreased by increasing the sizes of the transistors along its path. However, increasing the size of a transistor increases the capacitive load at the node, which increases the delay for other input combinations. Therefore, the size of a transistor cannot be increased indefinitely. The capacitances at internal node X ( $C_X$ ) and at the XOR output ( $C_{XOR}$ ) can be calculated as given, respectively, in

$$C_X = Cd_{N5} + Cg_{P3} + \frac{Cd_{N4}}{2} + Cd_{P5} + Cg_{N3} + \frac{Cd_{P4}}{2} \quad (1)$$

$$C_{XOR} = Cd_{P1} + Cd_{P2} + \frac{Cd_{N4}}{2} + Cd_{N3} \quad (2)$$

where  $Cd$  and  $Cg$  represent the diffusion and gate capacitances, respectively.
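As a quick worked instance of (1) and (2), assume every diffusion and gate capacitance equals one unit, which is the minimum-size convention this section adopts for the sizing procedure. Under that assumption only, the two node capacitances evaluate to

$$C_X = 1 + 1 + \tfrac{1}{2} + 1 + 1 + \tfrac{1}{2} = 5, \qquad C_{XOR} = 1 + 1 + \tfrac{1}{2} + 1 = 3.5$$

in units of the minimum-size device capacitance.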

The delay of the overall circuit can be minimized by finding the worst case delay paths and increasing the sizes of the transistors along these paths. The delay for an input combination can be calculated by creating an RC delay model for that combination and analyzing it. RC delay models of the proposed XOR-XNOR circuit for different input combinations are shown in Fig. 4. For input AB: '00,' there are two discharging paths, as shown in Fig. 4(a). The discharging path through the two pMOS transistors P1 and P2 is faster, as both resistances are in parallel for this input combination. However, the output will not be discharged fully through this path. The other path, through the transistor N3, will discharge the output fully. However, discharging through this path will start only after the charging of node X. Therefore, for input



Fig. 4. RC delay model of the proposed XOR-XNOR circuit for (a) AB: "00," (b) AB: "01," (c) AB: "10," and (d) AB: "11."

AB: "00," the delay  $T_{\text{XOR}00}$  in terms of  $R$  and  $C$  can be given as in

$$T_{\text{XOR}00} = R_{P5}C_X + (R_{P1} \parallel R_{P2} \parallel R_{N3})C_{\text{XOR}}. \quad (3)$$

In (3),  $R_{N3}$  may be considered as a variable resistor depending upon the charge dumped on node X. For input AB: "01," only one charging path is available through transistor P2, as shown in Fig. 4(b). The delay  $T_{\text{XOR}01}$  for this input combination can be given as

$$T_{\text{XOR}01} = R_{P2}C_{\text{XOR}}. \quad (4)$$

For input AB: "10," the output can be charged through transistor P1; however, a charging path for node X is also available through transistor N4, as shown in Fig. 4(c). The other charging paths through transistors P5 and N3 are also available for this input combination. The delay  $T_{\text{XOR}10}$  for this input can be given as

$$\begin{aligned} T_{\text{XOR}10} = & (R_{P5} \parallel (R_{P1} + R_{N4}))C_X \\ & + (R_{P1} \parallel (R_{P5} + R_{N4}) \parallel R_{N3})C_{\text{XOR}}. \end{aligned} \quad (5)$$

Similarly, the delay  $T_{\text{XOR}11}$  for the input combination AB: "11" can be given as

$$T_{\text{XOR}11} = R_{N5}C_X + (R_{N5} + R_{N4})C_{\text{XOR}}. \quad (6)$$

The overall delay of the circuit is optimized by selecting proper sizes for all the transistors. For a minimum-size transistor, the capacitances of all the transistors and the resistances of the nMOS transistors are taken as 1 unit, whereas the resistances of the pMOS transistors are taken as 2.5 units. Initially, the minimum size is selected for all the transistors, and all the delays are calculated using (1)–(6). Then, the sizes of the transistors responsible for the higher delays are increased, and the impact on all the delays is observed. Increasing the size of a transistor by a factor of  $k$  is modeled as multiplying its capacitance by  $k$  and its resistance by  $1/k$. The sizes of all the transistors for optimal delay are then calculated using a simple MATLAB program and are given in Fig. 3. The transistor sizes can also be optimized for PDP by including the power dissipation equation.
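The sizing procedure can be illustrated with a small script. The following is a minimal Python sketch under stated assumptions, not the authors' MATLAB program: it evaluates (1)–(6) for the XOR side only, treats each transistor's gate and diffusion capacitance as equal to its size, scales resistance as the unit value divided by the size, and searches a coarse grid of integer sizes for the smallest worst case delay. The grid `(1, 2, 3)` and the names `delays` and `best_sizing` are illustrative choices.

```python
from itertools import product

# Unit resistances of minimum-size devices, as stated in the text.
R_UNIT = {"N": 1.0, "P": 2.5}
# Transistors that appear in (1)-(6); N1 and N2 do not enter these equations.
DEVICES = ["P1", "P2", "P3", "P4", "P5", "N3", "N4", "N5"]


def par(*rs):
    """Equivalent resistance of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in rs)


def delays(size):
    """Evaluate (1)-(6) for a dict mapping transistor name -> size factor k."""
    R = {d: R_UNIT[d[0]] / size[d] for d in DEVICES}   # resistance scales as 1/k
    C = {d: 1.0 * size[d] for d in DEVICES}            # assumed: Cd = Cg = k units
    c_x = C["N5"] + C["P3"] + C["N4"] / 2 + C["P5"] + C["N3"] + C["P4"] / 2  # (1)
    c_xor = C["P1"] + C["P2"] + C["N4"] / 2 + C["N3"]                        # (2)
    t00 = R["P5"] * c_x + par(R["P1"], R["P2"], R["N3"]) * c_xor             # (3)
    t01 = R["P2"] * c_xor                                                    # (4)
    t10 = (par(R["P5"], R["P1"] + R["N4"]) * c_x
           + par(R["P1"], R["P5"] + R["N4"], R["N3"]) * c_xor)               # (5)
    t11 = R["N5"] * c_x + (R["N5"] + R["N4"]) * c_xor                        # (6)
    return {"00": t00, "01": t01, "10": t10, "11": t11}


def best_sizing(choices=(1, 2, 3)):
    """Exhaustive search over a coarse size grid for the minimum worst case delay."""
    best = None
    for combo in product(choices, repeat=len(DEVICES)):
        size = dict(zip(DEVICES, combo))
        worst = max(delays(size).values())
        if best is None or worst < best[0]:
            best = (worst, size)
    return best


if __name__ == "__main__":
    unit = {d: 1 for d in DEVICES}
    print("unit-size delays:", delays(unit))
    worst, size = best_sizing()
    print("best worst-case delay on grid:", round(worst, 3), size)
```

With all sizes at the minimum, the model gives a worst case of about 14.4 units for input "00", which is why the search tends to strengthen the devices on that path first. The sizes it returns depend entirely on the assumed grid and capacitance model and should not be read as the values reported in Fig. 3.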

## III. FA CELL

In hybrid logic design style, the FA is designed using an XOR–XNOR circuit, a sum circuit, and a carry circuit, as discussed in Section I. The performance of the FA depends upon all three modules. An efficient XOR–XNOR circuit is proposed in Section II. In this section, different strategies for designing the sum circuit (Module II) and the carry circuit (Module III) are discussed. Four FA designs are then proposed by combining four different sum and carry circuits with the proposed XOR–XNOR circuit.

### A. Module II

Module II (SUM circuit) can be implemented using (7), taking  $C_{\text{IN}}$  and the outputs of the XOR–XNOR circuit as input signals. The most important prerequisite of this module is to provide enough driving power to the gates that follow it.

$$\text{SUM} = (A \oplus B)C'_{\text{IN}} + (A \oplus B)'C_{\text{IN}}. \quad (7)$$

Four different designs of Module II are shown in Fig. 5. The SUM circuit shown in Fig. 5(a) is implemented using TGs as a 2-to-1 multiplexer and requires four transistors. In this circuit, the XOR and XNOR signals are applied to the gates, and  $C_{\text{IN}}$  and  $\bar{C}_{\text{IN}}$  are applied to the sources of the two TGs. This circuit provides low power consumption and high speed with full output swing. However, this circuit has a driving capability problem due to the parasitic capacitance and resistance created during fabrication [15], and it provides the worst performance in cascaded systems.



Fig. 5. Module II circuit using (a) TG [15], (b) TG with inverter design 1 [15], (c) TG with inverter design 2 [15], and (d) CMOS [8].

The problem of driving capability can be overcome by using an inverter at the output, as shown in Fig. 5(b). However, the extra inverter increases both power consumption and delay.

In another type of implementation of SUM circuit, a buffer ( $\overline{C_{IN}}$  followed by an inverter) is used at the input side as shown in Fig. 5(c). By using the buffer at the inputs, the driving capability of the circuit is increased in the cascading systems. The buffer at the input restores the level of degraded output of the previous stage. A CMOS logic style-based SUM circuit [8] is shown in Fig. 5(d). This circuit uses six transistors and provides good driving capability and high robustness.

### B. Module III

The third module of the FA is the carry circuit ( $C_{OUT}$ ). The output carry of the FA can be calculated from the XOR and XNOR outputs of module I and the previous carry  $C_{IN}$  using (8). In cascaded systems, the delay of this module affects the overall delay the most, as the output of this module depends upon the output carry of the previous FA.

$$C_{OUT} = (A \oplus B)'A + (A \oplus B)C_{IN}. \quad (8)$$
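The Boolean forms of Modules II and III can be checked exhaustively against binary addition. The following is a minimal Python sketch, assuming that adjacent terms in (7) and (8) denote AND and that "+" denotes OR; the function names are illustrative and do not correspond to any circuit netlist.

```python
from itertools import product


def module_one(a, b):
    """Module I: complementary XOR and XNOR of the two inputs."""
    xor = a ^ b
    return xor, 1 - xor


def sum_out(xor, xnor, c_in):
    """Module II, following (7): SUM = XOR * C_IN' + XNOR * C_IN."""
    return (xor & (1 - c_in)) | (xnor & c_in)


def carry_out(xor, xnor, a, c_in):
    """Module III, following (8): C_OUT = XNOR * A + XOR * C_IN."""
    return (xnor & a) | (xor & c_in)


def full_adder(a, b, c_in):
    """Hybrid FA built from the three modules; returns (SUM, C_OUT)."""
    xor, xnor = module_one(a, b)
    return sum_out(xor, xnor, c_in), carry_out(xor, xnor, a, c_in)


def matches_arithmetic():
    """True if (SUM, C_OUT) equals a + b + c_in for all eight input patterns."""
    for a, b, c in product((0, 1), repeat=3):
        s, co = full_adder(a, b, c)
        if 2 * co + s != a + b + c:
            return False
    return True


if __name__ == "__main__":
    print("hybrid FA matches binary addition:", matches_arithmetic())
```

Because the XNOR term is active only when A and B are equal, passing B instead of A in `carry_out` gives the same result.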

There are various designs of this module presented by the researchers. The four designs of module III are discussed here. In most of the adders, this module is implemented using the multiplexer, as shown in Fig. 6(a). In this circuit,  $C_{OUT}$  is generated by passing the value of either A (or B) or  $C_{IN}$  at the output based on intermediate signals (XOR and XNOR outputs of module I). For better driving capability, a buffer is added at the output or input side, as shown in Fig. 6(b) and 6(c), respectively. However, it increases power consumption as well as delay of the overall circuit. A CMOS-based module III is



Fig. 6. Module III circuit using (a) TG [15], (b) TG with inverter design 1 [15], (c) TG with inverter design 2 [15], and (d) CMOS [8].

shown in Fig. 6(d). It uses four transistors and consumes less power while providing better delay performance.

### C. Proposed FA Cell

A hybrid-logic FA can be designed by combining three modules (XOR–XNOR circuit as module I, sum circuit as module II, and carry circuit as module III), as discussed in Section I. A novel XOR–XNOR circuit is proposed in Section II, and several module II and module III circuits are discussed in the previous subsections. In this section, four designs of hybrid logic style-based FAs are presented by combining the proposed XOR–XNOR circuit with four different module II and module III circuits, as illustrated in Fig. 7.

Fig. 7(a) shows the hybrid FA cell design-1 (FA1), designed using the proposed XOR–XNOR circuit as module I and TG-based module II and module III circuits. It consists of 20 transistors. This circuit gives full output swing and high robustness; however, it has driving capability problems in cascaded stages such as a ripple carry adder. Design-2 of the FA cell (FA2) is implemented using 26 transistors, as shown in Fig. 7(b). In this design, the driving capability problem is reduced by using buffers at the outputs of module II and module III. However, the buffers increase the power consumption and delay of the circuit. Design-3 of the FA cell (FA3), in which module II and module III use an inverter on the input side, is shown in Fig. 7(c). This structure is also implemented using 26 transistors.

In the design-4 of FA cell (FA4), module II and module III are implemented using the CMOS logic style [8], as shown in Fig. 7(d). This circuit gives the best performance in terms of PDP among all the FA cells. This FA uses 20 transistors, for generating the sum and carry ( $C_{OUT}$ ). Module II of this FA cell is realized using the design shown in Fig. 5(d). In this circuit, a pMOS and an nMOS (P8 and N8) are gated with XOR and XNOR signals generated by Module I. Source and



Fig. 7. Proposed FA cell. (a) Design-1. (b) Design-2. (c) Design-3. (d) Design-4.



Fig. 8. Test bench for the simulation of the circuits.

drain terminals of these transistors (P8 and N8) are connected with  $C_{IN}$  and the output node (SUM), respectively. When XOR is at logic “0” and XNOR is at logic “1,” transistors P8 and N8 are in the “ON” state and the output is connected to  $C_{IN}$ . On the other hand, when XOR and XNOR are at logic “1” and logic “0,” respectively, the output is connected to ground through N6 and N7 for a logic “1” at  $C_{IN}$ . A logic “0” at  $C_{IN}$  in this condition connects the sum output to  $V_{DD}$  through P6 and P7. Module III of this FA cell is realized using the design shown in Fig. 6(d). In this circuit,  $C_{IN}$  is passed through the TG when the XOR input is at logic “1.” When the XNOR input is at logic “1,” both inputs are the same, so either can be passed to the output. Instead of passing A through both transistors of the TG, A and B are passed through different transistors so that the load on the inputs is distributed.

## IV. SIMULATION RESULTS AND DISCUSSION

### A. Simulation Environment

All the circuits are simulated using Cadence Virtuoso in a 90-nm generic process design kit (GPDK) CMOS process technology. The supply voltage and maximum operating frequency are taken as 1.2 V and 1 GHz, respectively. To estimate the power and delay of the circuits in a realistic environment, the test bench shown in Fig. 8 is used. It passes the inputs to the circuits through two inverters and uses a capacitive load equivalent to a fan-out of four inverters (FO4) at the output. The size of these inverters is chosen such that there is sufficient distortion in the input signals [3].

### B. Simulation Results

1) **XOR-XNOR Circuit:** The XOR–XNOR circuit requires only two inputs, A and B, which are applied as piecewise linear (pwl) input signals [3], [16]. Fig. 9 shows the applied input pattern and the corresponding XOR–XNOR



Fig. 9. Input–output waveforms for the proposed XOR–XNOR circuit.

TABLE II  
PERFORMANCE COMPARISON OF XOR–XNOR CIRCUITS IN TERMS OF POWER DISSIPATION, DELAY, AND PDP AT THE POWER SUPPLY OF 1.2 V AND OPERATING FREQUENCY OF 1 GHz

| XOR-XNOR Circuits    | No. of Transistors | XOR Delay (ps) | XNOR Delay (ps) | Power ( $\mu$ W) | PDP ( $10^{-18}$ J) |
|----------------------|--------------------|----------------|-----------------|------------------|---------------------|
| Aguirre [6]          | 12                 | 59.6           | 58.9            | 8.01             | 479.79              |
| Goel [3]             | 8                  | 140.9          | 141.8           | 10.37            | 1470.46             |
| Radhakrishnan [13]   | 6                  | 106            | 96.5            | 10.9             | 1144.4              |
| Chang [4]            | 10                 | 98.8           | 70.8            | 10.8             | 1057.16             |
| Valashani [14], [17] | 10                 | 73.7           | 52.2            | 10.7             | 788.59              |
| Naseri [15]          | 12                 | 52.9           | 49              | 8.98             | 475.04              |
| Proposed             | 10                 | 48.3           | 49.4            | 8.89             | 439.16              |

outputs. After passing through the two inverters in the test bench, the inputs (A1 and B1) have glitches because of the current feed-through effect. Due to this effect, when the gate is turned off, the charge under the gate moves toward either the drain or the source side, and a distorted output results. However, these glitches do not affect the XOR–XNOR outputs, and full swing is obtained at the outputs, as shown in Fig. 9. Also, the XOR–XNOR outputs are obtained simultaneously, which is a prime requirement of hybrid FA design.

The performance of the proposed XOR–XNOR circuit is compared with the existing circuits in terms of worst case delay, power consumption, and PDP in Table II. The delay is calculated from the 50% voltage level at the input to the 50% voltage level at the output for all the rise and fall transitions, and the worst case delay is selected. For the calculation of



Fig. 10. Input–output waveforms for the proposed FA design-4 circuit.

TABLE III

PERFORMANCE COMPARISON OF DIFFERENT FA CIRCUITS IN TERMS OF DELAY, POWER DISSIPATION, AND PDP AT THE POWER SUPPLY OF 1.2 V AND OPERATING FREQUENCY OF 1 GHz

| Full Adder Circuits | Number of Transistors | Delay (ps) | Power ( $\mu$ W) | PDP ( $10^{-18}$ J) |
|---------------------|-----------------------|------------|------------------|--------------------|
| Zavarei [18]        | 26                    | 72.3       | 27               | 1952.1             |
| Valashani [17]      | 18                    | 71.3       | 25               | 1782.5             |
| Bhattacharyya [12]  | 16                    | 65.59      | 22.6             | 1482.33            |
| Naseri [15]         | 22                    | 55.8       | 26.7             | 1489.86            |
| FA Design-1         | 20                    | 52.3       | 27.9             | 1459.17            |
| FA Design-2         | 26                    | 71.8       | 30.4             | 2182.72            |
| FA Design-3         | 26                    | 74.1       | 29.7             | 2200.77            |
| FA Design-4         | 20                    | 41.5       | 25.8             | 1070.7             |

PDP, the worse of the XOR and XNOR delays is taken. The XOR–XNOR circuit presented in [13] uses the least number of transistors. However, it has a high worst case delay. In [6], two XOR–XNOR circuits are presented, of which the design with the lower PDP is included in the table for comparison. The proposed XOR–XNOR circuit has the lowest worst case delay among all the designs. In terms of power consumption, the design presented in [6] gives the best performance. However, the PDP of the proposed circuit is the lowest among all the circuits. The proposed XOR–XNOR circuit improves delay and PDP by up to 65.16% and 70.13%, respectively, relative to the other designs.
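The PDP and improvement figures quoted above can be reproduced from the delay and power columns of Table II. The following is a minimal Python sketch that recomputes PDP as worst case delay times power rather than reusing the tabulated PDP column; the names `TABLE_II`, `pdp`, and `improvement` are illustrative.

```python
# (design, XOR delay in ps, XNOR delay in ps, power in uW), copied from Table II.
TABLE_II = [
    ("Aguirre [6]", 59.6, 58.9, 8.01),
    ("Goel [3]", 140.9, 141.8, 10.37),
    ("Radhakrishnan [13]", 106.0, 96.5, 10.9),
    ("Chang [4]", 98.8, 70.8, 10.8),
    ("Valashani [14], [17]", 73.7, 52.2, 10.7),
    ("Naseri [15]", 52.9, 49.0, 8.98),
    ("Proposed", 48.3, 49.4, 8.89),
]


def pdp(xor_ps, xnor_ps, power_uw):
    """Worst case delay (ps) times power (uW); the product is in units of 1e-18 J."""
    return max(xor_ps, xnor_ps) * power_uw


def improvement(reference, proposed):
    """Percentage reduction of `proposed` relative to `reference`."""
    return 100.0 * (reference - proposed) / reference


if __name__ == "__main__":
    rows = {name: (max(x, xn), pdp(x, xn, p)) for name, x, xn, p in TABLE_II}
    prop_delay, prop_pdp = rows["Proposed"]
    for name, (delay, value) in rows.items():
        if name == "Proposed":
            continue
        print(f"{name}: delay improvement {improvement(delay, prop_delay):.2f}%, "
              f"PDP improvement {improvement(value, prop_pdp):.2f}%")
```

The largest improvements come out against the design of Goel *et al.* [3] and the smallest PDP improvement against that of Naseri and Timarchi [15], matching the 65.16%, 70.13%, and "at least 7.5%" figures stated in the text.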

2) FA: An input wave pattern of the three inputs A, B, and C, as shown in Fig. 10, is applied to the inputs of the proposed FA (design-4) through the test bench. A1, B1, and C1 are the outputs of the inverter chain used in the test bench. The circuit provides full swing outputs with small glitches, as shown in Fig. 10.

A comparison of the proposed FA designs with other FA designs in terms of the number of transistors, delay, power consumption, and PDP is given in Table III. For the calculation of the delay, the  $C_{IN}$  to  $C_{OUT}$  delay is considered, as this delay is crucial for most high-level designs. In [15], six designs of FAs are presented. Among these designs, the FA design with the lowest PDP is included in the table. The proposed FA design-4 provides the best delay performance among all the FA designs. The delay of the proposed FA design-2 and design-3 is slightly higher in the single cell design due to the use of buffers at the outputs and inputs, respectively. However, they provide good performance in




Fig. 11. (a) Delay, (b) power, and (c) PDP comparison of different FAs for supply voltages 0.6–1.8 V.

the FA chain. In terms of power consumption, the FA design given by Bhattacharyya *et al.* [12] gives the best performance. However, the proposed FA design-4 has the least PDP among all the designs.

The robustness of the proposed FA cells against power supply voltage variation is also tested and compared with the existing architectures by changing the supply voltage from 0.6 to 1.5 V, as shown in Fig. 11. The delay of the proposed FA cells is compared with that of existing FAs in Fig. 11(a). The proposed FA design-1 and FA design-4, along with the FA presented by Naseri and Timarchi [15], show the best performance at different supply voltages. Fig. 11(b) shows the power consumption of different FA cells at different supply voltages. In terms of power consumption, the FA cell presented by Bhattacharyya *et al.* [12] shows the best performance among all the FA cells. However, in terms of PDP, the proposed FA design-1 and FA design-4 show the best performance at all the supply voltages, as shown in Fig. 11(c).

Fig. 12. (a) Delay, (b) power, and (c) PDP comparison of different FAs under process corner variations.

Fig. 13. Block diagram of the n-bit CFA.

The performance of the FA cells is also tested at different process corners by simulating the circuits with different process corner files. The process corners considered are slow–slow (SS), slow–fast (SF), nominal–nominal (NN), fast–slow (FS), and fast–fast (FF). The comparison of worst-case delay, power, and PDP at different process corners is shown in Fig. 12. The proposed FA design-4 shows the best delay performance, while the FA presented by Bhattacharyya *et al.* [12] consumes the least power among all the FA circuits at all the process corners. The PDP is found to be least for the proposed FA design-4 at all the process corners other than FS. At the FS corner, the PDP values for the proposed FA design-4 and the FA cells presented by Bhattacharyya *et al.* [12] and Naseri and Timarchi [15] are almost the same.

Fig. 14. (a) Delay, (b) power, and (c) PDP comparison of 2-, 4-, and 8-bit CFAs.

Fig. 15. Layouts of the proposed (a) XOR–XNOR circuit and (b) FA design-4 cell.

3) *Cascaded FA*: In most of the applications, the FA cells are used in the cascaded form. In this form, the driving capability of the FA plays a vital role. The performance of the proposed FA circuits is also investigated in a cascading structure. Fig. 13 shows the block diagram of the n-bit cascaded full adder (CFA) structure.
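The cascaded structure can be illustrated with a small behavioral model in which the carry output of each FA stage feeds the carry input of the next. This is only a logic-level sketch: it assumes the common hybrid decomposition in which an XOR–XNOR stage produces H = A ⊕ B and its complement, the sum is H ⊕ C<sub>IN</sub>, and the carry is selected between C<sub>IN</sub> and A by H. It says nothing about transistor-level delay, power, or driving capability, and the function names are illustrative rather than taken from the article.

```python
def hybrid_full_adder(a, b, c_in):
    """One-bit FA built around simultaneous XOR and XNOR signals."""
    h = a ^ b          # XOR output of the XOR-XNOR stage
    h_bar = 1 - h      # XNOR output (complement of h)
    # Sum module: pass c_in when h = 0, pass its complement when h = 1.
    s = (h_bar & c_in) | (h & (1 - c_in))
    # Carry module: select c_in when h = 1, otherwise select a.
    c_out = (h & c_in) | (h_bar & a)
    return s, c_out


def cascaded_full_adder(a_bits, b_bits, c_in=0):
    """n-bit CFA: bit lists are LSB first; the carry ripples stage to stage."""
    sums = []
    carry = c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = hybrid_full_adder(a, b, carry)
        sums.append(s)
    return sums, carry


def to_bits(value, n):
    """Integer to an LSB-first list of n bits."""
    return [(value >> i) & 1 for i in range(n)]


def from_bits(bits):
    """LSB-first bit list back to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))


# Usage: add two 8-bit operands with the cascaded structure.
n = 8
sum_bits, carry_out = cascaded_full_adder(to_bits(200, n), to_bits(100, n))
print(from_bits(sum_bits), carry_out)
```

Because the carry of stage *i* is needed before stage *i* + 1 can settle, the C<sub>IN</sub> to C<sub>OUT</sub> path grows with the number of bits, which is why the single-cell ranking of the FAs need not carry over to the 8-bit chain.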

The comparison results in terms of delay, power, and PDP for 2-, 4-, and 8-bit CFA structures of different FAs are shown in Fig. 14. The results show that, as the number of bits changes, different adders show irregular delay and power consumption patterns. Some of the adders show better performance in 2- and 4-bit CFAs but average performance in the 8-bit CFA. Similarly, some of the adders show average performance in 2- and 4-bit CFAs but good performance in the 8-bit CFA. Fig. 14(a) shows the comparison results of the worst-case $C_{IN}$ to $C_{OUT}$ delay for 2-, 4-, and 8-bit CFAs. In 2- and 4-bit CFAs, the proposed design-1 and design-4, along with the design presented by Valashani *et al.* [17], show the best performance. However, for the 8-bit CFA, the proposed design-2 and design-3 start dominating over the proposed design-1 and design-4, while the design presented by Valashani *et al.* [17] shows the best performance.

The power consumption of different FA circuits for 2-, 4-, and 8-bit CFAs is shown in Fig. 14(b). For 2- and 4-bit CFAs, the proposed design-4, along with the design presented by Bhattacharyya *et al.* [12], shows the best performance. However, for the 8-bit CFA, the proposed design-3, along with the design presented by Zavarei *et al.* [18], dominates all other adders. The PDP comparison of different circuits is shown in Fig. 14(c). In 2- and 4-bit CFAs, the proposed design-4 has the least PDP. In the 8-bit CFA, the proposed design-3 has the least PDP among all the circuits. From the above discussion, it can be concluded that for the lower bit adders, the proposed design-1 and design-4 show better performance, whereas for the higher bit adders, the proposed design-2 and design-3 show better performance among all the designs.

4) *Layouts*: The proposed XOR–XNOR module is implemented with ten transistors and has a symmetrical structure, which makes its layout less complex. The layout of the proposed XOR–XNOR circuit is shown in Fig. 15(a). Among the proposed FA cells, design-4 shows the best performance among all the FAs; therefore, only the layout of the proposed FA design-4 is designed, and it is shown in Fig. 15(b).

## V. CONCLUSION

In this article, a new 10-T XOR–XNOR circuit was proposed that provides full swing outputs simultaneously. Using the proposed XOR–XNOR circuit, four new FA cells based on the hybrid logic design style were also proposed. The performance of the proposed XOR–XNOR circuit and the FA cells was tested by simulating them in the Cadence Virtuoso environment using GPDK 90-nm CMOS technology. The proposed XOR–XNOR circuit showed a reduction in delay and PDP of up to 65.16% and 70.13%, respectively, compared with other designs. The proposed FA design-4 showed a 2%–28.13% improvement in PDP over the available FAs. The performance of the proposed FA cells was also tested in cascaded connections. For 2-bit and 4-bit cascading chains, the proposed FA design-4 showed the best performance, while for the 8-bit cascading chain, the proposed FA design-3 showed the best performance among the available FA cells.

## REFERENCES

- [1] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-power CMOS digital design," *IEICE Trans. Electron.*, vol. 75, no. 4, pp. 371–382, 1992.

- [2] R. Zimmermann and W. Fichtner, "Low-power logic styles: CMOS versus pass-transistor logic," *IEEE J. Solid-State Circuits*, vol. 32, no. 7, pp. 1079–1090, Jul. 1997.
- [3] S. Goel, A. Kumar, and M. A. Bayoumi, "Design of robust, energy-efficient full adders for deep-submicrometer design using hybrid-CMOS logic style," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 14, no. 12, pp. 1309–1321, Dec. 2006.
- [4] C.-H. Chang, J. Gu, and M. Zhang, "A review of 0.18- $\mu$ m full adder performances for tree structured arithmetic circuits," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 13, no. 6, pp. 686–695, Jun. 2005.
- [5] N. Zhuang and H. Wu, "A new design of the CMOS full adder," *IEEE J. Solid-State Circuits*, vol. 27, no. 5, pp. 840–844, May 1992.
- [6] M. Aguirre-Hernandez and M. Linares-Aranda, "CMOS full-adders for energy-efficient arithmetic applications," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 19, no. 4, pp. 718–721, Apr. 2011.
- [7] V. Foroutan, M. Taheri, K. Navi, and A. A. Mazreah, "Design of two low-power full adder cells using GDI structure and hybrid CMOS logic style," *Integration*, vol. 47, no. 1, pp. 48–61, Jan. 2014.
- [8] M. Agarwal, N. Agrawal, and M. A. Alam, "A new design of low power high speed hybrid CMOS full adder," in *Proc. Int. Conf. Signal Process. Integr. Netw. (SPIN)*, Feb. 2014, pp. 448–452.
- [9] M. Vesterbacka, "A 14-transistor CMOS full adder with full voltage-swing nodes," in *Proc. IEEE Workshop Signal Process. Systems. Design Implement. (SiPS)*, Oct. 1999, pp. 713–722.
- [10] M. Zhang, J. Gu, and C.-H. Chang, "A novel hybrid pass logic with static CMOS output drive full-adder cell," in *Proc. Int. Symp. Circuits Syst. (ISCAS)*, vol. 5, May 2003, p. 5.
- [11] C.-K. Tung, S.-H. Shieh, and C.-H. Cheng, "Low-power high-speed full adder for portable electronic applications," *Electron. Lett.*, vol. 49, no. 17, pp. 1063–1064, Aug. 2013.
- [12] P. Bhattacharyya, B. Kundu, S. Ghosh, V. Kumar, and A. Dandapat, "Performance analysis of a low-power high-speed hybrid 1-bit full adder circuit," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 23, no. 10, pp. 2001–2008, Oct. 2015.
- [13] D. Radhakrishnan, "Low-voltage low-power CMOS full adder," *IEE Proc.-Circuits, Devices Syst.*, vol. 148, no. 1, pp. 19–24, Feb. 2001.
- [14] M. A. Valashani and S. Mirzakuchaki, "A novel fast, low-power and high-performance XOR-XNOR cell," in *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, May 2016, pp. 694–697.
- [15] H. Naseri and S. Timarchi, "Low-power and fast full adder by exploring new XOR and XNOR gates," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 26, no. 8, pp. 1481–1493, Aug. 2018.
- [16] H. Tien Bui, Y. Wang, and Y. Jiang, "Design and analysis of low-power 10-transistor full adders using novel XOR-XNOR gates," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 49, no. 1, pp. 25–30, 2002.
- [17] M. Amini-Valashani, M. Ayat, and S. Mirzakuchaki, "Design and analysis of a novel low-power and energy-efficient 18T hybrid full adder," *Microelectron. J.*, vol. 74, pp. 49–59, Apr. 2018.
- [18] M. J. Zavarei, M. R. Baghbanmanesh, E. Kargaran, H. Nabovati, and A. Golmakani, "Design of new full adder cell using hybrid-CMOS logic style," in *Proc. 18th IEEE Int. Conf. Electron., Circuits, Syst.*, Dec. 2011, pp. 451–454.



**Jyoti Kandpal** received the B.Tech. degree in electronics and communication engineering from Bipin Chandra Tripathi, Kumaon Engineering College, Dwarahat, India, in 2012, and the M.Tech. degree in VLSI design from the Department of Electronics and Communication Engineering, Uttarakhand Technical University, Dehradun, India, in 2015. She is currently pursuing the Ph.D. degree with the College of Technology, G. B. Pant University of Agriculture and Technology, Pantnagar, India.

Her research interests include VLSI design and high-performance digital circuit design.



**Abhishek Tomar** (Member, IEEE) was born in Haridwar, India. He received the B.Sc. (Engg.) degree in electrical engineering from Dayalbagh Educational Institute, Agra, India, in 2000, the M.Tech. degree in integrated electronic circuits from IIT Delhi, Delhi, India, and the Ph.D. degree in electronics from Kyushu University, Fukuoka, Japan under Monbu-Kagakusho Scholarship of the Japanese Government in 2010.

He is currently engaged in the study and design of high-performance digital circuits and RF CMOS system large scale integration (LSI), as an Associate Professor with the Department of Electronics and Communication Engineering, G. B. Pant University of Agriculture and Technology, Pantnagar, India. He has authored or coauthored more than 40 publications in different journals and conference proceedings.



**Mayur Agarwal** (Member, IEEE) received the B.Tech. degree in electronics and communication engineering from the Dr. K. N. Modi Institute of Engineering and Technology, Modinagar, India, in 2007, under Uttar Pradesh Technical University, Lucknow, India, the M.Tech. degree in digital systems from the Department of Electronics and Communication Engineering, Motilal Nehru National Institute of Technology, Allahabad, India, in 2009, and the Ph.D. degree from IIT Kharagpur, Kharagpur, India, in 2018.

He joined the Department of Electronics and Communication Engineering, College of Technology, G. B. Pant University of Agriculture and Technology (GBPUAT), Pantnagar, as an Assistant Professor in August 2018, under the Technical Education Quality Improvement Programme-III (TEQIP-III). His current research interests include digital VLSI design, ultrasound beamforming, and digital signal processing.



**K. K. Sharma** received the B.Sc. (Engg.) degree in electrical engineering and the M.Sc. (Engg.) degree in instrumentation and control from Aligarh Muslim University, Aligarh, India, in 1987 and 1990, and the Ph.D. degree in electronics and communication engineering from the Govind Ballabh Pant University of Agriculture and Technology, Pantnagar, India, in 2009.

He is currently a Professor of Engineering with the Govind Ballabh Pant University of Agriculture and Technology. His research interests include low power analog and digital circuit design as well as high-speed low-power VLSI circuit design.