

# Carry Look Ahead Adder

Dhimant Bhuva  
 Electronic and Communication  
 Engineering  
 IIIT Hyderabad  
 Hyderabad, India  
 dhimant.bhuva@students.iit.ac.in

**Abstract**—The Carry Look-Ahead (CLA) Adder is designed and implemented using 180nm CMOS technology, reducing propagation delay through hierarchical blocks. Validated using NGSPICE simulations and MAGIC layouts, the design is verified post-layout. A Verilog-based structural description enables FPGA implementation for hardware validation. Key performance metrics, including delay, maximum frequency, and power efficiency, demonstrate the CLA adder's high-speed and reliable operation.

**Keywords**—*CLA, NGSPICE, MAGIC, Delay, Verilog, FPGA*

## I. INTRODUCTION

Carry Look Ahead Adder (*CLA*) are improved version of Ripple Carry Adder (*RPA*). The primary disadvantage of RPA is carry propagation delay. Input bits are  $A_i$ ,  $B_i$  and  $C_0$  ( $i = 0 - 3$ ). This problem is solved in CLA by pre-computing carries in parallel rather than waiting for “ripple”. Sequential dependency is removed by computing propagate( $P_i$ ) and generate( $G_i$ ) for each bit.

$$P_i = A_i \wedge B_i$$

$$G_i = A_i \cdot B_i$$

$$C_{i+1} = G_i + P_i \cdot C_i$$

Carry are computed in parallel as

$$C_1 = G_0 + P_0 \cdot C_0$$

$$C_2 = G_1 + P_1 \cdot G_0 + P_1 \cdot P_0 \cdot C_0$$

$$C_3 = G_2 + P_2 \cdot G_1 + P_2 \cdot P_1 \cdot G_0 + P_2 \cdot P_1 \cdot P_0 \cdot C_0$$

$$C_4 = G_3 + P_3 \cdot G_2 + P_3 \cdot P_2 \cdot G_1 + P_3 \cdot P_2 \cdot P_1 \cdot G_0 + P_3 \cdot P_2 \cdot P_1 \cdot P_0 \cdot C_0$$

Sum bits are calculated as

$$S_i = P_i \wedge C_i .$$

Also, positive edge triggered flip-flops are used to provide input and obtain output. If input bits are available before the rising edge of the clock, then output will be computed and present at the next rising edge of the clock.

## II. PROPOSED STRUCTURE

For implementing CLA we need following logic gates :

- NOT
- OR – 2,3,4 and 5 input.
- AND – 2,3,4 and 5 input.
- XOR – 2 input.

All logic gates have been designed using Complementary CMOS logic style. D flip-flops are designed using True-Single-Phase-Clock (TSPC) topology.

Firstly, all inputs – bits  $a_0$  through  $a_3$ ,  $b_0$  through  $b_3$  and

$c_0$  – are given as input to D flip-flop. Outputs of these flip-flops are then used to compute resulting sum bits,  $s_0$  through  $s_3$ , and carry bit,  $c_4$ , which itself are input to next stage of flip-flops. Overall design looks like 3 stage design , where 1<sup>st</sup> and 3<sup>rd</sup> stage are flip-flops and 2<sup>nd</sup> stage is logical block implementing CLA. All the gates have  $V_{DD}$  level of 1.8V.



Figure 1 : 3 stage design overview

### III. TOPOLOGY AND MOSFET SIZING

Minimum feature size is lambda,  $\lambda = 0.09\mu\text{m}$ . Minimum size not gate follows  $W_P/ W_N = 20 \lambda/10 \lambda$ .

#### A. NOT gate



Figure 2 : NOT gate layout

- Width of NMOS,  $W_N = 900\text{nm}$
- Width of PMOS ,  $W_P = 1800\text{nm}$

#### B. OR gates

I have implemented OR gates by cascading NOR gate with inverter. Same NOT gate has implemented as stated above. Sizing for NOR gate is provided below :

- 2 input
  - Width of NMOS,  $W_N = 900\text{nm}$
  - Width of PMOS ,  $W_P = 3600\text{nm}$
- 3 input
  - Width of NMOS,  $W_N = 900\text{nm}$
  - Width of PMOS ,  $W_P = 5400\text{nm}$
- 4 input
  - Width of NMOS,  $W_N = 900\text{nm}$
  - Width of PMOS ,  $W_P = 7200\text{nm}$
- 5 input
  - Width of NMOS,  $W_N = 900\text{nm}$
  - Width of PMOS ,  $W_P = 9000\text{nm}$

#### C. AND gates

Similarly, as OR gates , I have implemented AND gates by cascading NAND gate with inverter. Same NOT gate has implemented as stated above. Sizing for NAND gate is provided below :

- 2 input
  - Width of NMOS,  $W_N = 1800\text{nm}$
  - Width of PMOS ,  $W_P = 1800\text{nm}$
- 3 input
  - Width of NMOS,  $W_N = 2700\text{nm}$
  - Width of PMOS ,  $W_P = 1800\text{nm}$
- 4 input
  - Width of NMOS,  $W_N = 3600\text{nm}$
  - Width of PMOS ,  $W_P = 1800\text{nm}$
- 5 input
  - Width of NMOS,  $W_N = 4500\text{nm}$
  - Width of PMOS ,  $W_P = 1800\text{nm}$

#### D. XOR gate

- Width of NMOS,  $W_N = 1800\text{nm}$
- Width of PMOS ,  $W_P = 3600\text{nm}$

#### E. D flip-flop

- Width of NMOS,  $W_N = 900\text{nm}$
- Width of PMOS ,  $W_P = 1800\text{nm}$

### IV. NGSPICE SIMULATION

Using TSMC\_180nm.txt technology file, I have implemented complete CLA circuit.

#### A. NOT gate

```
** NOT GATE **
.subckt inv y x vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
```

```

M1 y x vdd vdd CMOSP W={width_P}
L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P}
+AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}
M2 y x gnd gnd CMOSN W={width_N}
L={2*LAMBDA}
+ AS={5*width_N*LAMBDA}
PS={10*LAMBDA+2*width_N}
+AD={5*width_N*LAMBDA}
PD={10*LAMBDA+2*width_N}
.ends inv

```



Figure 3 : NOT functionality

## B. OR gates

- 2 input

```

.subckt or2 a b out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}

* 2input NOR gate Wn = width_N Wp = 2*width_P *
M1 x a vdd vdd CMOSP W={2*width_P} L={2*LAMBDA}
+ AS={5*2*width_P*LAMBDA} PS={10*LAMBDA+2*2*width_P}
AD={5*2*width_P*LAMBDA} PD={10*LAMBDA+2*2*width_P}
M2 nout b vdd CMOSP W={2*width_P} L={2*LAMBDA}
+ AS={5*2*width_P*LAMBDA} PS={10*LAMBDA+2*2*width_P}
AD={5*2*width_P*LAMBDA} PD={10*LAMBDA+2*2*width_P}

M3 nout a gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M4 nout b gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}

*not gate*
M5 out nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M6 out nout gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
.ends or2

```

Figure 4 : 2 input or gate netlist



Figure 5 : 2input or gate functionality

- 3 input

```

.subckt or3 a b c out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}

* 3input NOR gate Wn = width_N Wp = 3*width_P *
M1 x a vdd vdd CMOSP W={3*width_P} L={2*LAMBDA}
+ AS={5*3*width_P*LAMBDA}
PS={10*LAMBDA+2*3*width_P}
AD={5*3*width_P*LAMBDA}
PD={10*LAMBDA+2*3*width_P}
M2 y b x vdd CMOSP W={3*width_P} L={2*LAMBDA}
+ AS={5*3*width_P*LAMBDA}
PS={10*LAMBDA+2*3*width_P}
AD={5*3*width_P*LAMBDA}
PD={10*LAMBDA+2*3*width_P}
M3 nout c y vdd CMOSP W={3*width_P} L={2*LAMBDA}
+ AS={5*3*width_P*LAMBDA}
PS={10*LAMBDA+2*3*width_P}
AD={5*3*width_P*LAMBDA}
PD={10*LAMBDA+2*3*width_P}

M4 nout a gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M5 nout b gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M6 nout c gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}

*not gate*
M7 out nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M8 out nout gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
.ends or3

```

Figure 6 : 3 input or gate netlist



• Figure 7 : 3 input or gate functionality

- 4 input



Figure 9 : 4 input or gate functionality

- 5 input

```
.subckt or4 a b c d e out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
* 4input NOR gate Wn = width_N Wp = 4*width_P *
M1 x a vdd vdd CMOSP W={4*width_P} L={2*LAMBDA}
+ AS={5*4*width_P*LAMBDA}
PS={10*LAMBDA+2*4*width_P} AD={5*4*width_P*LAMBDA}
PD={10*LAMBDA+2*4*width_P}
M2 y b vdd vdd CMOSP W={4*width_P} L={2*LAMBDA}
+ AS={5*4*width_P*LAMBDA}
PS={10*LAMBDA+2*4*width_P} AD={5*4*width_P*LAMBDA}
PD={10*LAMBDA+2*4*width_P}
M3 z c y vdd CMOSP W={4*width_P} L={2*LAMBDA}
+ AS={5*4*width_P*LAMBDA}
PS={10*LAMBDA+2*4*width_P} AD={5*4*width_P*LAMBDA}
PD={10*LAMBDA+2*4*width_P}
M4 nout d z vdd CMOSP W={4*width_P} L={2*LAMBDA}
+ AS={5*4*width_P*LAMBDA}
PS={10*LAMBDA+2*4*width_P} AD={5*4*width_P*LAMBDA}
PD={10*LAMBDA+2*4*width_P}

M5 nout a gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M6 nout b gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M7 nout c gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M8 nout d gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}

*not gate*
M9 nout nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M10 out nout gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
.ends or4
```

```
.subckt or5 a b c d e out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
* 5input NOR gate Wn = width_N Wp = 5*width_P *
M1 w a vdd vdd CMOSP W={5*width_P} L={2*LAMBDA}
+ AS={5*5*width_P*LAMBDA} PS={10*LAMBDA+2*5*width_P}
AD={5*5*width_P*LAMBDA} PD={10*LAMBDA+2*5*width_P}
M2 y b w vdd vdd CMOSP W={5*width_P} L={2*LAMBDA}
+ AS={5*5*width_P*LAMBDA} PS={10*LAMBDA+2*5*width_P}
AD={5*5*width_P*LAMBDA} PD={10*LAMBDA+2*5*width_P}
M3 z c x vdd CMOSP W={5*width_P} L={2*LAMBDA}
+ AS={5*5*width_P*LAMBDA} PS={10*LAMBDA+2*5*width_P}
AD={5*5*width_P*LAMBDA} PD={10*LAMBDA+2*5*width_P}
M4 z d y vdd CMOSP W={5*width_P} L={2*LAMBDA}
+ AS={5*5*width_P*LAMBDA} PS={10*LAMBDA+2*5*width_P}
AD={5*5*width_P*LAMBDA} PD={10*LAMBDA+2*5*width_P}
M5 nout e z vdd CMOSP W={5*width_P} L={2*LAMBDA}
+ AS={5*5*width_P*LAMBDA} PS={10*LAMBDA+2*5*width_P}
AD={5*5*width_P*LAMBDA} PD={10*LAMBDA+2*5*width_P}

M6 nout a gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M7 nout b gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M8 nout c gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M9 nout d gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M10 nout e gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}

*not gate*
M11 out nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M12 out nout gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
.ends or5
```

Figure 10 : 5 input or gate netlist

Figure 8 : 4 input or gate netlist



Figure 11 : 5 input or gate functionality

### C. AND gates

- 2 input

```
** 2 input AND GATE **
.subckt and2 a b out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
* 2input NAND gate Wn = 2*width_N Wp = width_P*
M1 nout a vdd vdd CMOSP W={width_P}
L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}
M2 nout b vdd vdd CMOSP W={width_P}
L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}

M3 nout a x gnd CMOSN W={2*width_N}
L={2*LAMBDA}
+ AS={5*2*width_N*LAMBDA}
PS={10*LAMBDA+2*2*width_N}
AD={5*2*width_N*LAMBDA}
PD={10*LAMBDA+2*2*width_N}
M4 x b gnd gnd CMOSN W={2*width_N}
L={2*LAMBDA}
+ AS={5*2*width_N*LAMBDA}
PS={10*LAMBDA+2*2*width_N}
AD={5*2*width_N*LAMBDA}
PD={10*LAMBDA+2*2*width_N}

*not gate*
M5 out nout vdd vdd CMOSP W={width_P}
L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}
M6 out nout gnd gnd CMOSN W={width_N}
L={2*LAMBDA}
+ AS={5*width_N*LAMBDA}
PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA}
PD={10*LAMBDA+2*width_N}
.ends and2
```

Figure 12 : 2 input and gate netlist



Figure 13 : 2 input and gate functionality

- 3 input

```
** 3 input AND GTE **
.subckt and3 a b c out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
* 3input NAND gate Wn = 3*width_N Wp = width_P*
M1 nout a vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P} AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}
M2 nout b vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P} AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}
M3 nout c vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P} AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}

M4 nout a x gnd CMOSN W={3*width_N} L={2*LAMBDA}
+ AS={5*3*width_N*LAMBDA}
PS={10*LAMBDA+2*3*width_N}
AD={5*3*width_N*LAMBDA}
PD={10*LAMBDA+2*3*width_N}
M5 x b y gnd CMOSN W={3*width_N} L={2*LAMBDA}
+ AS={5*3*width_N*LAMBDA}
PS={10*LAMBDA+2*3*width_N}
AD={5*3*width_N*LAMBDA}
PD={10*LAMBDA+2*3*width_N}
M6 y c gnd gnd CMOSN W={3*width_N} L={2*LAMBDA}
+ AS={5*3*width_N*LAMBDA}
PS={10*LAMBDA+2*3*width_N}
AD={5*3*width_N*LAMBDA}
PD={10*LAMBDA+2*3*width_N}

*not gate*
M7 out nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA}
PS={10*LAMBDA+2*width_P} AD={5*width_P*LAMBDA}
PD={10*LAMBDA+2*width_P}
M8 out nout gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA}
PS={10*LAMBDA+2*width_N} AD={5*width_N*LAMBDA}
PD={10*LAMBDA+2*width_N}
.ends and3
```

Figure 14 : 3 input and gate netlist



Figure 15 : 3 input and gate functionality

- 4 input

```
.subckt and4 a b c d out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
* 4input NAND gate Wn = 4*width_N Wp = width_P*
M1 nout a vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M2 nout b vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M3 nout c vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M4 nout d vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}

M5 nout a x gnd CMOSN W={4*width_N} L={2*LAMBDA}
+ AS={5*4*width_N*LAMBDA}
PS={10*LAMBDA+2*4*width_N}
AD={5*4*width_N*LAMBDA}
PD={10*LAMBDA+2*4*width_N}
M6 x b y gnd CMOSN W={4*width_N} L={2*LAMBDA}
+ AS={5*4*width_N*LAMBDA}
PS={10*LAMBDA+2*4*width_N}
AD={5*4*width_N*LAMBDA}
PD={10*LAMBDA+2*4*width_N}
M7 y c z gnd CMOSN W={4*width_N} L={2*LAMBDA}
+ AS={5*4*width_N*LAMBDA}
PS={10*LAMBDA+2*4*width_N}
AD={5*4*width_N*LAMBDA}
PD={10*LAMBDA+2*4*width_N}
M8 z d gnd gnd CMOSN W={4*width_N} L={2*LAMBDA}
+ AS={5*4*width_N*LAMBDA}
PS={10*LAMBDA+2*4*width_N}
AD={5*4*width_N*LAMBDA}
PD={10*LAMBDA+2*4*width_N}

*not gate*
M9 out nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M10 out nout gnd gnd CMOSN W={width_N}
L={2*LAMBDA}
+ AS={5*width_N*LAMBDA}
PS={10*LAMBDA+2*width_N} AD={5*width_N*LAMBDA}
PD={10*LAMBDA+2*width_N}
.ends and4
```

Figure 16 : 4 input and gate netlist



Figure 17 : 4 input and gate functionality

- 5 input

```
.subckt and5 a b c d e out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
* 5input NAND gate Wn = 5*width_N Wp = width_P*
M1 nout a vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M2 nout b vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M3 nout c vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M4 nout d vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M5 nout e vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}

M6 nout w gnd CMOSN W={5*width_N} L={2*LAMBDA}
+ AS={5*5*width_N*LAMBDA}
PS={10*LAMBDA+2*5*width_N}
AD={5*5*width_N*LAMBDA}
PD={10*LAMBDA+2*5*width_N}
M7 w b x gnd CMOSN W={5*width_N} L={2*LAMBDA}
+ AS={5*5*width_N*LAMBDA}
PS={10*LAMBDA+2*5*width_N}
AD={5*5*width_N*LAMBDA}
PD={10*LAMBDA+2*5*width_N}
M8 x c y gnd CMOSN W={5*width_N} L={2*LAMBDA}
+ AS={5*5*width_N*LAMBDA}
PS={10*LAMBDA+2*5*width_N}
AD={5*5*width_N*LAMBDA}
PD={10*LAMBDA+2*5*width_N}
M9 y d z gnd CMOSN W={5*width_N} L={2*LAMBDA}
+ AS={5*5*width_N*LAMBDA}
PS={10*LAMBDA+2*5*width_N}
AD={5*5*width_N*LAMBDA}
PD={10*LAMBDA+2*5*width_N}
M10 z e gnd gnd CMOSN W={5*width_N} L={2*LAMBDA}
+ AS={5*5*width_N*LAMBDA}
PS={10*LAMBDA+2*5*width_N}
AD={5*5*width_N*LAMBDA}
PD={10*LAMBDA+2*5*width_N}

*not gate*
M11 out nout vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M12 out nout gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
```

Figure 18 : 5 input and gate netlist



Figure 19 : 5 input and gate functionality

#### D. XOR gate

```
.subckt xor a abar b bbar out vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}

* 2input XOR gate with Wn = 2*width_N and Wp = 2*width_P *
M1 p a vdd vdd CMOSP W={2*width_P} L={2*LAMBDA}
+ AS={5*2*width_P*LAMBDA}
PS={10*LAMBDA+2*2*width_P}
AD={5*2*width_P*LAMBDA}
PD={10*LAMBDA+2*2*width_P}
M2 out bbar p vdd CMOSP W={2*width_P} L={2*LAMBDA}
+ AS={5*2*width_P*LAMBDA}
PS={10*LAMBDA+2*2*width_P}
AD={5*2*width_P*LAMBDA}
PD={10*LAMBDA+2*2*width_P}
M3 q abar vdd vdd CMOSP W={2*width_P} L={2*LAMBDA}
+ AS={5*2*width_P*LAMBDA}
PS={10*LAMBDA+2*2*width_P}
AD={5*2*width_P*LAMBDA}
PD={10*LAMBDA+2*2*width_P}
M4 out b q vdd CMOSP W={2*width_P} L={2*LAMBDA}
+ AS={5*2*width_P*LAMBDA}
PS={10*LAMBDA+2*2*width_P}
AD={5*2*width_P*LAMBDA}
PD={10*LAMBDA+2*2*width_P}

M5 out a r gnd CMOSN W={2*width_N} L={2*LAMBDA}
+ AS={5*2*width_N*LAMBDA}
PS={10*LAMBDA+2*2*width_N}
AD={5*2*width_N*LAMBDA}
PD={10*LAMBDA+2*2*width_N}
M6 r b gnd gnd CMOSN W={2*width_N} L={2*LAMBDA}
+ AS={5*2*width_N*LAMBDA}
PS={10*LAMBDA+2*2*width_N}
AD={5*2*width_N*LAMBDA}
PD={10*LAMBDA+2*2*width_N}
M7 out abas s gnd CMOSN W={2*width_N} L={2*LAMBDA}
+ AS={5*2*width_N*LAMBDA}
PS={10*LAMBDA+2*2*width_N}
AD={5*2*width_N*LAMBDA}
PD={10*LAMBDA+2*2*width_N}
M8 s bbar gnd gnd CMOSN W={2*width_N} L={2*LAMBDA}
+ AS={5*2*width_N*LAMBDA}
PS={10*LAMBDA+2*2*width_N}
AD={5*2*width_N*LAMBDA}
PD={10*LAMBDA+2*2*width_N}
.ends xor
```

Figure 20 : 2 input xor gate netlist



Figure 21 : 2 input xor gate functionality

#### E. D flip-flop

```
.subckt ffposedge d q clk vdd gnd
.param width_P = {20*LAMBDA}
.param width_N = {10*LAMBDA}
M1 x d gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M2 x clk a vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M3 a d vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}

M4 b clk gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M5 y x b gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M6 y clk vdd vdd CMOSP W={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}

M7 c y gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M8 qbar clk c gnd CMOS W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
M9 qbay vdd vdd CMOS P={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}

M10 q qbar vdd vdd CMOS P={width_P} L={2*LAMBDA}
+ AS={5*width_P*LAMBDA} PS={10*LAMBDA+2*width_P}
AD={5*width_P*LAMBDA} PD={10*LAMBDA+2*width_P}
M11 q qbar gnd gnd CMOSN W={width_N} L={2*LAMBDA}
+ AS={5*width_N*LAMBDA} PS={10*LAMBDA+2*width_N}
AD={5*width_N*LAMBDA} PD={10*LAMBDA+2*width_N}
.ends ffposedge
```

Figure 22 : TSPC flip-flop netlist



Figure 23 : TSPC flip-flop functionality



Figure 25 : TSPC flip-flop setup time analysis-II

## V. D FLIP-FLOP TIMING ANALYSIS

### A. SETUP time

When both clock,clk, and input D signal change at same time , output does not follow input.



Figure 24 : TSPC flip-flop setup time analysis-I

Then by trial-and-error method , checking values for setup time, by changing D input just before rising edge of clock. Setup time was found to be **0.12ns**.

### B. HOLD time

We know for TSPC flip-flop , hold time is 0 second.



Figure 26 : TSPC flip-flop hold time analysis

### C. Clock to Q delay

Using meas in NGSpice , I computed clock to Q delay , which came out to be **85ps**.

```
.control
run
meas tran tcq trig v(clk) val=0.9 rise=1 targ v(0) val=0.9 rise=1
plot v(clk) v(d)+2 v(q)+4
set hcopypscolor = 1
set color0=white
set color1=black
.endc
```

Figure 27 : NGSpice code for tcq delay

No. of Data Rows : 135

tcq = 8.525702e-11 targ= 1.023526e-08 trig= 1.015000e-08

Figure 28 : TSPC flip-flop tcq time

- $T_{SU} = 0.12\text{ns}$
- $T_H = 0\text{s}$
- $T_{CO} = 0.085\text{ns}$

## VI. STICK DIAGRAM

Legends :



Figure 28 : Legends

- NOT gate



Figure 29 : stick diagram of not gate

- NOR gates



Figure 30 : stick diagram of 2 input nor gate



Figure 31 : stick diagram of 3 input nor gate



Figure 32 : stick diagram of 4 input nor gate



Figure 33 : stick diagram of 5 input nor gate

- NAND gates



Figure 34 : stick diagram of 2 input nand gate



Figure 35 : stick diagram of 3 input nand gate



Figure 36 : stick diagram of 4 input nand gate



Figure 37 : stick diagram of 5 input nand gate

- XOR gate



Figure 38 : stick diagram of 2 input xor gate

## VII. MAGIC LAYOUT & POST-LAYOUT SIMULATION

### A. NOT gate



Figure 39 : Not gate layout



Figure 40 : Not gate functionality

### B. OR gates

- 2 input



Figure 41 : 2 input or gate layout



Figure 42 : 2 input or gate functionality

- 3 input



Figure 43 : 3 input or gate layout

- 4 input



Figure 45 : 4 input or gate layout



Figure 44 : 3 input or gate functionality



Figure 46 : 4 input or gate functionality

- 5 input



Figure 47 : 5 input or gate layout



Figure 48 : 5 input or gate functionality

### C. AND gates

- 2 input



Figure 49 : 2 input and gate layout



Figure 50 : 2 input and gate functionality

- 3 input
- 4 input



Figure 51 : 3 input and gate layout



Figure 53 : 4input and gate functionality



Figure 52 : 3 input and gate functionality



Figure 54 : 4 input and gate functionality

- 5 input



Figure 55 : 5input and gate functionality

#### D. XOR gate



Figure 57 : 2 input xor layout



Figure 56 : 5 input and gate functionality



Figure 58 : 2 input and xor gate functionality

### E. D flip-flop



Figure 59 : D flip-flop layout



Figure 60 : D flip-flop functionality

Example 1 :  $c_0 = 1$



Figure 61 : Complete CLA circuit

Figure 62 : example 1

Example 2 :  $c_0 = 0$



Figure 62 : example 2

#### A. Worst Case Delay

When input  $a = 1111(15)$ ,  $b = 1111(15)$  and  $c_0 = 1$ , we get output as  $s = 1111(15)$  and  $c_4 = 1$ . This means input  $a = 15 + b = 15 + c_0 = 1$ , results into 31. Max Rise Time delay is 0.58ns

```
.meas tran dd trig v(a0) val=0.9 rise=1
+targ v(s3) val=0.9 rise=1
.tran 0.01n 5ns
.control
```



Figure 63 : Max rise time

When input is  $a = 0000(0)$ ,  $b = 0(0)$  and  $c_0 = 0$ , we get output as  $s = 0000(0)$  and  $c_4 = 0$ . This means input  $a = 0 + b = 0 + c_0 = 0$  results into zero. Max Fall Time delay is 0.52ns

## IX. COMPLETE FLOOR PLAN ON MAGIC



Figure 65 : Complete Magic Layoty

```
select: Topmost cell in the window
Root cell box:
  width x height ( llx, lly ), ( urx, ury ) area (units^2)
microns: 201.780 x 170.910 (-64.350, -127.260), ( 137.430, 43.650) 34486.223
lambda: 2242 x 1899 (-715, -1414), ( 1527, 485) 4257558
```

Figure 66 : Area and dimensions  
Width = 201.78  $\mu\text{m}$ , Length = 170.91  $\mu\text{m}$   
Area = 34486.223  $\mu\text{m}^2$



```
tkcon 2.3 Main
File Console Edit Interp Prefs History Help
Lambda: 1 x 1 ( 0, 0 ), ( 1, 1 ) 1
Main console display active (Tcl8.6.14 / Tk8.6.14)
% s
ambiguous command name "s": save saveall andquit savefailed saveonetlist scale scalegrid scan scanctrl
rollupdate scrollview see seek select selection set setgrid setlabel setpoint setscroll setscale
set showterms sideways sleep snap socket sortnets sortpinslr sortpinsrl source specialopen spinbox
spinbutton startselect startup stodlyp straighten stretch string subst suspendall suspendout switch
% b
ambiguous command name "b": bell bgerror binary bind bindtags box boxview break buildfence button b
select: Topmost cell in the window
Root cell box:
  width x height ( llx, lly ), ( urx, ury ) area (units^2)
microns: 201.780 x 170.910 (-64.350, -127.260), ( 137.430, 43.650) 34486.223
lambda: 2242 x 1899 (-715, -1414), ( 1527, 485) 4257558
```

Figure 67: Combined of figure 66 and 65

Hence max delay is 0.58ns.

$$T_{PD} = 0.58\text{ns}$$

Maximum clock speed is given as  $T_{CLK} = T_{PDMAX} + T_{SU} + T_{CQ}$ . Therefore  $T_{CLK} = 0.785\text{ns}$ .

## X. POST-LAYOUT SIMULATION

### A. Timing Analysis for D flip-flop

- Setup time:

When input and output are changed at same time.



Figure 68 : Setup time analysis-I

By trial and error ,  $T_{SU} = 0.13\text{ns}$



Figure 69 : Setup time analysis-II

- Hold time :

We know hold time for TSPC flip-flop is 0 seconds.

- Clock to Q delay

Using NGSpice command

```
run
meas tran tcq trig v(clk) val=0.9 rise=1 targ v(q) val=0.9 rise=1
plot v(clk) v(d)+2 v(q)+4
set hcopyscolor = 1
set color0=white
set color1=black
```

Figure 70 : Tcq NGSpice command

$T_{CO} = 0.075\text{ns}$

```
Reference value : 1.84750e-08
No. of Data Rows : 6068
tcq = 7.542159e-11 targ= 1.022542e-08 trig= 1.015000e-08
```

Figure 71 : Tcq NGSpice terminal

- $T_{SU} = 0.13\text{ns}$
- $T_H = 0\text{s}$
- $T_{CO} = 0.075\text{ns}$

### B. CLA adder delay

Same as previous method I found max fall and maximum rise time. Maximum fall delay is 0.37ns , while maximum rise delay is 0.56ns. Hence  $T_{pd} = 0.56\text{ns}$ .

```
Measurements for Transient Analysis
risedelay = 5.629839e-10 targ= 5.829839e-10 trig= 2.000000e-11
```

Figure 72 : Rise time delay



Figure 73 : Rise time delay graph -I



Figure 74 : Rise time delay graph -II



Table1 :Pre vs Post layout for D flip-flop

|             | Setup Time | Clk to Q delay |
|-------------|------------|----------------|
| Pre-layout  | 0.12n      | 0.085n         |
| Post-layout | 0.13n      | 0.075n         |

Table 2: Pre vs Post layout for CLA adder

|             | Max Rise Time | Max Fall Time |
|-------------|---------------|---------------|
| Pre-layout  | 0.58ns        | 0.52ns        |
| Post-layout | 0.56ns        | 0.37ns        |

Hence ,  $T_{CLK} = T_{SU} + T_{COMAX} + T_{PDMAX} = 0.765\text{ns}$ . Surprisingly , it is less then pre-layout  $T_{CLK}$  of 0.785ns.

Table 3: Pre vs Post layout clock timing

|             | Minimum clock time |
|-------------|--------------------|
| Pre-layout  | 0.785ns            |
| Post-layout | 0.765ns            |

Table 4: Pre vs Post layout Maximum Clock Frequency

|             | Maximum clock frequency |
|-------------|-------------------------|
| Pre-layout  | 1.27GHz                 |
| Post-layout | 1.30GHz                 |

## XI. VERILOG HDL RESULT



Figure 79 : Output of Verilog on terminal

## XII. FPGA IMPLEMENTATION

- Example 1 : a = 9 (red), b = 4 (purple), c = 0 (yellow). Output, s = 1101 (green LEDs) meaning 13. Leftmost bit is most significant bit.



Figure 80 : FPGA example 1

- Example 2 : a = 13(red), b = 12 (purple), c = 0 (yellow). Output, c4 = 1 , s = 1001 (green LEDs) meaning 25. Leftmost bit is most significant bit.



Figure 80 : FPGA example 2

- Google Drive Link for video :  
<https://drive.google.com/drive/folders/1-GoEt94wHtNIwqL6Jj2ryaqdvv-1UmAS>

### XIII. REFERENCES

1. CMOS VLSI Design : Weste & Harris
2. Digital Integrated Circuits : Jan Rabaey
3. Geeksforgeeks.com