

# Arithmetic Circuits

## Fast Adders

---

Virendra Singh

Professor

Computer Architecture and Dependable Systems Lab

Department of Electrical Engineering

Indian Institute of Technology Bombay

<http://www.ee.iitb.ac.in/~viren/>

E-mail: [viren@ee.iitb.ac.in](mailto:viren@ee.iitb.ac.in)

*CS-226: Digital Logic Design*

---



Lecture 18: 09 March 2021

**CADSL**

# 32-bit Ripple-Carry Adder



# Speeding Up the Adder



# Carry Lookahead

- Given Stage  $i$  from a Full Adder, we know that there will be a carry generated when  $A_i = B_i = "1"$ , whether or not there is a carry-in.
- Alternately, there will be a carry propagated if the “half-sum” is “1” and a carry-in,  $C_i$  occurs.
- These two signal conditions are called *generate*, denoted as  $G_i$ , and *propagate*, denoted as  $P_i$  respectively and are identified in the circuit:



# Carry Lookahead

---

- In the ripple carry adder:
  - $G_i$ ,  $P_i$ , and  $S_i$  are local to each cell of the adder
  - $C_i$  is also local each cell
- In the carry lookahead adder, in order to reduce the length of the carry chain,  $C_i$  is changed to a more global function spanning multiple cells
- Defining the equations for the Full Adder in term of the  $P_i$  and  $G_i$ :

$$\begin{aligned} P_i &= \underline{A_i} \oplus \underline{B_i} \\ S_i &= \underline{P_i} \oplus \underline{C_i} \end{aligned}$$

$$\begin{aligned} G_i &= A_i B_i \\ C_{i+1} &= \underline{G_i} + \underline{P_i} \underline{C_i} \end{aligned}$$



# Carry Lookahead Development

- Flatten equations for carry using  $G_i$  and  $P_i$  terms for less significant bits
- Beginning at the cell 0 with carry in  $C_0$ :

$$\underline{C_1} = G_0 + P_0 C_0 \quad \checkmark$$

$$\underline{C_2} = G_1 + P_1 \underline{C_1} = G_1 + P_1(G_0 + P_0 C_0) \\ = G_1 + P_1 G_0 + P_1 P_0 C_0$$

$$\underline{C_3} = G_2 + P_2 \underline{C_2} = G_2 + P_2(G_1 + P_1 G_0 + P_1 P_0 C_0) \\ = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_0$$

$$C_4 = \underline{G_3 + P_3 \underline{C_3}} \quad \checkmark$$

$$= G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0 + P_3 P_2 P_1 P_0 C_0$$

3 level implementation

$T_{\text{delay}} \propto n$



# Fast Adders ✓

---

- In general, any output of a 32-bit adder can be evaluated as a logic expression in terms of all 65 inputs.
- Number of levels of logic can be reduced to  $\log_2 N$  for N-bit adder. Ripple-carry has N levels.  

- More gates are needed, about  $\log_2 N$  times that of ripple-carry design.  




# Prefix Computation



# Key Architectures for Carry Calculation

---

- 1960: J. Sklansky
- 1973: Kogge-Stone adder ✓
- 1980: Ladner-Fisher adder
- 1982: Brent-Kung adder ✓
- 1987: Han Carlson adder
- 1999: S. Knowles

Other parallel adder architectures:

- 1981: H. Ling adder
- 2001: Beaumont-Smith



# Binary Addition

- **Input:** two  $n$ -bit binary numbers  $\underline{a_{n-1} \dots a_1 a_0}$  and  $\underline{b_{n-1} \dots b_1 b_0}$ , one bit carry-in  $\underline{c_0}$   $\underline{2^{n+1}}$
- **Output:**  $n$ -bit sum  $\underline{s_{n-1} \dots s_1 s_0}$  and one bit carry out  $\underline{c_n}$
- Prefix Addition: Carry generation & propagation

Generate:  $g_i \equiv a_i b_i$

Propagate:  $p_i \equiv a_i \oplus b_i$

$$c_{i+1} \equiv g_i + p_i \cdot c_i$$

$$s_i \equiv c_i \oplus (a_i \oplus b_i)$$

$$= c_i \oplus p_i$$

Block / Group



# Binary Addition as a prefix sum problem.

$$\begin{array}{r} a_3 \quad a_2 \quad a_1 \quad a_0 \\ + b_3 \quad b_2 \quad b_1 \quad b_0 \\ \hline \end{array}$$

A stage  $i$  will generate a carry if  
 $g_i = a_i \cdot b_i$   
and propagate a carry if  
 $p_i = \text{XOR}(a_i, b_i)$   
Hence for stage  $i$ :  
 $c_i = g_i + p_i c_{i-1}$

With :

$$(G_i, P_i) \equiv (g_i, p_i) \circ (G_{i-1}, P_{i-1})$$
$$(g_x, p_x) \circ (g_y, p_y) \equiv (g_x \# p_x \cdot g_y, p_x \cdot p_y)$$

Where :

$$(G_0, P_0) \equiv (g_0, p_0)$$

$\hookrightarrow G(b:0)$

We have :

$$(G_1, P_1) \equiv (g_1, p_1)$$

$$(G_2, P_2) \equiv (g_2, p_2) \circ (G_1, P_1) \equiv (g_2 \# p_2 \cdot g_1, p_2 \cdot p_1)$$

$$\begin{aligned} (G_3, P_3) &\equiv (g_3, p_3) \circ (G_2, P_2) \equiv (g_3 \# p_3 \cdot (g_2 \# p_2 \cdot g_1), p_3 \cdot p_2 \cdot p_1) \\ &\equiv (g_3 \# p_3 \cdot g_2 \# p_3 \cdot p_2 \cdot g_1), p_3 \cdot p_2 \cdot p_1 \end{aligned}$$

etc ...

... The familiar  
carry bit generating  
equations for stage  $i$   
in a CLA adder.



# Parallel Prefix Adders

- The parallel prefix adder employs the 3-stage structure of the CLA adder. The improvement is in the carry generation stage which is the most intensive one:



# Prefix Addition – Formulation

Pre-processing:

$$g_i \equiv a_i b_i \quad p_i \equiv a_i \oplus b_i$$

$\frac{G_0:0}{G_1:1}$   
 $\frac{G_1:1}{G_2:2}$

Prefix Computation:

$$\begin{cases} G_{[i:k]} = G_{[i:j]} \# P_{[i:j]} G_{[j-1:k]} \\ P_{[i:k]} \equiv P_{[i:j]} \cdot P_{[j-1:k]} \end{cases}$$

parallel

Scope  
to make  
fast

Post-processing:

$$\begin{aligned} c_{i+1} &\equiv G_{[i:0]} \# P_{[i:0]} \cdot c_0 \\ s_i &= p_i \oplus c_i \end{aligned}$$



# Computation of Carries – Prefix Graphs

The components usually seen in a prefix graph are the following:

*processing component:*



*buffer component:*



$$(g_{out}, p_{out}) = (g_{in_1} + p_{in_1} \cdot g_{in_2}, p_{in_1} \cdot p_{in_2})$$

$$(g_{out}, p_{out}) = (g_{in}, p_{in})$$

# Prefix graphs for representation of Prefix addition

- Serial adder carry generation represented by prefix graphs



# Prefix Adder – Prefix Structure Graph



# Classical Prefix Adders



1982

Brent-Kung:

Logical levels:  $2\log_2 n - 1$

Max fanouts: 2



1967

Sklansky:

Logical levels:  $\log_2 n$

Max fanouts:  $n/2$



1973

Kogge-Stone:

Logical levels:  $\log_2 n$

Max fanouts: 2



# Kogge Stone Adder

$$A = 1001 \quad B = 1100 \quad \text{Sum} = 10101$$



$$\log_2 n = \lceil \log_2 4 \rceil = 2$$



# 1973: Kogge-Stone adder



delay (prepon + post pone +  
 $\log n$  carry generation)

delay  $\log n$

- The Kogge-Stone adder has:
  - Low depth  $\log n$
  - High node count (implies more area).
  - Minimal fan-out of 1 at each node (implies faster performance).

# 1982: Brent-Kung (BK) Adder



Brent and Kung, “A regular layout for parallel adders”, In IEEE transaction for Computers, 1982



# Brent-Kung (BK) Adder



$$2 \log n - 1$$

$$G_i, P_i$$

Parallel Adder  
Architectures

Tree, of do Group prefix  
carry computation units



# Thank You

