



# UM EECS 270 F22

## Introduction to Logic Design

### 19. Designing Fast Adders

ripple adder is really slow when bits are larger

# Half and Full Adder (Review)



| A | B | C | S |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 1 |
| 1 | 0 | 0 | 1 |
| 1 | 1 | 1 | 0 |

| A | B | C <sub>i</sub> | C <sub>o</sub> | S |
|---|---|----------------|----------------|---|
| 0 | 0 | 0              | 0              | 0 |
| 0 | 0 | 1              | 0              | 1 |
| 0 | 1 | 0              | 0              | 1 |
| 0 | 1 | 1              | 1              | 0 |
| 1 | 0 | 0              | 0              | 1 |
| 1 | 0 | 1              | 1              | 0 |
| 1 | 1 | 0              | 1              | 0 |
| 1 | 1 | 1              | 1              | 1 |

$$C = AB$$

$$S = A \oplus B$$

$$C_o = AB + (A \oplus B)C_i$$

$$S = A \oplus B \oplus C_i$$

# 8-bit Ripple Carry Adder



Unfortunately this has a very large propagation time for 32 or 64-bit addition

# General Structure of a Carry Lookahead Adder



Figure 6-86

Structure of one stage of a carry-lookahead adder.

# Elaboration of CLA Structure



- Bit generate and propagate
- Group generate and propagate
- Carry lookahead across more than one bit
- Recursive computation of group signals
- Carry computation and propagation
- Delay analysis

我们在 Area 和 delay 中提出, 例: ripple adder delay 太大, 上去 Ppt Area 太大。

# Bit Generate & Propagate



$$\begin{aligned}
 c_{i+1} &= a_i b_i + c_i (a_i + b_i) \\
 &= g_i + p_i c_i
 \end{aligned}$$

why one: why we call  $g_i$  generate signal.  
 if  $g_i$  is one, output is one.  
 regardless of whether we have a carry in or not.  
 we would generate carry out if  $g_i$  is true  
 ( $a_i$  and  $b_i$  are one)

$$g_i = a_i b_i$$

$$p_i = a_i + b_i$$

why two:  
 why we call  $p_i$  propagate signal:  
 $g_i = 0$ , either  $a_i$  or  $b_i$  is one.  
 But  $a_i + b_i = 1$ , and there's a carry in.



g<sub>i</sub> gate tells: No matter what is the carry in there's gonna be a carry out here  
 p<sub>i</sub> gate tells: we will propagate a carry out if the carry in is true.

# Group Generate & Propagate: 2 Bits



$$\begin{aligned}
 C_{i+2} &= g_{i+1} + p_{i+1}c_{i+1} \\
 &= g_{i+1} + p_{i+1}(g_i + p_i c_i) \\
 &= g_{i+1} + p_{i+1}g_i + p_{i+1}p_i c_i \\
 &\quad \text{group terms having } g_i \text{ together} \quad \text{group terms having } p_i \text{ together} \\
 &= (g_{i+1} + p_{i+1}g_i) + (p_{i+1}p_i)c_i \\
 &= G_{i+1,i} + P_{i+1,i}c_i \\
 &\quad \text{group generate signal} \quad \text{group propagate signal} \\
 &\quad \text{over the bit range} \\
 &\quad \text{from } i \text{ to } i+1
 \end{aligned}$$



# Group Generate & Propagate: >2 Bits



将  $j, k$  换成  $H, L$

$$G_{i+1,i} = g_{i+1} + p_{i+1}g_i$$

$$P_{i+1,i} = p_{i+1}p_i$$

$$G_{j,i} = G_{j,k+1} + P_{j,k+1}G_{k,i}$$

$$P_{j,i} = P_{j,k+1}P_{k,i}$$

# Group Generate & Propagate: >2 Bits



$$G_{H,L} = g_H + p_H g_L$$

$$P_{H,L} = p_H p_L$$

$$G_{H,L} = G_H + P_H G_L$$

$$P_{H,L} = P_H P_L$$

新方法的原因:

We can anticipate the carry out of the range without waiting for the carries in the middle of the range to be computed.



# Carry Lookahead Across Bit Range R = [i, j]

What we want.

from range "i" to "j", know "C<sub>i</sub>", get "C<sub>j+1</sub>"



Carry equation:

$$C_{j+1} = G_R + P_R C_i$$

get the carry out  
if I generate in the range  
it propagate across the range.

# Parallel-Prefix Circuit



# Recursive Computation of Group Signals



what is the depth of this:  $\log_2(n)$

range of size 1

$$\text{eg: } \log_2 16 = 4$$

because we started with grouping 2 bits.



merge into range of size 2



high part  
↓  
low part  
↓

merge into range of size 4



high part  
↓  
low part  
↓

merge into range of size 8



high part  
↓  
low part  
↓

merge into range of size 16



high part  
↓  
low part  
↓

④





the output of this shaded block : Two bits group signals Propagate ( $P_{i,j}$ ) and generate ( $G_{i,j}$ )



same circuit is used to generate the 4 bit signals.



Do Delay analysis

# Generate/Propagate Equations

first level of logic has  
one gate : "AND" "OR" gate

Depth(P,G) (1, 1)

$$g_0 = a_0 b_0$$

$$p_0 = a_0 + b_0$$

$$g_1 = a_1 b_1$$

$$p_1 = a_1 + b_1$$

$$g_2 = a_2 b_2$$

$$p_2 = a_2 + b_2$$

$$g_3 = a_3 b_3$$

$$p_3 = a_3 + b_3$$

$$g_4 = a_4 b_4$$

$$p_4 = a_4 + b_4$$

$$g_5 = a_5 b_5$$

$$p_5 = a_5 + b_5$$

$$g_6 = a_6 b_6$$

$$p_6 = a_6 + b_6$$

$$g_7 = a_7 b_7$$

$$p_7 = a_7 + b_7$$

so the depth of "P" signal  
come out of the 1st level  
is one "OR" and one "AND"

"AND" & "OR" for  
the generate

(2, 3)

$$G_{1,0} = g_1 + p_1 g_0$$

$$P_{1,0} = p_1 p_0$$

one "AND" for the propagate  
So if just counting gate, then the  
generate signal will have a depth  
of 3. Propagate signal has depth of 2.

$$G_{3,2} = g_3 + p_3 g_2$$

$$P_{3,2} = p_3 p_2$$

depth of G: +2

Two more level

$$G_{3,0} = G_{3,2} + P_{3,2} G_{1,0}$$

$$P_{3,0} = P_{3,2} P_{1,0}$$

one more "AND" gate for propagate signal

depth of P: +1

(3, 5)

$$G_{7,0} = G_{7,4} + P_{7,4} G_{3,0}$$

$$P_{7,0} = P_{7,4} P_{3,0}$$

$$G_{7,4} = G_{7,6} + P_{7,6} G_{5,4}$$

$$P_{7,4} = P_{7,6} P_{5,4}$$

# Delays in the GP “Parallel Prefix”



| Level | Bit-Range Width | $\Delta_P$ | $\Delta_G$ |
|-------|-----------------|------------|------------|
| 1     | 1               | 1          | 1          |
| 2     | 2               | 2          | 3          |
| 3     | 4               | 3          | 5          |
| 4     | 8               | 4          | 7          |
| 5     | 16              | 5          | 9          |

→ 表示 depth How many gates do we have at the end of that range.

# Analyze when the carriers become available

I need every carries for the sum.

e.g.: I don't need  $C_1$  for  $C_2$   
but I need  $C_1$  for sum.

## Options for Carry Computation



# Carry equations



| Level | Bit-Range Width | $\Delta_P$ | $\Delta_G$ |
|-------|-----------------|------------|------------|
| 1     | 1               | 1          | 1          |
| 2     | 2               | 2          | 3          |
| 3     | 4               | 3          | 5          |
| 4     | 8               | 4          | 7          |
| 5     | 16              | 5          | 9          |



$C_3$  and  $C_4$  are appearing at the same time.  
Because  $C_4$  is

| $C_{in}$ | $\Delta$ | $c_{in}$ |
|----------|----------|----------|
| $c_0$    | 0        |          |
| $c_1$    | 3        | $c_0$    |
| $c_2$    | 4        | $c_0$    |
| $c_3$    | 6        | $c_2$    |
| $c_4$    | 6        | $c_0$    |
| $c_5$    | 8        | $c_4$    |
| $c_6$    |          |          |
| $c_7$    |          |          |
| $c_8$    |          |          |

The diagram shows the computation of two variables,  $C_1$  and  $C_2$ , using logic gates.

**Variable  $C_1$ :**

- Primary input:  $C_0$  (labeled "0")
- Intermediate output:  $G_0$  (labeled "1")
- Intermediate output:  $P_0 C_0$  (labeled "1")
- Final output:  $C_1 = G_0 + P_0 C_0$
- Annotations: "output of AND gate", "output of OR gate", and "output of signal gate".
- Text: "Not logic '1' '0'; they are just times on logic depth." with a green checkmark.
- Text: "What is the depths coming out?" with a red question mark.

**Variable  $C_2$ :**

- Primary input:  $C_0$  (labeled "0")
- Intermediate output:  $G_{1,0}$  (labeled "3")
- Intermediate output:  $P_{1,0} C_0$  (labeled "2")
- Final output:  $C_2 = G_{1,0} + P_{1,0} C_0$
- Annotations: "output of AND gate" and "output of OR gate".

**Final Output:**

- Intermediate output:  $g_1$  (labeled "1")
- Intermediate output:  $p_1 C_1$  (labeled "1")
- Final output:  $C_2 = g_1 + p_1 C_1$
- Annotations: "when the one bit range".

$$C_4 = G_{3,0} + P_{3,0}C_0 \quad C_4 = G_{3,2} + P_{3,2}C_2 \quad C_4 = g_3 + p_3C_3$$

$$c_5 = g_4 + p_4 c_4$$

# Carry equations



| Level | Bit Range | Width | $\Delta_P$ | $\Delta_G$ |
|-------|-----------|-------|------------|------------|
| 1     |           | 1     | 1          | 1          |
| 2     |           | 2     | 2          | 3          |
| 3     |           | 4     | 3          | 5          |
| 4     |           | 8     | 4          | 7          |
| 5     |           | 16    | 5          | 9          |



|       | $\Delta$ | $c_{in}$ |
|-------|----------|----------|
| $c_0$ | 0        |          |
| $c_1$ | 3        | $c_0$    |
| $c_2$ | 4        | $c_0$    |
| $c_3$ | 6        | $c_2$    |
| $c_4$ | 6        | $c_0$    |
| $c_5$ | 8        | $c_4$    |
| $c_6$ | 8        | $c_4$    |
| $c_7$ | 10       | $c_6$    |
| $c_8$ | 8        | $c_0$    |

$$c_6 = G_{5,4} + P_{5,4}c_4$$



$$c_6 = g_5 + p_5c_5$$



$$c_7 = g_6 + p_6c_6$$



$$c_8 = G_{7,0} + P_{7,0}c_0$$



$$c_8 = G_{7,4} + P_{7,4}c_4$$



$$c_8 = G_{7,6} + P_{7,6}c_6$$



$$c_8 = g_7 + p_7c_7$$



the depth over here is  
not increasing linearly

If this is the overflow bit,  
it actually appear earlier than  
the carry in to the last bit.

# Carry Dependency Tree (which carry propagates to which other carry)



# Here's a different way to create groups



We can start by ranges that are over four bits instead of two bits

# Depth of GP Structure

① grouping into 2



$$d = \log_2 n = \log_2 16 = 4$$

② grouping into 4



$$d = \log_4 n = \log_4 16 = 2$$

• the logic for computing over the four bit range is more complex

• the ripple structure is when the range is 1 bit.

# Group Generate & Propagate: 4 Bits



$$\begin{aligned}
 c_{i+4} &= g_{i+3} + p_{i+3}c_{i+3} \\
 &= g_{i+3} + p_{i+3}(g_{i+2} + p_{i+2}c_{i+2}) \\
 &= g_{i+3} + p_{i+3}g_{i+2} + p_{i+3}p_{i+2}c_{i+2} \\
 &= g_{i+3} + p_{i+3}g_{i+2} + p_{i+3}p_{i+2}(g_{i+1} + p_{i+1}c_{i+1}) \\
 &= g_{i+3} + p_{i+3}g_{i+2} + p_{i+3}p_{i+2}g_{i+1} + p_{i+3}p_{i+2}p_{i+1}c_{i+1} \\
 &= g_{i+3} + p_{i+3}g_{i+2} + p_{i+3}p_{i+2}g_{i+1} + p_{i+3}p_{i+2}p_{i+1}(g_i + p_i c_i) \\
 &= \underbrace{g_{i+3} + p_{i+3}g_{i+2} + p_{i+3}p_{i+2}g_{i+1} + p_{i+3}p_{i+2}p_{i+1}g_i}_{G_{i+3,i}} + \underbrace{p_{i+3}p_{i+2}p_{i+1}p_i c_i}_{P_{i+3,i}}
 \end{aligned}$$

# Another Variation: M&M Structure



$$\text{MAJ}(x, y, z) \equiv xy + xz + yz$$

$$\begin{aligned}
 c_{i+2} &= \text{MAJ}(a_{i+1}, b_{i+1}, c_{i+1}) \\
 &= \text{MAJ}(a_{i+1}, b_{i+1}, \text{MAJ}(a_i, b_i, c_i)) \\
 &\quad \downarrow \text{Algebra} \\
 &= \text{MAJ}(\text{MAJ}(a_{i+1}, b_{i+1}, a_i), \text{MAJ}(a_{i+1}, b_{i+1}, b_i), c_i)
 \end{aligned}$$

Let  $m_i^a \equiv a_i, m_i^b \equiv b_i$ , etc.

$$\therefore c_{i+2} = \text{MAJ}\left(\underbrace{\text{MAJ}\left(m_{i+1}^a, m_{i+1}^b, m_i^a\right)}, \underbrace{\text{MAJ}\left(m_{i+1}^a, m_{i+1}^b, m_i^b\right)}, c_i\right)$$

$M_{i+1,i}^a$                              $M_{i+1,i}^b$



How are we compute the sum

# CLA Adder Building Blocks

high part



low part



combine them into a range



compute the group signal

$G_H \quad P_H \quad G_L \quad P_L$







# Slightly More Accurate Timing Analysis



$$t_p^{AND} = t_p^{NAND} + t_p^{INV}$$

Assume  $t_p^{NAND} = t_p^{INV} = 0.5$

$$\therefore t_p^{XOR} = 1.5$$



# Summary: Fast Addition



## Rich Literature

- Carry Skip
- Carry Save
- Carry Select
- Kogge–Stone Adder
- Brent–Kung Adder
- etc.