

# Topic 13

---

## Arithmetic Components

# Components to be discussed

- Arithmetic and logic unit (ALU)
- Carry lookahead adder
- Incrementer
- Magnitude comparator
- Multiplier

# Arithmetic-Logic Unit: ALU

- **ALU:** Component that can perform any of various arithmetic (add, subtract, increment, etc.) and logic (AND, OR, etc.) operations, based on control inputs
- Key component in computer

TABLE 4.2 Desired calculator operations

| Inputs |   |   | Operation                      | Sample output if<br>A=00001111,<br>B=00000101 |
|--------|---|---|--------------------------------|-----------------------------------------------|
| x      | y | z |                                |                                               |
| 0      | 0 | 0 | S = A + B                      | S=00010100                                    |
| 0      | 0 | 1 | S = A - B                      | S=00001010                                    |
| 0      | 1 | 0 | S = A + 1                      | S=00010000                                    |
| 0      | 1 | 1 | S = A                          | S=00001111                                    |
| 1      | 0 | 0 | S = A AND B (bitwise AND)      | S=00000101                                    |
| 1      | 0 | 1 | S = A OR B (bitwise OR)        | S=00001111                                    |
| 1      | 1 | 0 | S = A XOR B (bitwise XOR)      | S=00001010                                    |
| 1      | 1 | 1 | S = NOT A (bitwise complement) | S=11110000                                    |

# Multifunction Calculator with an ALU

- ALU functions selected by a mux
  - But too many wires
  - Wasted power computing all those operations when at any time you only use one of the results

TABLE 4.2 Desired calculator operations

| Inputs |   |   | Operation                      | Sample output if<br>A=00001111,<br>B=00000101 |
|--------|---|---|--------------------------------|-----------------------------------------------|
| x      | y | z |                                |                                               |
| 0      | 0 | 0 | S = A + B                      | S=00010100                                    |
| 0      | 0 | 1 | S = A - B                      | S=00001010                                    |
| 0      | 1 | 0 | S = A + 1                      | S=00010000                                    |
| 0      | 1 | 1 | S = A                          | S=00001111                                    |
| 1      | 0 | 0 | S = A AND B (bitwise AND)      | S=00000101                                    |
| 1      | 0 | 1 | S = A OR B (bitwise OR)        | S=00001111                                    |
| 1      | 1 | 0 | S = A XOR B (bitwise XOR)      | S=00001010                                    |
| 1      | 1 | 1 | S = NOT A (bitwise complement) | S=11110000                                    |



# Carry-Ripple Adder

- Carry-ripple adder
  - 4-bit adder: Adds two 4-bit numbers, generates 5-bit output
    - 5-bit output can be considered 4-bit “sum” plus 1-bit “carry out”
  - Can easily build any size adder

$$\begin{array}{r} \text{carries: } \quad \color{red}{c_3} \quad \color{red}{c_2} \quad \color{red}{c_1} \quad \text{cin} \\ \text{B:} \quad \quad \quad b_3 \quad b_2 \quad b_1 \quad b_0 \\ \text{A:} \quad + \quad a_3 \quad a_2 \quad a_1 \quad a_0 \\ \hline \text{cout} \quad s_3 \quad s_2 \quad s_1 \quad s_0 \end{array}$$



# Carry-Ripple Adder's Behavior



Assume all inputs initially 0

为什么错了?

Carry 没有足够时间

$0111 + 0001$  to run through

(answer should be 01000)



每次在做加法时，都要先等 carry  
到了才能下一项  
Output after 2 ns (1 FA delay)

Wrong answer -- something wrong? No -- just need more time  
for carry to ripple through the chain of full adders.

# Carry-Ripple Adder's Behavior



**0111+0001**  
(answer should be 01000)

Outputs after 4ns (2 FA delays)



Outputs after 6ns (3 FA delays)  
*a*



Output after 8ns (4 FA delays)

Correct answer appears after 4 FA delays

# Faster Adder

- Faster adder – Use two-level combinational logic design process
  - 4-bit two-level adder is big
    - 9 input and 5 output combination circuit
  - Pro: Fast
    - 2 gate-level delays
  - Con: Large
    - Truth table would have  $2^{(4+4+1)} = 512$  rows
    - Plot shows 4-bit adder would use about 500 gates

只用 logic gate



*Two-level: AND level  
followed by ORs*



# Recall Full Adder



| A | B | C | Sum | Carry |
|---|---|---|-----|-------|
| 0 | 0 | 0 | 0   | 0     |
| 0 | 0 | 1 | 1   | 0     |
| 0 | 1 | 0 | 1   | 0     |
| 0 | 1 | 1 | 0   | 1     |
| 1 | 0 | 0 | 1   | 0     |
| 1 | 0 | 1 | 0   | 1     |
| 1 | 1 | 0 | 0   | 1     |
| 1 | 1 | 1 | 1   | 1     |

想汎: instead of  
waiting for the carry,  
we provide the inputs

$$\begin{aligned} \text{Sum} &= A'B'C + A'BC' + AB'C' + ABC \\ &= A'(B'C + BC') + A(B'C' + BC) \\ &= A'(B \oplus C) + A(B \oplus C)' \end{aligned}$$

$$\begin{aligned} &\text{(let } B \oplus C = D\text{)} \\ &= A'D + AD' \\ &= A \oplus D \\ &= A \oplus B \oplus C \end{aligned}$$

$$\begin{aligned} \text{Carry} &= A'BC + AB'C + ABC' + ABC \\ &= A'BC + AB'C + AB(C' + C) \\ &= A'BC + AB'C + AB \\ &= (A'C + A)B + A(B'C + B) \\ &= (C + A)B + A(C + B) \\ &= CB + AB + AC + AB \\ &= AB + AC + BC \\ &= (A'B + AB')C + AB(C' + C) \\ &= (A \oplus B)C + AB \end{aligned}$$

$$\begin{aligned} \text{Sum} &= A'B'C + A'BC' + AB'C' + ABC \\ &= \Sigma m(1, 2, 4, 7) \end{aligned}$$

$$\begin{aligned} \text{Carry} &= A'BC + AB'C + ABC' + ABC \\ &= \Sigma m(3, 5, 6, 7) \end{aligned}$$

# Faster Adder – Intuitive Attempt at “Lookahead”

- Produce carries directly

$$c_1 = a_0b_0 + a_0c_0 + b_0c_0$$

$$c_2 = a_1b_1 + a_1c_1 + b_1c_1$$

$$= a_1b_1 + a_1(a_0b_0+a_0c_0+b_0c_0) + b_1(a_0b_0+a_0c_0+b_0c_0)$$

$$c_3 = a_2b_2 + a_2c_2 + b_2c_2$$

$$= \dots \text{ (replace } c_2\text{)}$$

$$c_4 = a_3b_3 + a_3c_3 + b_3c_3$$

$$= \dots \text{ (replace } c_3\text{)}$$

- Carry outputs of all FAs are represented with  $a_0, a_1, a_2, a_3, b_0, b_1, b_2, b_3$ , and  $c_0$

# Faster Adder – Intuitive Attempt at “Lookahead”

- Idea: Modify carry-ripple adder
  - don't wait for carry to ripple, but rather *directly compute* from inputs of earlier stages
  - Called “lookahead” because current stage “looks ahead” at previous stages



Notice – no rippling of carry

# Faster Adder – Intuitive Attempt at “Lookahead”

- Carry lookahead logic
  - No waiting for ripple
  - 2-layer SOP logic
- Problem
  - Equations get too big
  - Not efficient
  - Need a better form of lookahead



4 gate delay

$$c_1 = b_0 c_0 + a_0 c_0 + a_0 b_0$$

$$c_2 = b_1 b_0 c_0 + b_1 a_0 c_0 + b_1 a_0 b_0 + a_1 b_0 c_0 + a_1 a_0 c_0 + a_1 a_0 b_0 + a_1 b_1$$

$$c_3 = b_2 b_1 b_0 c_0 + b_2 b_1 a_0 c_0 + b_2 b_1 a_0 b_0 + b_2 a_1 b_0 c_0 + b_2 a_1 a_0 c_0 + b_2 a_1 a_0 b_0 + b_2 a_1 b_1 + a_2 b_1 b_0 c_0 + a_2 b_1 a_0 c_0 + a_2 b_1 a_0 b_0 + a_2 a_1 b_0 c_0 + a_2 a_1 a_0 c_0 + a_2 a_1 a_0 b_0 + a_2 a_1 b_1 + a_2 b_2$$

Huge number of gates

# Better Form of Lookahead

- Recall Full Adder, another equation for carry  
 $\text{Carry} = ab + (a \oplus b)c$
  - Define two terms
    - **Propagate:**  $P = a \oplus b$
    - **Generate:**  $G = ab$
  - Compute lookahead carries from  $P$  and  $G$  terms, *not from external inputs*
    - $\text{Cout} = G + P_c$ 
      - $c_1 = a_0b_0 + (a_0 \oplus b_0)c_0 = G_0 + P_0c_0$
      - $c_2 = a_1b_1 + (a_1 \oplus b_1)c_1 = G_1 + P_1c_1$
      - $c_3 = a_2b_2 + (a_2 \oplus b_2)c_2 = G_2 + P_2c_2$
- 用 P and Gs to present  
a and bs*

# Better Form of Lookahead



After plugging in: *4 gate delays*

$$c1 = G0 + P0c0$$

$$c2 = G1 + P1c1 = G1 + P1(G0 + P0c0)$$

$$c2 = G1 + P1G0 + P1P0c0$$

$$c3 = G2 + P2c2 = G2 + P2(G1 + P1G0 + P1P0c0)$$

$$c3 = G2 + P2G1 + P2P1G0 + P2P1P0c0$$

$$\text{cout} = G3 + P3G2 + P3P2G1 + P3P2P1G0 + P3P2P1P0c0$$

Much simpler than the intuitive lookahead 14

- With  $P$  &  $G$ , the carry lookahead equations are much simpler
  - Equations before plugging in
    - $c1 = G0 + P0c0$
    - $c2 = G1 + P1c1$
    - $c3 = G2 + P2c2$
    - $\text{cout} = G3 + P3c3$

# Better Form of Lookahead (cont.)



# Better Form of Lookahead (cont.)



- With  $P$  &  $G$ , sum outputs are
  - $s_0 = P_0 \oplus cin$
  - $s_1 = P_1 \oplus c_1$
  - $s_2 = P_2 \oplus c_2$
  - $s_3 = P_3 \oplus c_3$

# Carry-Lookahead Adder -- High-Level View



- Fast -- only 4 gate level delays
  - Each stage has SPG block with 2 gate levels
  - Carry-lookahead logic quickly computes the carry from the propagate and generate bits using 2 gate levels inside
- Reasonable number of gates
- Nice balance between intuitive lookahead and pure combinational logic

由 SPG 提供 sum, propagation, generate

# Cascading Adders

- Example: construct an 8-bit adder with two 4-bit adders



# Cascading Adders to CLA Addres

- Example: construct an 16-bit adder with four 4-bit adders



# Carry-Lookahead Adder -- High-Level View

- Adding two more outputs: P, G



$$P = P_3 P_2 P_1 P_0$$

$$G = G_3 + P_3 G_2 + P_3 P_2 G_1 + P_3 P_2 P_1 G_0$$

# Cascading Adders to CLA Addresses

*↳ levels of delay : 1 for P and Gs, 2 for C*

- Example: construct an 16-bit adder with four 4-bit adders



Again, carries cause propagation delay



# CLA Adders

- Multi-level CLA structure

32 bit adder design

we don't wait for the Cs to propagate



# CLA Adders



# Adder Example: DIP-Switch-Based Adding Calculator

- To prevent spurious values from appearing at output, can place register at output



# Incrementer

- Traditional design procedure
  - Capture truth table
  - Derive equation for each output
    - $c_0 = a_3a_2a_1a_0$
    - ...
    - $s_0 = a_0'$
  - Results in small and fast circuit
  - Works for small operand
    - larger operand leads to exponential growth, like for N-bit adder

| Inputs |    |    |    | Outputs |    |    |    |    |
|--------|----|----|----|---------|----|----|----|----|
| a3     | a2 | a1 | a0 | c0      | s3 | s2 | s1 | s0 |
| 0      | 0  | 0  | 0  | 0       | 0  | 0  | 0  | 1  |
| 0      | 0  | 0  | 1  | 0       | 0  | 0  | 1  | 0  |
| 0      | 0  | 1  | 0  | 0       | 0  | 0  | 1  | 1  |
| 0      | 0  | 1  | 1  | 0       | 0  | 1  | 0  | 0  |
| 0      | 1  | 0  | 0  | 0       | 0  | 1  | 0  | 1  |
| 0      | 1  | 0  | 1  | 0       | 0  | 1  | 1  | 0  |
| 0      | 1  | 1  | 0  | 0       | 0  | 1  | 1  | 1  |
| 0      | 1  | 1  | 1  | 0       | 1  | 0  | 0  | 0  |
| 1      | 0  | 0  | 0  | 0       | 1  | 0  | 0  | 1  |
| 1      | 0  | 0  | 1  | 0       | 1  | 0  | 1  | 0  |
| 1      | 0  | 1  | 0  | 0       | 1  | 0  | 1  | 1  |
| 1      | 0  | 1  | 1  | 0       | 1  | 1  | 0  | 0  |
| 1      | 1  | 0  | 0  | 0       | 1  | 1  | 0  | 1  |
| 1      | 1  | 0  | 1  | 0       | 1  | 1  | 1  | 0  |
| 1      | 1  | 1  | 0  | 0       | 1  | 1  | 1  | 1  |
| 1      | 1  | 1  | 1  | 1       | 0  | 0  | 0  | 0  |

# Incrementer

- Alternative incrementer design
  - Could use N-bit adder with one of the inputs set to 1
  - Use half-adders (adds two bits) rather than full-adders (adds three bits)
  - Slower but simpler



# Comparators

- **N-bit equality comparator:** Outputs 1 if two N-bit numbers are equal
- Example: 4-bit equality comparator with inputs A and B
  - Approach 1: combinational design procedure
  - Approach 2: recall functionality of XOR and XNOR

$0110 = 0111 ?$



# Magnitude Comparator

- ***N-bit magnitude comparator:***

- Indicates whether  $A > B$ ,  $A = B$ , or  $A < B$ ,  
for its two  $N$ -bit inputs  $A$  and  $B$
- Design approach 1: combinational  
design procedure
- Design approach 2: Consider how  
compare by hand.

$$A=1011 \quad B=1001$$

1011      1001 Equal

1011      1001 Equal

1011      1001 Unequal

So  $A > B$

# Magnitude Comparator

- By-hand example leads to idea for design
  - Start at left, compare each bit pair, pass results to the right
  - Each stage has 3 inputs indicating results of higher stage, passes results to lower stage



(a)



(b)

# Magnitude Comparator



- Each stage:
  - $out\_gt = in\_gt + (in\_eq * a * b')$ 
    - A>B (so far) if already determined in higher stage, or if higher stages equal but in this stage  $a=1$  and  $b=0$
  - $out\_lt = in\_lt + (in\_eq * a' * b)$ 
    - A<B (so far) if already determined in higher stage, or if higher stages equal but in this stage  $a=0$  and  $b=1$
  - $out\_eq = in\_eq * (a \text{ XNOR } b)$ 
    - A=B (so far) if already determined in higher stage and in this stage  $a=b$  too
  - Simple circuit inside each stage, just a few gates (not shown)

# Magnitude Comparator

- How does it work?

$$1011 = 1001 ?$$

*Ieq=1 causes this stage to compare*



# Magnitude Comparator

$1011 = 1001 ?$



(c)



(d)

- Final answer appears on the right
- Takes time for answer to “ripple” from left to right
- Thus called “carry-ripple style” even though there’s no “carry” involved

# Magnitude Comparator Example: Minimum of Two Numbers

- Design a combinational component that finds the minimum of two 8-bit numbers
  - Solution: Use 8-bit magnitude comparator and 8-bit 2x1 mux
    - If  $A < B$ , pass A through mux. Else, pass B.



What if inputs are 2's complement???

# Multiplier

- Can build multiplier that mimics multiplication by hand
  - Notice that multiplying multiplicand by 1 is same as ANDing with 1

|          |                                                                                    |
|----------|------------------------------------------------------------------------------------|
| 0110     | (the top number is called the <i>multiplicand</i> )                                |
| 0011     | (the bottom number is called the <i>multiplier</i> )                               |
| ----     | (each row below is called a <i>partial product</i> )                               |
| 0110     | (because the rightmost bit of the multiplier is 1, and $0110 * 1 = 0110$ )         |
| 0110     | (because the second bit of the multiplier is 1, and $0110 * 1 = 0110$ )            |
| 0000     | (because the third bit of the multiplier is 0, and $0110 * 0 = 0000$ )             |
| +0000    | (because the leftmost bit of the multiplier is 0, and $0110 * 0 = 0000$ )          |
| -----    |                                                                                    |
| 00010010 | (the <i>product</i> is the sum of all the partial products: 18, which is $6 * 3$ ) |

# Multiplier – Array Style

- Generalized representation of multiplication by hand

$$\begin{array}{r} & a_3 & a_2 & a_1 & a_0 \\ \times & b_3 & b_2 & b_1 & b_0 \\ \hline & b_0a_3 & b_0a_2 & b_0a_1 & b_0a_0 & (pp1) \\ & b_1a_3 & b_1a_2 & b_1a_1 & b_1a_0 & 0 & (pp2) \\ & b_2a_3 & b_2a_2 & b_2a_1 & b_2a_0 & 0 & 0 & (pp3) \\ + & b_3a_3 & b_3a_2 & b_3a_1 & b_3a_0 & 0 & 0 & 0 & (pp4) \\ \hline p_7 & p_6 & p_5 & p_4 & p_3 & p_2 & p_1 & p_0 \end{array}$$

# Multiplier – Array Style

- Multiplier design – array of AND gates



# Smaller Multiplier -- Sequential (Add-and-Shift) Style

- Add-and-Shift
  - Don't compute all partial products simultaneously
  - Rather, compute one at a time (similar to by hand), maintain a *running sum*

|                   | Step 1                                                        | Step 2                                                          | Step 3                                                            | Step 4                                                              |
|-------------------|---------------------------------------------------------------|-----------------------------------------------------------------|-------------------------------------------------------------------|---------------------------------------------------------------------|
| (running sum)     | 0 1 1 0                                                       | 0 1 1 0                                                         | 0 1 1 0                                                           | 0 1 1 0                                                             |
| (partial product) | * 0 0 1 1                                                     | * 0 0 1 1                                                       | * 0 0 1 1                                                         | * 0 0 1 1                                                           |
| (new running sum) | $\begin{array}{r} 0000 \\ + 0110 \\ \hline 00110 \end{array}$ | $\begin{array}{r} 00110 \\ + 0110 \\ \hline 010010 \end{array}$ | $\begin{array}{r} 010010 \\ + 0000 \\ \hline 0010010 \end{array}$ | $\begin{array}{r} 0010010 \\ + 0000 \\ \hline 00010010 \end{array}$ |

# Smaller Multiplier -- Sequential (Add-and-Shift) Style

- Design circuit that computes one partial product at a time, adds to running sum
  - Note that **shifting** running sum right (relative to partial product) after each step ensures partial product added to correct running sum bits



|                                                                            |                                                                                |                                                                             |                                                                                 |                                                         |
|----------------------------------------------------------------------------|--------------------------------------------------------------------------------|-----------------------------------------------------------------------------|---------------------------------------------------------------------------------|---------------------------------------------------------|
| Step 1<br>0 1 1 0<br>+ 0 0 1 <b>1</b><br>0 0 0 0<br>+ 0 1 1 0<br>0 0 1 1 0 | Step 2<br>0 1 1 0<br>+ 0 0 <b>1</b> 1<br>0 0 1 1 0<br>+ 0 1 1 0<br>0 1 0 0 1 0 | Step 3<br>0 1 1 0<br>+ 0 0 1 1<br>0 1 0 0 1 0<br>+ 0 0 0 0<br>0 0 1 0 0 1 0 | Step 4<br>0 1 1 0<br>+ 0 0 1 1<br>0 0 1 0 0 1 0<br>+ 0 0 0 0<br>0 0 0 1 0 0 1 0 | (running sum)<br>(partial product)<br>(new running sum) |
|----------------------------------------------------------------------------|--------------------------------------------------------------------------------|-----------------------------------------------------------------------------|---------------------------------------------------------------------------------|---------------------------------------------------------|

What about carry?

What if 2's complement?

# Smaller Multiplier – Controller Design

