

# Computer Organization and Design

## Arithmetic & Logic Circuits



Henry Fuchs

Slides adapted from Montek Singh, who adapted them  
from Leonard McMillan and from Gary Bishop  
Back to McMillan & Chris Terman, MIT 6.004 1999

Tuesday, March 24, 2015

Lecture 11

# Topics

## \* Arithmetic Circuits

- adder
- subtractor

## \* Logic Circuits

## \* Arithmetic & Logic Unit (ALU)

- ... putting it all together

# Review: 2's Complement



## \* 8-bit 2's complement example:

$$11010110 = -128 + 64 + 16 + 4 + 2 = -42$$

## \* Beauty of 2's complement representation:

- same binary addition procedure will work for adding both signed and unsigned numbers
  - as long as the leftmost carry out is properly handled/ignored

## \* Insert a "binary" point to represent fractions too:

$$1101.0110 = -8 + 4 + 1 + 0.25 + 0.125 = -2.625$$

# Binary Addition

## \* Example of binary addition “by hand”:

Adding two N-bit numbers produces an (N+1)-bit result

$$\begin{array}{r}
 & \text{1 1 0 1} \\
 A: & 1101 \\
 B: + & 0101 \\
 \hline
 & 10010
 \end{array}$$

Carries from previous column

## \* Let's design a circuit to do it!

- Start by building a block that adds one column:
  - called a “full adder” (FA)
- Then cascade them to add two numbers of any size...



# Binary Addition

## \* Extend to arbitrary # of bits

- $n$  bits: cascade  $n$  full adders



- Called “Ripple-Carry Adder”
  - carries ripple through from right to left
  - longest chain of carries has length  $n$ 
    - how long does it take to add the numbers 0 and 1?
    - how long does it take to add the numbers -1 and 1?

# Designing a Full Adder (1-bit adder)

6 / 19

## \* Follow the step-by-step method

1. Start with a truth table:
2. Write down eqns for the “1” outputs:

$$C_o = \overline{C}_i AB + C_i \overline{A}B + C_i A\overline{B} + C_i AB$$

$$S = \overline{C}_i \overline{A}\overline{B} + \overline{C}_i \overline{A}\overline{B} + C_i \overline{A}\overline{B} + C_i AB$$

3. Simplifying a bit (seems hard, but experienced designers are good at this art!)

| $C_i$ | $A$ | $B$ | $C_o$ | $S$ |
|-------|-----|-----|-------|-----|
| 0     | 0   | 0   | 0     | 0   |
| 0     | 0   | 1   | 0     | 1   |
| 0     | 1   | 0   | 0     | 1   |
| 0     | 1   | 1   | 1     | 0   |
| 1     | 0   | 0   | 0     | 1   |
| 1     | 0   | 1   | 1     | 0   |
| 1     | 1   | 0   | 1     | 0   |
| 1     | 1   | 1   | 1     | 1   |

$$C_o = C_i(A + B) + AB = C_i(A \oplus B) + AB$$

$$S = C_i \oplus A \oplus B$$

# For Those Who Prefer Logic Diagrams

7 / 19

\* A little tricky, but only 5 gates/bit!

$$C_o = C_i \cdot (A \oplus B) + A \cdot B$$

$$S = C_i \oplus (A \oplus B)$$



# Subtraction: $A - B = A + (-B)$

8 / 19

\* Subtract B from A = add 2's complement of B to A

- In 2's complement:  $-B = \sim B + 1$

$\sim$  = bit-wise complement



- Let's build an arithmetic unit that does *both* add and sub  
➤ operation selected by *control input*:



# Condition Codes

\* One often wants 4 other bits of information from an arith unit:

- Z (zero): result is = 0  
➤ big NOR gate
- N (negative): result is < 0  
➤  $S_{N-1}$
- C (carry): indicates that most significant position produced a carry, e.g., “ $1 + (-1)$ ”  
➤ Carry from last FA
- V (overflow): indicates answer doesn’t fit  
➤ precisely:

$$V = A_{n-1} B_{n-1} \bar{N} + \bar{A}_{n-1} \bar{B}_{n-1} N$$

-or-

$$V = C_o_{n-1} \oplus C_i_{n-1}$$

To compare A and B, perform A–B and use condition codes:

Signed comparison:

|           |                                             |
|-----------|---------------------------------------------|
| <b>LT</b> | <b><math>N \oplus V</math></b>              |
| <b>LE</b> | <b><math>Z + (N \oplus V)</math></b>        |
| <b>EQ</b> | <b><math>Z</math></b>                       |
| <b>NE</b> | <b><math>\sim Z</math></b>                  |
| <b>GE</b> | <b><math>\sim (N \oplus V)</math></b>       |
| <b>GT</b> | <b><math>\sim (Z + (N \oplus V))</math></b> |

Unsigned comparison:

|            |                                       |
|------------|---------------------------------------|
| <b>LTU</b> | <b><math>\sim C</math></b>            |
| <b>LEU</b> | <b><math>\sim C + Z</math></b>        |
| <b>GEU</b> | <b><math>C</math></b>                 |
| <b>GTU</b> | <b><math>\sim (\sim C + Z)</math></b> |

\* Subtraction is useful also for comparing numbers

\* To compare A and B:

- First compute  $A - B$
- Then check the flags Z, N, C, V

\* Examples:

- LTU (less than for unsigned numbers)
  - $A < B$  is given by  $\sim C$
- LT (less than for signed numbers)
  - $A < B$  is given by  $N \oplus V$
- Others in table

To compare A and B, perform  $A - B$  and use condition codes:

Signed comparison:

|    |                           |
|----|---------------------------|
| LT | $N \oplus V$              |
| LE | $Z + (N \oplus V)$        |
| EQ | $Z$                       |
| NE | $\sim Z$                  |
| GE | $\sim (N \oplus V)$       |
| GT | $\sim (Z + (N \oplus V))$ |

Unsigned comparison:

|     |                     |
|-----|---------------------|
| LTU | $\sim C$            |
| LEU | $\sim C + Z$        |
| GEU | $C$                 |
| GTU | $\sim (\sim C + Z)$ |

# Latency of Ripple-Carry Adder

( $T_{PD}$  = max propagation delay or latency of the entire adder)



Worse-case path: carry propagation from LSB to MSB, e.g., when adding 11...111 to 00...001.

$$t_{PD} = (t_{PD,XOR} + t_{PD,AND} + t_{PD,OR}) + (N-2)*(t_{PD,OR} + t_{PD,AND}) + t_{PD,XOR}$$

$\underbrace{\quad}_{A,B \text{ to CO}}$

$\approx \Theta(N)$

$\underbrace{\quad}_{CI \text{ to CO}} \quad \underbrace{\quad}_{CI_{N-1} \text{ to } S_{N-1}}$



$\Theta(N)$  is read “order N” and tells us that the latency of our adder grows in proportion to the number of bits in the operands.

\* Yes,  
there are many sophisticated designs that are faster

- Carry-Lookahead Adders (CLA)
- Carry-Skip Adders
- Carry-Select Adders

→ See textbook for details

# Adder Summary

13 / 19

\* Adding is not only common, but it is also tends to be one of the most time-critical of operations

- As a result, a wide range of adder architectures have been developed that allow a designer to tradeoff complexity (in terms of the number of gates) for performance.



At this point we'll define a high-level functional unit for an adder, and specify the details of the implementation as necessary.



# Shifting and Logical Operations



# Shifting Logic

## \* Shifting is a common operation

- applied to groups of bits
- used for alignment
- used for “short cut” arithmetic operations
  - $X << 1$  is often the same as  $2*X$
  - $X >> 1$  can be the same as  $X/2$

## \* For example:

- $X = 20_{10} = 00010100_2$
- Left Shift:
  - $(X << 1) = 00101000_2 = 40_{10}$
- Right Shift:
  - $(X >> 1) = 00001010_2 = 10_{10}$
- Signed or “Arithmetic” Right Shift:
  - $(-X >>> 1) = (11101100_2 >>> 1) = 11110110_2 = -10_{10}$



# Boolean Operations

16 / 19

\* It will also be useful to perform logical operations on groups of bits. Which ones?

- ANDing is useful for “masking” off groups of bits.
  - ex.  $10101110 \& 00001111 = 00001110$  (mask selects last 4 bits)
- ANDing is also useful for “clearing” groups of bits.
  - ex.  $10101110 \& 00001111 = 00001110$  (0's clear first 4 bits)
- ORing is useful for “setting” groups of bits.
  - ex.  $10101110 | 00001111 = 10101111$  (1's set last 4 bits)
- XORing is useful for “complementing” groups of bits.
  - ex.  $10101110 ^ 00001111 = 10100001$  (1's invert last 4 bits)
- NORing is useful for.. uhm...
  - ex.  $10101110 \# 00001111 = 01010000$  (0's invert, 1's clear)

# Boolean Unit

17 / 19

\* It is simple to build up a Boolean unit using primitive gates and a mux to select the function.

- Since there is no interconnection between bits, this unit can be simply replicated at each position.
- The cost is about 7 gates per bit. One for each primitive function, and approx 3 for the 4-input mux.



\* This is a straightforward, but not too elegant of a design.

# An ALU, at Last (without comparisons)

18 / 19

\* Combine add/sub, shift and Boolean units



# A Complete ALU

\* With support for comparisons (LT and LTU)

- $\text{LTU} = \sim C$  and  $\text{LT} = N \oplus V$



| 5-bit ALUFN |      |      |      |                       |
|-------------|------|------|------|-----------------------|
| Sub         | Bool | Shft | Math | OP                    |
| 0           | XX   | 0    | 1    | $A+B$                 |
| 1           | XX   | 0    | 1    | $A-B$                 |
| 1           | X0   | 1    | 1    | $A < B$               |
| 1           | X1   | 1    | 1    | $A \text{ LTU } B$    |
| X           | 00   | 1    | 0    | $B \ll A$             |
| X           | 10   | 1    | 0    | $B \gg A$             |
| X           | 11   | 1    | 0    | $B \ggg A$            |
| X           | 00   | 0    | 0    | $A \& B$              |
| X           | 01   | 0    | 0    | $A \mid B$            |
| X           | 10   | 0    | 0    | $A \wedge B$          |
| X           | 11   | 0    | 0    | $A \overline{\mid} B$ |

## \* Circuits for

- Multiplication
- Division
- Floating-Point Operations