


# **CSE477**

# **VLSI Digital Circuits**

# **Fall 2002**

## **Lecture 22: Shifters, Decoders, Muxes**

Mary Jane Irwin ([www.cse.psu.edu/~mji](http://www.cse.psu.edu/~mji))  
[www.cse.psu.edu/~cg477](http://www.cse.psu.edu/~cg477)

[Adapted from Rabaey's *Digital Integrated Circuits*, ©2002, J. Rabaey et al.]

# Review: Basic Building Blocks

- Datapath
  - Execution units
    - Adder, multiplier, divider, **shifter**, etc.
  - Register file and pipeline registers
  - **Multiplexers, decoders**
- Control
  - Finite state machines (PLA, ROM, random logic)
- Interconnect
  - Switches, arbiters, buses
- Memory
  - Caches (SRAMs), TLBs, DRAMs, buffers

# Parallel Programmable Shifters



- Shifters are used in multipliers and floating-point units
- They consume a lot of area if built from random logic gates

# A Programmable Binary Shifter



| $A_i$ | $A_{i-1}$ | rgt | nop | left | $B_i$ | $B_{i-1}$ |
|-------|-----------|-----|-----|------|-------|-----------|
| $A_1$ | $A_0$     | 0   | 1   | 0    | $A_1$ | $A_0$     |
| $A_1$ | $A_0$     | 1   | 0   | 0    | 0     | $A_1$     |
| $A_1$ | $A_0$     | 0   | 0   | 1    | $A_0$ | 0         |
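The truth table above can be read as a behavioral rule for a one-position shifter. The following is a minimal Python sketch of that rule, not the transistor-level circuit from the slide. It assumes the word is held as a list with `a[i]` standing for $A_i$ (index 0 is the least significant bit); the function name `shift_one` is illustrative and does not come from the lecture.

```python
def shift_one(a, rgt=0, nop=0, left=0):
    """Shift the word a by at most one position.

    a[i] models A_i (index 0 is the LSB). Exactly one of the control
    lines rgt, nop, left must be 1, as in the truth table.
    """
    if rgt + nop + left != 1:
        raise ValueError("exactly one of rgt, nop, left must be 1")
    n = len(a)
    if nop:
        # B_i = A_i
        return list(a)
    if rgt:
        # B_{i-1} = A_i, a 0 enters at the top position
        return [a[i + 1] if i + 1 < n else 0 for i in range(n)]
    # left: B_i = A_{i-1}, a 0 enters at the bottom position
    return [a[i - 1] if i >= 1 else 0 for i in range(n)]


if __name__ == "__main__":
    word = ["A0", "A1"]  # symbolic bits, so the routing is visible
    for name in ("nop", "rgt", "left"):
        print(name, shift_one(word, **{name: 1}))
```

Passing symbolic strings instead of 0/1 values makes it easy to compare each result with the corresponding row of the table.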

# 4-bit Barrel Shifter



Example:  $Sh0 = 1$

$$B_3 B_2 B_1 B_0 = A_3 A_2 A_1 A_0$$

$Sh1 = 1$

$$B_3 B_2 B_1 B_0 = A_3 A_3 A_2 A_1$$

$Sh2 = 1$

$$B_3 B_2 B_1 B_0 = A_3 A_3 A_3 A_2$$

$Sh3 = 1$

$$B_3 B_2 B_1 B_0 = A_3 A_3 A_3 A_3$$
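The four cases above follow one pattern: with $Sh_k$ active, $B_i = A_{i+k}$, and the top bit $A_3$ is replicated once $i+k$ runs past the end of the word. A small Python sketch of that behavior is shown here. It assumes a one-hot list `sh` with `sh[k]` standing for $Sh_k$ and `a[i]` standing for $A_i$; the name `barrel_shift` is illustrative only, and the model says nothing about the pass-transistor array itself.

```python
def barrel_shift(a, sh):
    """Behavioral model of the barrel shifter example.

    a[i] models A_i (index 0 is the LSB) and sh[k] models Sh_k.
    Exactly one Sh line may be active. Returns b with b[i] = B_i.
    """
    if len(sh) != len(a) or sum(sh) != 1:
        raise ValueError("sh must be one-hot and as wide as a")
    n = len(a)
    k = sh.index(1)  # shift distance selected by the active line
    # B_i = A_{i+k}; the top bit A_{n-1} is replicated past the end
    return [a[min(i + k, n - 1)] for i in range(n)]


if __name__ == "__main__":
    word = ["A0", "A1", "A2", "A3"]
    for k in range(4):
        sh = [1 if j == k else 0 for j in range(4)]
        b = barrel_shift(word, sh)
        # Print MSB first to match the B3 B2 B1 B0 ordering on the slide
        print(f"Sh{k} = 1:", " ".join(reversed(b)))
```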

Area dominated by wiring

# 4-bit Barrel Shifter Layout



Only one Sh# line is active at a time.

$$\text{Width}_{\text{barrel}} \sim 2 p_m N$$

$N$ = maximum shift distance, $p_m$ = metal pitch

Delay $\sim$ 1 FET + $N$ diffusion capacitances

# 8-bit Logarithmic Shifter



# 8-bit Logarithmic Shifter Layout Slice



$$\text{Width}_{\log} \sim p_m(2K + (1+2+\dots+2^{K-1})) = p_m(2^K + 2K - 1)$$
$$K = \log_2 N$$

Delay $\sim K$ FETs + 2 diffusion capacitances
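The $K = \log_2 N$ stages come from decomposing the shift amount into its binary digits: stage $k$ either passes the word unchanged or shifts it by $2^k$. The Python sketch below models that idea behaviorally. It assumes a right shift with zero fill and a word length that is a power of two; both are assumptions made for illustration, since the figure is not reproduced here, and the name `log_shift_right` is hypothetical.

```python
def log_shift_right(a, shift):
    """Behavioral model of a logarithmic shifter.

    a[i] models A_i (index 0 is the LSB). The word passes through
    K = log2(len(a)) stages; stage k shifts by 2**k when bit k of the
    shift amount is 1. Zero fill is an assumption of this sketch.
    """
    n = len(a)
    if n < 2 or n & (n - 1):
        raise ValueError("word length must be a power of two")
    if not 0 <= shift < n:
        raise ValueError("shift amount out of range")
    k_stages = n.bit_length() - 1  # K = log2 N
    b = list(a)
    for k in range(k_stages):
        if (shift >> k) & 1:
            step = 1 << k
            b = [b[i + step] if i + step < n else 0 for i in range(n)]
    return b


if __name__ == "__main__":
    word = [f"A{i}" for i in range(8)]
    print(log_shift_right(word, 5))  # uses the shift-by-1 and shift-by-4 stages
```

Any shift from 0 to 7 passes through the same three stages, which is why the delay grows with $K$ rather than with $N$.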

# Shifter Implementation Comparisons

| N  | K | Barrel width ($2 N p_m$) | Barrel speed ($1 + N$ diffs) | Logarithmic width ($p_m(2^K+2K-1)$) | Logarithmic speed ($K + 2$ diffs) |
|----|---|--------------------------|------------------------------|-------------------------------------|-----------------------------------|
| 8  | 3 | $16 p_m$                 | $1 + 8$                      | $13 p_m$                            | $3 + 2$                           |
| 16 | 4 | $32 p_m$                 | $1 + 16$                     | $23 p_m$                            | $4 + 2$                           |
| 32 | 5 | $64 p_m$                 | $1 + 32$                     | $41 p_m$                            | $5 + 2$                           |
| 64 | 6 | $128 p_m$                | $1 + 64$                     | $75 p_m$                            | $6 + 2$                           |
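The table entries follow directly from the width and delay estimates given for the two layouts. As a quick check, the Python sketch below recomputes them, with widths in units of the metal pitch $p_m$ and delay expressed as a count of series FETs plus diffusion capacitances. The function names are illustrative, and the numbers are the lecture's first-order estimates, not simulation results.

```python
def barrel_cost(n):
    """First-order cost of an N-position barrel shifter."""
    return {"width_pm": 2 * n, "fets": 1, "diff_caps": n}


def log_cost(n):
    """First-order cost of a logarithmic shifter, N a power of two."""
    if n < 2 or n & (n - 1):
        raise ValueError("N must be a power of two")
    k = n.bit_length() - 1  # K = log2 N
    return {"width_pm": 2**k + 2 * k - 1, "fets": k, "diff_caps": 2}


if __name__ == "__main__":
    print(" N  barrel width  log width")
    for n in (8, 16, 32, 64):
        print(f"{n:2d}  {barrel_cost(n)['width_pm']:12d}  {log_cost(n)['width_pm']:9d}")
```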

# Decoders

- Decodes inputs to activate one of many outputs



- A 2-to-4 decoder needs two inverters, four 2-input NAND gates, and four inverters, plus enable logic
- How about a 3-to-8, 4-to-16, etc. decoder?
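The gate count in the list above can be mirrored gate for gate in a short Python model of a 2-to-4 decoder. This is a sketch under assumptions: the helper names are illustrative, and the enable is modeled as a simple AND on each output because the lecture does not spell out where the enable logic sits.

```python
def inv(x):
    return 1 - x


def nand(*xs):
    return 0 if all(xs) else 1


def decode_2to4(a1, a0, enable=1):
    """Gate-level model of a 2-to-4 decoder; out[i] is 1 when i == 2*a1 + a0."""
    na1, na0 = inv(a1), inv(a0)              # two input inverters
    rows = [nand(na1, na0), nand(na1, a0),   # four 2-input NANDs
            nand(a1, na0), nand(a1, a0)]     # (active-low row selects)
    # Four output inverters; the AND with enable is an assumption of this sketch
    return [inv(r) & enable for r in rows]


if __name__ == "__main__":
    for a1 in (0, 1):
        for a0 in (0, 1):
            print(a1, a0, decode_2to4(a1, a0))
```

Extending the same flat structure to $n$ inputs needs $2^n$ NAND gates with $n$ inputs each, which is one motivation for asking how larger decoders should be built.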

# Dynamic NOR Decoder



# Dynamic NAND Decoder



# Building Big Decoders from Small
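One common way to compose decoders, sketched here in Python, is to let a first 2-to-4 decoder drive the enable inputs of four more 2-to-4 decoders, giving a 4-to-16 decoder. This arrangement is an assumption for illustration, since the figure for this slide is not reproduced, and the function names are hypothetical.

```python
def decode_2to4(a1, a0, enable=1):
    """Behavioral 2-to-4 decoder with enable."""
    sel = 2 * a1 + a0
    return [enable if i == sel else 0 for i in range(4)]


def decode_4to16(a3, a2, a1, a0, enable=1):
    """4-to-16 decoder built from five 2-to-4 decoders.

    The upper address bits pick which of four second-level decoders
    is enabled; the lower bits pick the output within that decoder.
    """
    group_enable = decode_2to4(a3, a2, enable)
    out = []
    for g in group_enable:
        out.extend(decode_2to4(a1, a0, g))
    return out


if __name__ == "__main__":
    print(decode_4to16(1, 0, 1, 1))  # address 0b1011
```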



# Multiplexers

- Selects one of several inputs to gate to the single output



- A 4x1 mux needs two inverters, four 3-input NANDs, and one 4-input NAND
- How about an 8x1, 16x1, etc. mux?
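The gate list above corresponds to a two-level NAND-NAND 4x1 multiplexer. The Python sketch below models it gate for gate. It assumes the data inputs are indexed so that `d[2*s1 + s0]` is selected; the helper names are illustrative and not from the lecture.

```python
def nand(*xs):
    return 0 if all(xs) else 1


def mux4(d, s1, s0):
    """Gate-level 4x1 mux: two inverters, four 3-input NANDs, one 4-input NAND."""
    ns1, ns0 = 1 - s1, 1 - s0    # two select inverters
    t0 = nand(d[0], ns1, ns0)    # each 3-input NAND passes !d[i] when selected
    t1 = nand(d[1], ns1, s0)     # and outputs 1 otherwise
    t2 = nand(d[2], s1, ns0)
    t3 = nand(d[3], s1, s0)
    return nand(t0, t1, t2, t3)  # 4-input NAND restores the selected input


if __name__ == "__main__":
    data = [0, 1, 1, 0]
    for s1 in (0, 1):
        for s0 in (0, 1):
            print(s1, s0, mux4(data, s1, s0))
```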

# Review: TG 2x1 Multiplexer



$$F = !((\text{In}_1 \& S) \mid (\text{In}_2 \& !S))$$
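Evaluating the expression for both values of $S$ shows that the output is the complement of the selected input: $F = !\text{In}_1$ when $S = 1$ and $F = !\text{In}_2$ when $S = 0$. The short Python sketch below tabulates the expression exactly as written; the function name `tg_mux2` is illustrative.

```python
def tg_mux2(in1, in2, s):
    """F = !((In1 & S) | (In2 & !S)) on 0/1 values."""
    return 1 - ((in1 & s) | (in2 & (1 - s)))


if __name__ == "__main__":
    print("S In1 In2 F")
    for s in (0, 1):
        for in1 in (0, 1):
            for in2 in (0, 1):
                print(s, in1, in2, tg_mux2(in1, in2, s))
```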




# Building Big Muxes from Small
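A larger multiplexer can be composed as a binary tree of 2x1 muxes, with one select bit per tree level. The Python sketch below shows that composition. The tree arrangement is an assumption for illustration, since the figure for this slide is not reproduced, and the names `mux2` and `mux_tree` are hypothetical.

```python
def mux2(in0, in1, s):
    """Behavioral (non-inverting) 2x1 mux."""
    return in1 if s else in0


def mux_tree(data, select_bits):
    """Select data[index] using a tree of 2x1 muxes.

    select_bits[0] is the least significant select bit and drives the
    first tree level; len(data) must equal 2**len(select_bits).
    """
    if len(data) != 1 << len(select_bits):
        raise ValueError("need 2**k data inputs for k select bits")
    level = list(data)
    for s in select_bits:
        # Each level halves the number of candidate inputs
        level = [mux2(level[i], level[i + 1], s) for i in range(0, len(level), 2)]
    return level[0]


if __name__ == "__main__":
    inputs = [f"In{i}" for i in range(8)]
    print(mux_tree(inputs, [1, 0, 1]))  # select value 0b101
```

An 8x1 mux built this way uses seven 2x1 muxes, and the signal passes through three of them.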



# Review: Datapath Bit-Sliced Organization



Tile identical bit-slice elements

# Layout of Bit-Sliced Datapaths



# Layout of Bit-Sliced Datapaths

**Without feedthroughs or pitch matching ($4.2\mu\text{m}^2$)**



**With feedthroughs ($3.2\mu\text{m}^2$)**



**With feedthroughs and pitch matching ($2.2\mu\text{m}^2$)**



# Alpha 21264 Integer Unit Datapath

- △ bus driver
- tristate bus driver



# Next Lecture and Reminders

## Next lecture

- Semiconductor memories
  - Reading assignment – Rabaey et al., 12.1-12.2.1

## Reminders

- Project final reports due December 5<sup>th</sup>
- HW5 (last one!) due November 19<sup>th</sup>
- Final grading negotiations/correction (except for the final exam) must be concluded by December 10<sup>th</sup>
- Final exam scheduled
  - Monday, December 16<sup>th</sup> from 10:10 to noon in 118 and 121 Thomas