

# **CSE477**

# **VLSI Digital Circuits**

# **Fall 2002**

## **Lecture 15&16: Dynamic CMOS**

Mary Jane Irwin ( [www.cse.psu.edu/~mji](http://www.cse.psu.edu/~mji) )  
[www.cse.psu.edu/~cg477](http://www.cse.psu.edu/~cg477)

[Adapted from Rabaey's *Digital Integrated Circuits*, ©2002, J. Rabaey et al.]

# Review: Designing Fast CMOS Gates

- ❑ Transistor sizing
- ❑ Progressive transistor sizing
  - ❑ FET closest to the output is the smallest of the series FETs
- ❑ Transistor ordering
  - ❑ put latest arriving signal closest to the output
- ❑ Logic structure reordering
  - ❑ replace large fan-in gates with smaller fan-in gate network
- ❑ Apply “logical effort”
- ❑ Buffer (inverter) insertion
  - ❑ separate large fan-in from large  $C_L$  with buffers
  - ❑ use buffers so there are no more than four TGs in series

# Review: Energy & Power Equations

$$E = C_L V_{DD}^2 P_{0 \rightarrow 1} + t_{sc} V_{DD} I_{peak} P_{0 \rightarrow 1} + V_{DD} I_{leakage}$$

$$f_{0 \rightarrow 1} = P_{0 \rightarrow 1} * f_{clock}$$

$$P = C_L V_{DD}^2 f_{0 \rightarrow 1} + t_{sc} V_{DD} I_{peak} f_{0 \rightarrow 1} + V_{DD} I_{leakage}$$

Dynamic power (~90% today and decreasing relatively)

Short-circuit power (~8% today and decreasing absolutely)

Leakage power (~2% today and increasing)
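
As a rough numerical sketch of the power equation above, the following Python snippet evaluates the three terms separately. The function name and every operating-point value (load capacitance, clock frequency, short-circuit time, currents) are hypothetical and chosen only for illustration; they are not taken from the lecture.

```python
def power_components(c_load, v_dd, f_clock, p_01, t_sc, i_peak, i_leak):
    """Return (dynamic, short_circuit, leakage) power in watts."""
    f_01 = p_01 * f_clock                  # f_0->1 = P_0->1 * f_clock
    p_dyn = c_load * v_dd ** 2 * f_01      # C_L * V_DD^2 * f_0->1
    p_sc = t_sc * v_dd * i_peak * f_01     # t_sc * V_DD * I_peak * f_0->1
    p_leak = v_dd * i_leak                 # V_DD * I_leakage
    return p_dyn, p_sc, p_leak


# Hypothetical operating point (illustrative values only).
dyn, sc, leak = power_components(
    c_load=50e-15, v_dd=2.5, f_clock=100e6, p_01=0.25,
    t_sc=100e-12, i_peak=100e-6, i_leak=1e-9,
)
total = dyn + sc + leak
for name, value in (("dynamic", dyn), ("short-circuit", sc), ("leakage", leak)):
    print(f"{name:>13}: {value:.3e} W ({100 * value / total:.1f}% of total)")
```

Only the dynamic and short-circuit terms scale with $f_{0 \rightarrow 1}$; the leakage term is paid whether or not the gate switches.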

# Review: Power and Energy Design Space

| Energy  | Constant Throughput/Latency: Design Time                     | Constant Throughput/Latency: Non-active Modules       | Variable Throughput/Latency: Run Time             |
|---------|---------------------------------------------------------------|--------------------------------------------------------|---------------------------------------------------|
| Active  | Logic Design<br>Sizing<br>Reduced $V_{dd}$<br>Multi-$V_{dd}$ | Clock Gating                                           | DFS, DVS<br>(Dynamic Frequency, Voltage Scaling) |
| Leakage | + Multi-$V_T$                                                 | Sleep Transistors<br>Variable $V_T$<br>Multi-$V_{dd}$ | + Variable $V_T$                                  |

---

In addition, the Eclipse Group's engineers were finding plenty of bugs in the logic of their design. ... So Rasala's schedules slipped and slipped, and slipped again. "The way to stay on schedule," he said, "is to make another one."

*The Soul of a New Machine*, Kidder, pg. 246

# Dynamic CMOS

---

- ❑ In **static** circuits at every point in time (except when switching) the output is connected to either GND or  $V_{DD}$  via a low resistance path.
  - fan-in of N requires  $2N$  devices
  
- ❑ **Dynamic** circuits rely on the temporary storage of signal values on the capacitance of high impedance nodes.
  - requires only  $N + 2$  transistors
  - takes a sequence of **precharge** and conditional **evaluation** phases to realize logic functions
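
The device counts quoted above can be tabulated with a short Python sketch. The function names are illustrative, not from the lecture; the formulas are the $2N$ and $N + 2$ counts stated in the bullets.

```python
def static_cmos_transistors(fan_in):
    """Static complementary CMOS: N pull-up plus N pull-down devices."""
    return 2 * fan_in


def dynamic_transistors(fan_in):
    """Dynamic gate: N pull-down devices plus precharge and evaluate devices."""
    return fan_in + 2


for n in (2, 4, 8):
    print(f"fan-in {n}: static = {static_cmos_transistors(n)}, "
          f"dynamic = {dynamic_transistors(n)}")
```

The saving grows with fan-in: at a fan-in of 2 the two styles tie at four devices, while a 4-input gate needs 8 devices in static CMOS and 6 as a dynamic gate.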

# Dynamic Gate



Two phase operation

Precharge (CLK = 0)

Evaluate (CLK = 1)

# Conditions on Output

- ❑ Once the output of a dynamic gate is discharged, it cannot be charged again until the next precharge operation.
- ❑ Inputs to the gate can make **at most** one transition during evaluation.
- ❑ Output can be in the high impedance state during and after evaluation (PDN off); the state is then stored on  $C_L$
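
A minimal behavioral sketch of these conditions, in Python. It models only the logic-level behavior (precharge to 1, conditional discharge, no recharge until the next precharge) and ignores all electrical effects such as leakage and charge sharing. The class name `DynamicGate` and its methods are hypothetical.

```python
class DynamicGate:
    """Logic-level model of an n-type dynamic gate (no electrical effects)."""

    def __init__(self, pdn):
        self.pdn = pdn   # function of the inputs: truthy when the PDN conducts
        self.out = 1     # treat the output node as precharged initially

    def precharge(self):
        """CLK = 0: the output node is charged to V_DD."""
        self.out = 1

    def evaluate(self, *inputs):
        """CLK = 1: the output discharges only if the PDN conducts."""
        if self.pdn(*inputs):
            self.out = 0   # once discharged, it stays low until precharge
        return self.out


nand2 = DynamicGate(lambda a, b: a and b)   # PDN of a 2-input NAND
nand2.precharge()
first = nand2.evaluate(1, 1)    # PDN on: output discharges
second = nand2.evaluate(0, 1)   # PDN off again, but output cannot recharge
nand2.precharge()
third = nand2.evaluate(0, 1)    # PDN off: output holds the precharged value
print(first, second, third)
```

The second call shows why an input glitch during evaluation is fatal: a momentary high input discharges the node, and nothing restores it before the next precharge.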

# Properties of Dynamic Gates

- ❑ Logic function is implemented by the PDN only
  - ❑ number of transistors is  $N + 2$  (versus  $2N$  for static complementary CMOS)
  - ❑ should be smaller in area than static complementary CMOS
- ❑ Full swing outputs ( $V_{OL} = GND$  and  $V_{OH} = V_{DD}$ )
- ❑ Nonratioed - sizing of the devices is not important for proper functioning (only for performance)
- ❑ Faster switching speeds
  - ❑ reduced load capacitance due to lower number of transistors per gate ( $C_{int}$ ) so a reduced logical effort
  - ❑ reduced load capacitance due to smaller fan-out ( $C_{ext}$ )
  - ❑ no  $I_{sc}$ , so all the current provided by PDN goes into discharging  $C_L$
  - ❑ Ignoring the influence of precharge time on the switching speed of the gate,  $t_{pLH} = 0$  but the presence of the evaluation transistor slows down the  $t_{pHL}$

# Properties of Dynamic Gates, cont'd

- ❑ Power dissipation should be better
  - ❑ consumes only dynamic power – no short circuit power consumption since the pull-up path is not on when evaluating
  - ❑ lower  $C_L$ - both  $C_{int}$  (since there are fewer transistors connected to the drain output) and  $C_{ext}$  (since the output load is one transistor per connected gate, not two)
  - ❑ by construction can have at most one transition per cycle – **no glitching**
- ❑ But power dissipation can be significantly **higher** due to
  - ❑ higher transition probabilities
  - ❑ extra load on CLK
- ❑ PDN starts to work as soon as the input signals exceed  $V_{Tn}$ , so set  $V_M$ ,  $V_{IH}$  and  $V_{IL}$  all equal to  $V_{Tn}$ 
  - ❑ low noise margin ( $NM_L$ )
- ❑ Needs a precharge clock

# Dynamic Behavior



| #Trns | $V_{OH}$ | $V_{OL}$ | $V_M$    | $NM_H$         | $NM_L$   | $t_{pHL}$ | $t_{pLH}$ | $t_p$ |
|-------|----------|----------|----------|----------------|----------|-----------|-----------|-------|
| 6     | 2.5V     | 0V       | $V_{Tn}$ | $2.5 - V_{Tn}$ | $V_{Tn}$ | 110ps     | 0ns       | 83ps  |

# Gate Parameters are Time Dependent

- ❑ The amount by which the output voltage drops is a strong function of the input voltage and the **available evaluation time**.
  - ❑ Noise needed to corrupt the signal has to be larger if the evaluation time is short – i.e., the switching threshold is truly time dependent.



# Power Consumption of Dynamic Gate



Power only dissipated when previous Out = 0

# Dynamic Power Consumption is Data Dependent

Dynamic 2-input NOR Gate

| A | B | Out |
|---|---|-----|
| 0 | 0 | 1   |
| 0 | 1 | 0   |
| 1 | 0 | 0   |
| 1 | 1 | 0   |

Assume signal probabilities

$$P_{A=1} = 1/2$$

$$P_{B=1} = 1/2$$

Then transition probability

$$P_{0 \rightarrow 1} = P_{\text{out}=0} \times P_{\text{out}=1}$$

$$= 3/4 \times 1 = 3/4$$

Switching activity can be **higher** in dynamic gates!

$$P_{0 \rightarrow 1} = P_{\text{out}=0}$$
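
The comparison can be checked by enumerating the truth table. The sketch below assumes independent, equiprobable inputs as stated above, and uses the usual static-gate expression $P_{0 \rightarrow 1} = P_{\text{out}=0} \times P_{\text{out}=1}$ (with $P_{\text{out}=1} = 1 - P_{\text{out}=0}$) for contrast; the function names are illustrative.

```python
from fractions import Fraction
from itertools import product


def output_zero_probability(func, n_inputs, p_one=Fraction(1, 2)):
    """Probability that func(...) == 0 for independent inputs with P(1) = p_one."""
    p_zero = Fraction(0)
    for bits in product((0, 1), repeat=n_inputs):
        p_pattern = Fraction(1)
        for bit in bits:
            p_pattern *= p_one if bit else 1 - p_one
        if func(*bits) == 0:
            p_zero += p_pattern
    return p_zero


def nor2(a, b):
    return int(not (a or b))


p_out0 = output_zero_probability(nor2, 2)
static_p01 = p_out0 * (1 - p_out0)   # static gate: must be 0 now and 1 next
dynamic_p01 = p_out0                 # dynamic gate: precharge always restores 1
print(f"P(out=0) = {p_out0}, static P01 = {static_p01}, dynamic P01 = {dynamic_p01}")
```

For the 2-input NOR this gives 3/16 for the static gate against 3/4 for the dynamic gate, a fourfold increase in switching activity.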

# Issues in Dynamic Design 1: Charge Leakage



Minimum clock rate of a few kHz
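
A back-of-the-envelope sketch of where a minimum clock rate comes from, in Python. It assumes a constant leakage current discharging $C_L$ and an evaluation phase lasting half a clock period; both are simplifications. The capacitance, tolerated droop, and leakage current are hypothetical values, not figures from the lecture.

```python
def max_hold_time(c_load, delta_v, i_leak):
    """Time (s) for a constant leakage current to drop the node by delta_v."""
    return c_load * delta_v / i_leak


def min_clock_frequency(t_hold):
    """Lowest clock rate (Hz), assuming evaluation lasts half a clock period."""
    return 1.0 / (2.0 * t_hold)


# Hypothetical values: 10 fF node, 1.25 V tolerated droop, 50 pA leakage.
t_hold = max_hold_time(c_load=10e-15, delta_v=1.25, i_leak=50e-12)
f_min = min_clock_frequency(t_hold)
print(f"hold time = {t_hold * 1e6:.1f} us, minimum clock = {f_min / 1e3:.2f} kHz")
```

With these assumed numbers the result lands in the low-kHz range, the same order of magnitude as the figure quoted above.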

# Impact of Charge Leakage

- ❑ Output settles to an intermediate voltage determined by a resistive divider of the pull-up and pull-down networks
  - ❑ Once the output drops below the switching threshold of the fan-out logic gate, the output is interpreted as a low voltage.



# A Solution to Charge Leakage

- ❑ **Keeper** compensates for the charge lost due to the pull-down leakage paths.



Same approach as level restorer for pass transistor logic

# Issues in Dynamic Design 2: Charge Sharing



Charge stored originally on  $C_L$  is redistributed (shared) over  $C_L$  and  $C_a$,  leading to static power consumption by downstream gates and possible circuit malfunction.

The output changes by  $\Delta V_{out} = -V_{DD} (C_a / (C_a + C_L))$.  When this drop is large enough to pull  $V_{out}$  below the switching threshold of the gate it drives, the circuit malfunctions.
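
The expression for $\Delta V_{out}$ follows from charge conservation. As a sketch, assume $C_a$ starts at 0V and the two nodes fully equalize at a common voltage $V_{final}$ (a simplification that ignores any threshold drop across the connecting transistor; $V_{final}$ is a symbol introduced here for illustration):

$$\begin{aligned} C_L V_{DD} &= (C_L + C_a) V_{final} \\ V_{final} &= V_{DD} \left( \frac{C_L}{C_L + C_a} \right) \\ \Delta V_{out} &= V_{final} - V_{DD} = -V_{DD} \left( \frac{C_a}{C_a + C_L} \right) \end{aligned}$$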

# Charge Sharing Example

What is the worst case voltage drop on  $y$ ? (Assume all inputs are low during precharge and that all internal nodes are initially at 0V.)



$$\begin{aligned}\Delta V_{out} &= -V_{DD} \left( \frac{(C_a + C_c)}{(C_a + C_c) + C_y} \right) \\ &= -2.5V * (30/(30+50)) = -0.94V\end{aligned}$$
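
The same arithmetic as a small Python sketch. The function name is illustrative; the values 30 (for $C_a + C_c$) and 50 (for $C_y$) are the ones used in the equation above, and only their ratio matters.

```python
def charge_sharing_drop(v_dd, c_shared, c_out):
    """Worst-case change in the output voltage when c_out shares charge with c_shared."""
    return -v_dd * c_shared / (c_shared + c_out)


drop = charge_sharing_drop(v_dd=2.5, c_shared=30, c_out=50)
print(f"delta V_out = {drop:.2f} V")
```

The drop scales with the fraction of total capacitance that sits on the internal nodes, so a larger $C_y$ or smaller internal capacitances reduce it.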

# Solution to Charge Redistribution



Precharge internal nodes using a clock-driven transistor (at the cost of increased area and power)

# Issues in Dynamic Design 3: Backgate Coupling

- ❑ Susceptible to crosstalk due to 1) high impedance of the output node and 2) capacitive coupling
  - ❑ Out2 capacitively couples with Out1 through the gate-source and gate-drain capacitances of M4



# Backgate Coupling Effect

- ❑ Capacitive coupling means Out1 drops significantly so Out2 doesn't go all the way to ground



# Issues in Dynamic Design 4: Clock Feedthrough

- A special case of capacitive coupling between the clock input of the precharge transistor and the dynamic output node



Coupling between Out and the CLK input of the precharge device is due to the gate-drain capacitance, so the voltage of Out can rise above  $V_{DD}$ . The fast rising (and falling) edges of the clock **couple** to Out.

# Clock Feedthrough



# Cascading Dynamic Gates



Only a single  $0 \rightarrow 1$  transition allowed at the inputs during the evaluation period!

# Domino Logic



# Why Domino?



Like falling dominos!
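
The falling-domino picture can be sketched at the logic level in Python. The model below is a hypothetical chain in which each stage is a dynamic gate followed by a static inverter, and stage *i* discharges when the previous stage output and its own enable input are both high. It ignores timing and all electrical effects.

```python
def evaluate_domino_chain(first_input, enables):
    """Evaluate a chain of domino stages; return (stage outputs, firing order)."""
    dyn_nodes = [1] * len(enables)   # dynamic nodes are high after precharge
    outputs = [0] * len(enables)     # so every inverter output starts low
    order = []                       # stages whose output rose, in sequence
    prev = first_input
    for i, enable in enumerate(enables):
        if prev and enable:          # PDN of stage i conducts
            dyn_nodes[i] = 0         # dynamic node discharges...
            outputs[i] = 1           # ...and the inverter output rises 0 -> 1
            order.append(i)
        prev = outputs[i]            # only a 0 -> 1 edge can reach the next stage
    return outputs, order


all_fall = evaluate_domino_chain(1, [1, 1, 1])
blocked = evaluate_domino_chain(1, [1, 0, 1])
print(all_fall)
print(blocked)
```

Because every stage output starts at 0 and can only rise, each stage sees at most a single $0 \rightarrow 1$ input transition during evaluation. A stage that does not fire leaves all later stages in their precharged state.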

# Domino Manchester Carry Chain



# Domino Comparator



# Properties of Domino Logic

- ❑ Only non-inverting logic can be implemented; fixes include
  - can reorganize the logic using Boolean transformations
  - use differential logic (dual rail)
  - use np-CMOS (zipper)
- ❑ Very high speed
  - $t_{p_{HL}} = 0$
  - static inverter can be optimized to match fan-out (separation of fan-in and fan-out capacitances)

# Differential (Dual Rail) Domino



Due to its high performance, differential domino is very popular and is used in several commercial microprocessors!

# np-CMOS (Zipper)



Only  $0 \rightarrow 1$  transitions allowed at inputs of PDN  
Only  $1 \rightarrow 0$  transitions allowed at inputs of PUN

# np-CMOS Adder Circuit



# DCVS Logic



PDN1 and PDN2 are mutually exclusive
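
One way to sanity-check a pair of pull-down networks is to enumerate every input pattern and confirm that exactly one of them conducts. The Python sketch below does this for a hypothetical AND/NAND pair; it is not the circuit from the lecture figure, and the function names are illustrative.

```python
from itertools import product


def exactly_one_conducts(pdn1, pdn2, n_inputs):
    """True if, for every input pattern, exactly one of the two PDNs is on."""
    return all(
        bool(pdn1(*bits)) != bool(pdn2(*bits))
        for bits in product((0, 1), repeat=n_inputs)
    )


def pdn_and(a, b):
    """Series network: conducts when a AND b."""
    return a and b


def pdn_nand(a, b):
    """Parallel network on complemented inputs: conducts when NOT a OR NOT b."""
    return (not a) or (not b)


ok = exactly_one_conducts(pdn_and, pdn_nand, 2)
bad = exactly_one_conducts(pdn_and, lambda a, b: a or b, 2)
print(ok, bad)
```

The second check fails because AND and OR networks both conduct when both inputs are high, which would pull both outputs low at once.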

# DCVSL Example



# How to Choose a Logic Style

- Must consider ease of design, robustness (noise immunity), area, speed, power, system clocking requirements, fan-out, functionality, ease of testing

4-input NAND

| Style       | # Trans | Ease | Ratioed? | Delay | Power   |
|-------------|---------|------|----------|-------|---------|
| Comp Static | 8       | 1    | no       | 3     | 1       |
| CPL *       | 12 + 2  | 2    | no       | 4     | 3       |
| domino      | 6 + 2   | 4    | no       | 2     | 2 + clk |
| DCVSL *     | 10      | 3    | yes      | 1     | 4       |

- Current trend is towards an increased use of complementary static CMOS: design support through DA tools, robust, more amenable to voltage scaling.

# Next Lecture and Reminders

---

## ❑ Next lecture

- ❑ Timing metrics, static sequential circuits
  - Reading assignment – Rabaey, et al, 7.1-7.2

## ❑ Reminders

- ❑ Project prototypes due October 29<sup>th</sup>
- ❑ HW4 due October 31<sup>st</sup>
- ❑ Final exam scheduled
  - Monday, December 16<sup>th</sup> from 10:10 to noon in TBD