

# EE476 Sample Final Exam Questions

**Duration: Up to 2 hours**

Unless otherwise stated, the following default parameters apply:

| Parameter                       | Value                         |
|---------------------------------|-------------------------------|
| $V_{dd}$                        | 1.2 V                         |
| $\epsilon_r$ (silicon dioxide)  | 3.8                           |
| $\epsilon_0$                    | 8.85 × 10<sup>-12</sup> F/m   |
| $\lambda$                       | 0                             |
| $T_{ox}$                        | 100 Å                         |
| $\mu_n$                         | 500 cm<sup>2</sup>/(V·s)      |
| $\mu_p$                         | 250 cm<sup>2</sup>/(V·s)      |
| $V_{thn} = V_{thp}$             | 0.25 V                        |
| $W_n$                           | 1 μm                          |
| $W_p$                           | 2 μm                          |
| $L_n=L_p$                       | 60 nm                         |

1. (10 points) For the circuit shown below, complete the table with the energy delivered by the supply and the energy dissipated in the circuit for each input transition.

| Input Transition | Energy delivered by supply | Energy dissipated in circuit   |
|------------------|----------------------------|--------------------------------|
| Rising           | $V_{dd}^2(C_b + 2C_c)$     | $(C_a + 4C_c + C_b)V_{dd}^2/2$ |
| Falling          | $V_{dd}^2(C_a + 2C_c)$     | $(C_a + 4C_c + C_b)V_{dd}^2/2$ |



$$V_C(0) = V_{dd}$$

$$V_C(\text{end}) = -V_{dd}$$

$$\frac{1}{2} CV_{dd}^2 \cdot 2$$

Check over one rising plus one falling transition: total delivered equals total dissipated.

$$V_{dd}^2 (C_a + C_b + 4C_c) = \frac{V_{dd}^2}{2} (2C_a + 2C_b + 8C_c)$$





Energy delivered by the supply while a capacitor $C$ is charged:

$$E = \int V_{dd} \cdot i \, dt$$

$$i = C \frac{dV_o}{dt}$$

$$E = \int V_{dd} C \frac{dV_o}{dt} \, dt$$

$$= C V_{dd} \int \frac{dV_o}{dt} \, dt$$

$$= C V_{dd} [V_o(\text{end}) - V_o(0)]$$

$$= C V_{dd} \cdot \Delta V$$

$$= C V_{dd}^2 \quad \text{for a full swing, } \Delta V = V_{dd}$$
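The result $E = C V_{dd} \Delta V$ can be checked numerically. The sketch below is a minimal forward-Euler simulation of a capacitor charged from 0 V to $V_{dd}$ through a resistor. The values of `R` and `C` and the helper name `charge_through_resistor` are illustrative assumptions, not taken from the exam circuit. The supply should deliver about $C V_{dd}^2$, and about half of that should be dissipated in the resistor.

```python
def charge_through_resistor(c, vdd, r, steps=100_000, t_end_in_tau=20.0):
    """Forward-Euler simulation of C charging from 0 V to vdd through R.

    Returns (energy drawn from the supply, energy dissipated in R,
    final capacitor voltage).
    """
    tau = r * c
    dt = t_end_in_tau * tau / steps
    v = 0.0
    e_supply = 0.0
    e_resistor = 0.0
    for _ in range(steps):
        i = (vdd - v) / r            # current drawn from the supply
        e_supply += vdd * i * dt     # E = integral of Vdd * i dt
        e_resistor += i * i * r * dt # heat dissipated in the resistor
        v += i * dt / c              # i = C dV/dt
    return e_supply, e_resistor, v


# Illustrative values (assumed): 10 fF load, 1 kOhm switch resistance.
C, VDD, R = 10e-15, 1.2, 1e3
e_supply, e_resistor, v_end = charge_through_resistor(C, VDD, R)

print(f"delivered / (C*Vdd^2)    = {e_supply / (C * VDD**2):.4f}")
print(f"dissipated / (C*Vdd^2/2) = {e_resistor / (0.5 * C * VDD**2):.4f}")
```

The result does not depend on `R`: a larger resistance only stretches the charging time, which is why the exam tables contain capacitances and $V_{dd}$ only.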



## Timing Analysis and Voltage Scaling



Original unpipelined design

2. (20 points) Consider the datapath shown above. A designer decides to redesign it using double-edge-triggered flip-flops (flip-flops that capture data on both the rising and the falling edge of the clock). The pipelined design is shown below. The available clock has a 40% duty cycle. The timing properties of such a flip-flop are:  $T_{\text{setup}} = 50\text{ps}$ ,  $T_{\text{hold}} = 50\text{ps}$ ,  $T_{\text{cq}} = 50\text{ps}$ .



$$2f = \frac{2}{T}$$

- a. (3 points) Determine the cycle time of the original design.
- b. (8 points) Determine the maximum cycle time of such a pipelined design and its peak throughput.

a)  $0.4T \geq t_{\text{cq}} + t_{\text{logic}} + t_{\text{setup}}$   
 $0.4T \geq 50\text{ps} + 300\text{ps} + 50\text{ps} = 400\text{ps}$   
 $T \geq 1000\text{ps}$

b)  $0.4T \geq t_{\text{cq}} + t_{\text{setup}} + t_{\text{logic}}$   
 $0.4T \geq 250\text{ps}$

Cycle-time limit:  $T \geq 250\text{ps}/0.4 = 625\text{ps}$

Peak throughput:  $\frac{2}{T} = 2/625\text{ps} = 3.2\text{GHz}$
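The two timing constraints can be evaluated with a short script. This is a sketch under the assumptions used in the worked answers: the shorter clock phase ($0.4T$) must cover $t_{cq}$, the logic delay and $t_{setup}$. The 300 ps logic delay comes from answer (a). The 150 ps per-stage delay is an assumption inferred from the 250 ps total in answer (b), since the figure is not reproduced here.

```python
T_CQ_PS = 50.0
T_SETUP_PS = 50.0
DUTY_CYCLE = 0.4  # the shorter clock phase is 0.4 * T


def min_cycle_time_ps(logic_delay_ps, duty=DUTY_CYCLE):
    """Smallest T satisfying duty * T >= t_cq + t_logic + t_setup."""
    return (T_CQ_PS + logic_delay_ps + T_SETUP_PS) / duty


t_original_ps = min_cycle_time_ps(300.0)   # unpipelined logic delay from answer (a)
t_pipelined_ps = min_cycle_time_ps(150.0)  # assumed per-stage delay (250 - 50 - 50)

# A double-edge-triggered design produces two results per clock cycle.
peak_throughput_ghz = 2.0 / t_pipelined_ps * 1e3  # results per ps -> GHz

print(t_original_ps, t_pipelined_ps, peak_throughput_ghz)
```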

- c. (9 points) Assuming a square-law current model, determine the energy savings obtained by such a pipelined design if it were voltage scaled to achieve the original cycle time. Ignore the power dissipation of the flip-flops.

$$T_o = 1000 \text{ ps} = 1 \text{ ns} \Rightarrow f_o = 1 \text{ GHz}$$

$$T_p = 625 \text{ ps} = 0.625 \text{ ns} \Rightarrow f_p = 1.6 \text{ GHz}$$

$$P = \sum_{\text{nodes}} \frac{1}{2} \alpha_i C_i V_{dd}^2 f = \frac{1}{2} V_{dd}^2 f \left( \sum_{\text{nodes}} \alpha_i C_i \right)$$

$$\tau = \frac{C V_{dd}}{2 I_S} = \frac{C V_{dd}}{2 \beta W (V_{dd} - V_{th})^2} ; \quad \tau \propto \frac{V_{dd}}{(V_{dd} - V_{th})^2}$$

$$V_{dd} \gg V_{th} \Rightarrow \boxed{\tau \propto \frac{1}{V_{dd}}} \Rightarrow f \propto \frac{1}{\tau} \propto V_{dd} \Rightarrow f = k V_{dd}$$

$$1.6 \text{ GHz} = k \cdot 1.2 \text{ V} \Rightarrow k = \frac{1.6 \text{ GHz}}{1.2 \text{ V}}$$

$$f_{np} = f_o \Rightarrow k V_{dd,np} = 1 \text{ GHz} \Rightarrow V_{dd,np} = \frac{1 \text{ GHz}}{1.6 \text{ GHz} / 1.2 \text{ V}} = 1.2 \text{ V} \cdot \frac{1}{1.6} = 0.75 \text{ V}$$

$$P_o = \frac{1}{2} (1.2 \text{ V})^2 \cdot 1 \text{ GHz} \left( \sum \alpha_i C_i \right)$$

$$P_{np} = \frac{1}{2} (0.75 \text{ V})^2 \cdot 1 \text{ GHz} \left( \sum \alpha_i C_i \right)$$

$$P_{np}/P_o = 0.75^2 / 1.2^2 = 0.391 \Rightarrow \boxed{\text{E.S. } 60.9\%}$$
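The voltage-scaling arithmetic can be reproduced in a few lines. The sketch below assumes the same approximation as the derivation ($V_{dd} \gg V_{th}$, so $f = k V_{dd}$) and the same total switched capacitance before and after scaling. The variable names are illustrative.

```python
VDD_NOMINAL = 1.2   # V
F_ORIGINAL_GHZ = 1.0   # original design, T = 1000 ps
F_PIPELINED_GHZ = 1.6  # pipelined design at nominal Vdd, T = 625 ps

# f = k * Vdd under the Vdd >> Vth approximation.
k = F_PIPELINED_GHZ / VDD_NOMINAL

# Scale the supply until the pipelined design runs at the original frequency.
vdd_scaled = F_ORIGINAL_GHZ / k

# At equal frequency and equal switched capacitance, P is proportional to Vdd^2.
power_ratio = (vdd_scaled / VDD_NOMINAL) ** 2
energy_savings = 1.0 - power_ratio

print(f"Vdd_scaled = {vdd_scaled:.3f} V")
print(f"P_np / P_o = {power_ratio:.3f}")
print(f"savings    = {energy_savings:.1%}")
```

With $V_{th} = 0.25$ V the approximation $V_{dd} \gg V_{th}$ is fairly rough at 0.75 V, so this should be read as a first-order estimate.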

## Memory

3. (5 points)

a. Why does a typical 6T SRAM cell pre-charge its bit-line to VDD instead of ground?

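*The access devices of a typical 6T SRAM bit cell are NMOS pass transistors, which pull a node low strongly but pass a degraded high. The bit-lines are therefore precharged to VDD (through PMOS devices) and a read lets the cell discharge one of them. With the bit-lines precharged to ground, a read might overwrite the cell.*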

b. Consider an SRAM array with a 10-bit row decoder and a 5-bit column decoder. Each row of memory stores four 64-bit numbers. What is the capacity of the memory in bytes?

Word: 4 · 64 bits = 256b (each address)

Addresses: 15b  $\Rightarrow$  2<sup>15</sup> addresses

$$\Rightarrow 2^{15} \cdot 256b = 32 \cdot 1024 \cdot 256b \approx 8.4 \text{ Mb} = 1{,}048{,}576 \text{ bytes (1 MiB)}$$
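The capacity arithmetic can be double-checked with integer math. The sketch follows the interpretation used in the answer above (15 address bits in total, 256 bits per address). The constant names are illustrative.

```python
ROW_ADDRESS_BITS = 10
COLUMN_ADDRESS_BITS = 5
NUMBERS_PER_ADDRESS = 4
BITS_PER_NUMBER = 64

addresses = 2 ** (ROW_ADDRESS_BITS + COLUMN_ADDRESS_BITS)
bits_per_address = NUMBERS_PER_ADDRESS * BITS_PER_NUMBER
capacity_bits = addresses * bits_per_address
capacity_bytes = capacity_bits // 8

print(addresses, bits_per_address, capacity_bits, capacity_bytes)
```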

c. State the disadvantages of having many memory cells on a single bit-line.



- Bit-line capacitance is high
- Reads and writes are slower
- More power is spent charging and discharging the bit-line
- NMOS transistors need to be stronger

## Adders

4a. (5 points) What is the **worst-possible delay** of a 16-bit carry skip adder that is made up of 4 equal skip stages (4 bits each)? Which gates would the path travel through? (Sketch the path using a block diagram.)



b. (5 points) Explain why it is not possible for the critical path to be equal to the carry-in rippling through every single 4-bit adder stage.

- After the first adder (3:0) computes its carry out, we already know whether each of the other blocks propagates
- If a block propagates, we get its carry out in one "mux delay"
- If it doesn't propagate, the carry out of the block is independent of  $C_{in} \Rightarrow C_{out}$  is already at the right value, so we still only need one "mux delay"

c. (5 points) Consider a carry skip adder built using 6 unequal sections of sizes 1, 5, 4, 3, 2, 1. Write down the critical path of such an adder in terms of  $T_{AOI}$ ,  $T_{MUX}$ ,  $T_{setup}$ . How does this delay compare to the traditional implementation if  $T_{AOI}$ ,  $T_{MUX}$ ,  $T_{setup}$  each take 1 unit delay?





$$\tau_c = 3 \tau_{\text{mux}} + 8 \tau_{\text{PA}} \leftarrow \text{Type 1}$$

$$\tau_c = 2 \tau_{\text{mux}} + 8 \tau_{\text{PA}} \leftarrow \text{Non uniform}$$

Note:  $f$  stands for femto, so  $1f$  denotes  $1\,\text{fF}$ .

5. (12 points) For the circuit shown below, complete the table with the energy delivered by the supply and the energy dissipated in the circuit for each input transition.

| Input Transition | Energy delivered by supply | Energy dissipated in circuit                                      |
|------------------|----------------------------|-------------------------------------------------------------------|
| Rising           | $1fV_{dd}^2$               | $\frac{1}{2}2fV_{dd}^2 + \frac{1}{2}(fV_{dd})^2 + \sum fV_{dd}^2$ |
| Falling          | $2fV_{dd}^2 + 1fV_{dd}^2$  | $\frac{1}{2}2fV_{dd}^2 + \frac{1}{2}(fV_{dd})^2 + \sum fV_{dd}^2$ |



Remember to add these terms

6. [2+2+3+4 = 11 points] Consider the circuit below with the provided switching activities and probabilities. Calculate the power dissipation expected in this netlist based on these parameters and the load capacitance at nodes A, C, D and F. Report the energy for A and C in terms of  $C_A$  and  $C_C$ .

$$\begin{aligned}
 & P = \frac{1}{2} C V_{dd}^2 f_S \quad \text{Add } \frac{1}{2} L^2 f \cdot \left[ C_A \cdot 0.6 + C_B \cdot 0.6 + 10f \cdot 1.2 \right] \\
 & P_1 = 0.5, \quad S_1 = 0.5 \\
 & P_2 = 0.4, \quad S_2 = 0.8 \\
 & P_3 = 0.8, \quad S_3 = 0.4 \\
 & P_4 = 0.5, \quad S_4 = 0.5 \\
 & P = 0.8 \cdot 0.4 + 0.2 \cdot 0.6 = 0.44 \\
 & S = 0.5 \cdot 0.4 + 0.8 \cdot 0.5 = 0.6 \\
 & S_{A,B} = 1 \cdot 0.6 \\
 & S_{C,D} = 1 \cdot 0.6 \\
 & S = 1.2 \\
 & C_A = 10f \\
 & C_C = 10f
 \end{aligned}$$

$$\begin{aligned}
 & S_E = 0.4 \cdot 0.5 + 0.5 \cdot 0.2 \\
 & = 0.2 + 0.1 \\
 & = 0.3 \\
 & P_E = 1 - 0.2 \cdot 0.5 \\
 & = 1 - 0.1 \\
 & = 0.9 \\
 & S_D = 0.5 \cdot 0.6 + 0.8 \cdot 0.5 \\
 & = 0.3 + 0.4 \\
 & = 0.7 \\
 & P_D = 1 - 0.5 \cdot 0.6 \\
 & = 1 - 0.3 \\
 & = 0.7 \\
 & S_F = 0.7 \cdot 0.8 \\
 & + 0.3 \cdot 0.2 \\
 & + 0.2(P_{D=1, E=0} + P_{D=0, E=1}) \\
 & = 0.56 + 0.06 + 0.2(0.7 \cdot 0.1 + 0.3 \cdot 0.9) \\
 & = 0.62 + 0.2(0.07 + 0.27) \\
 & = 0.62 + 0.2 \cdot 0.34 \\
 & = 0.62 + 0.068 = 0.688
 \end{aligned}$$



| a | b | s |
|---|---|---|
| 0 | 0 | 0 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 1 |

When does a transition in  $a$  matter? Only when  $b = 1$ , which has probability  $P = 0.4$ :

$$S_{z,a} = 0.5 \cdot 0.4 = 0.2$$

When does a transition in  $b$  cause a transition in  $z$ ? Only when  $a = 1$ , which has probability  $P = 0.5$ :

$$S_{z,b} = 0.8 \cdot 0.5 = 0.4$$

$$S_z = S_{z,a} + S_{z,b} = 0.2 + 0.4 = 0.6$$

$$P(z=1) = 1 - P(a=1, b=1) = 1 - 0.5 \cdot 0.4 = 1 - 0.2 = 0.8$$
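The propagation rule used in these notes can be written as a small helper. This is a sketch that assumes statistically independent inputs and ignores simultaneous input transitions, as the hand calculation does. The function name `nand_output_stats` is illustrative, and the input values are $P_1, S_1$ and $P_2, S_2$ from the list above.

```python
def nand_output_stats(p_a, s_a, p_b, s_b):
    """Return (P(z=1), S_z) for z = NAND(a, b).

    p_x is the probability that input x is 1, s_x its switching activity.
    A transition on a reaches z only while b = 1, and vice versa.
    """
    p_z = 1.0 - p_a * p_b
    s_z = s_a * p_b + s_b * p_a
    return p_z, s_z


p_z, s_z = nand_output_stats(p_a=0.5, s_a=0.5, p_b=0.4, s_b=0.8)
print(f"P(z=1) = {p_z:.2f}, S_z = {s_z:.2f}")
```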

Assume  $P_d = 0$

7. [10 points] Consider the following inverter cascade. Solve for "x", the input capacitance of the second inverter, to achieve the lowest energy-delay product. The input capacitance of the first stage is 1fF. Assume the only contributor to energy is switching capacitance (ignore leakage and crowbar current).



$R_{eff} \cdot C_{in} = \tau_{INV}$  is constant for all inverters (independent of sizing).

$$T_1 = P_d + R_{eff1} \cdot C_{in2}$$

$$T_2 = P_d + R_{eff2} \cdot 16\,\text{fF}$$

$$T_1 = P_d + R_{eff1} \cdot C_{in1} \cdot \frac{C_{in2}}{C_{in1}} = P_d + \tau_{INV} \frac{C_{in2}}{C_{in1}}$$

$$T_2 = P_d + R_{eff2} \cdot C_{in2} \cdot \frac{16\,\text{fF}}{C_{in2}} = P_d + \tau_{INV} \frac{16\,\text{fF}}{C_{in2}}$$

$$T = \text{total delay} = 2P_d + \tau_{INV} \left( \frac{C_{in2}}{C_{in1}} + \frac{16\,\text{fF}}{C_{in2}} \right)$$

$$E = \frac{P}{f} = \frac{1}{f} \left( \frac{1}{2} f C_{in2} V_{dd}^2 + \frac{1}{2} f \cdot 16\,\text{fF} \cdot V_{dd}^2 \right) = \frac{1}{2} V_{dd}^2 (C_{in2} + 16\,\text{fF}) \quad (\text{for one clock cycle})$$

$$E \cdot T = \frac{1}{2} V_{dd}^2 (C_{in2} + 16\,\text{fF}) \left[ 2P_d + \tau_{INV} \left( \frac{C_{in2}}{C_{in1}} + \frac{16\,\text{fF}}{C_{in2}} \right) \right]$$

With  $P_d = 0$  and  $C_{in1} = 1\,\text{fF}$ :

$$E \cdot T = \frac{1}{2} V_{dd}^2 \tau_{INV} (C_{in2} + 16\,\text{fF}) \left( \frac{C_{in2}}{1\,\text{fF}} + \frac{16\,\text{fF}}{C_{in2}} \right)$$

$$= \frac{1}{2} V_{dd}^2 \tau_{INV} \left[ \frac{C_{in2}^2}{1\,\text{fF}} + 16\,\text{fF} + 16\, C_{in2} + \frac{256\,\text{fF}^2}{C_{in2}} \right]$$

$$\frac{d(ET)}{dC_{in2}} = 0 \Rightarrow \frac{d}{dC_{in2}} \left[ \frac{C_{in2}^2}{1\,\text{fF}} + 16\,\text{fF} + 16\, C_{in2} + \frac{256\,\text{fF}^2}{C_{in2}} \right] = 0$$

$$\Rightarrow 2 \frac{C_{in2}}{1\,\text{fF}} + 16 - \frac{256\,\text{fF}^2}{C_{in2}^2} = 0$$

$$2 \frac{C_{in2}^3}{1\,\text{fF}} + 16\, C_{in2}^2 - 256\,\text{fF}^2 = 0$$

$$C_{in2}^3 + 8\,\text{fF} \cdot C_{in2}^2 - 128\,\text{fF}^3 = 0$$

$$\Rightarrow C_{in2} \approx 3.36\,\text{fF}$$
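The cubic $C_{in2}^3 + 8\,\text{fF} \cdot C_{in2}^2 - 128\,\text{fF}^3 = 0$ has no convenient closed form, so a numerical root is a useful check. The sketch below works in units of fF, assumes $P_d = 0$ as in the derivation, and drops the constant factor $\frac{1}{2} V_{dd}^2 \tau_{INV}$, which does not move the minimum. The helper names `edp`, `cubic` and `bisect` are illustrative.

```python
def edp(c_in2, c_in1=1.0, c_load=16.0):
    """Energy-delay product in fF units, up to the factor Vdd^2 * tau_INV / 2."""
    energy = c_in2 + c_load                  # switched capacitance
    delay = c_in2 / c_in1 + c_load / c_in2   # sum of stage fanouts, P_d = 0
    return energy * delay


def cubic(x):
    """Stationarity condition d(E*T)/dx = 0, cleared of denominators."""
    return x**3 + 8.0 * x**2 - 128.0


def bisect(f, lo, hi, tol=1e-9):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


c_opt = bisect(cubic, 1.0, 16.0)
print(f"C_in2 for minimum EDP ~ {c_opt:.3f} fF")
print(f"EDP at optimum = {edp(c_opt):.2f}, EDP at 4 fF = {edp(4.0):.2f}")
```

The optimum sits below the 4 fF that would minimize delay alone (the geometric mean of 1 fF and 16 fF), because a smaller second stage also reduces the switched capacitance.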

Energy-delay product?



$$\begin{aligned} & \min \tau \cdot E \\ & \equiv \min \log(\tau \cdot E) \\ & \equiv \min \log \tau + \log E \end{aligned}$$