

## Chapter 3 SOLUTIONS

1) a)  $I_D = \frac{V_{in} - 2V_{D(on)}}{R_1 + R_2}$

$$= \frac{2.5 - 1.4}{4\,k\Omega} = \frac{1.1\,V}{4\,k\Omega}$$

$I_D = 275 \mu A$

b)  $I_S = 10^{-14} A$ ,  $T = 300 K$ ,  $V_{D(on)} = 0.7 V$

$$I_D = I_S (e^{\frac{V_D}{\phi_T}} - 1)$$

where  $\phi_T = \frac{kT}{q} = 26 mV @ 300 K$

(\*)  $I_D = \frac{V_{in} - 2V_D}{R_1 + R_2} = I_S (e^{\frac{V_D}{\phi_T}} - 1)$

$$\frac{2.5 - 2V_D}{4k\Omega} = 10^{-14} (e^{\frac{V_D}{0.026}} - 1)$$

Iterating on this expression, starting from $V_D = 0.7 V$, we obtain:

$V_D = 0.628 V$   
 $I_D = 311 \mu A$
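The iteration on (\*) can be reproduced numerically. The following is a minimal sketch using bisection rather than whatever hand iteration the original solution used; the function name `solve_diode_pair` and its parameter names are illustrative only. It takes the values quoted above ($V_{in} = 2.5$ V, $R_1 + R_2 = 4\,k\Omega$, $I_S = 10^{-14}$ A, $\phi_T = 26$ mV).

```python
import math

def solve_diode_pair(v_in=2.5, r_total=4e3, i_s=1e-14, phi_t=0.026):
    """Solve (v_in - 2*V_D)/r_total = i_s*(exp(V_D/phi_t) - 1) for V_D by bisection."""
    def mismatch(v_d):
        # Resistor current minus diode current; falls monotonically as v_d rises.
        return (v_in - 2.0 * v_d) / r_total - i_s * (math.exp(v_d / phi_t) - 1.0)

    lo, hi = 0.0, v_in / 2.0          # mismatch(lo) > 0 and mismatch(hi) < 0
    for _ in range(60):               # 60 halvings is far below float precision
        mid = 0.5 * (lo + hi)
        if mismatch(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    v_d = 0.5 * (lo + hi)
    return v_d, (v_in - 2.0 * v_d) / r_total

v_d, i_d = solve_diode_pair()
print(f"V_D = {v_d:.3f} V, I_D = {i_d * 1e6:.0f} uA")
```

Passing a different `i_s`, or a `phi_t` recomputed as $kT/q$ for another temperature, handles the other operating points of this exercise in the same way.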

c) SPICE

### exercise 3.1

vin in 0 dc 2.5

r1 in 2 2k

d1 2 3 D

r2 3 out 2k

d2 out 0 D

.options post=2

.op

.model D D level=1 is=1e-14 m=0.5

.end

\*\*\*\*\*

| NODE | VOLTAGE    |
|------|------------|
| 2    | 1.8750E+00 |
| 3    | 1.2500E+00 |
| OUT  | 6.2501E-01 |

CURRENT 3.1249E-04

d)  $I_S = 10^{-16} A$ ,  $T = 300 K$ : again use (\*) from b) and iterate to find:

$V_D = 0.743 V$   
 $I_D = 253 \mu A$

$I_S = 10^{-14} A$ ,  $T = 350 K$

$V_D = 0.728 V$   
 $I_D = 261 \mu A$

exercise 3.1d  
 vin in 0 dc 2.5  
 r1 in 2 2k  
 d1 2 3 D  
 r2 3 out 2k  
 d2 out 0 D

.options post=2  
 .op

\*.temp27  
 .temp 77  
 \*.model D D level=1 is=1e-16 m=0.5  
 \*.model D D level=1 is=1e-14 m=0.5  
 .end

\*\*\*\*\*

T = 27 °C  
 NODE VOLTAGE  
 2 1.9889E+00  
 3 1.2500E+00  
 IN 2.5000E+00  
 OUT 7.3892E-01

ID 2.5554E-04

\*\*\*\*\*

T = 77 °C  
 NODE VOLTAGE  
 2 1.7844E+00  
 3 1.2500E+00  
 IN 2.5000E+00  
 OUT 5.3436E-01

ID 3.5782E-04

2

a) $I_D = 0$, $V_D = -V_S = -3.3V$

More exact: $I_D = I_{Rev} \approx -I_S$, so

$$V_D = -(V_S - I_S R_S)$$

b) Reverse biased

c)

$$W_J = \sqrt{\left(\frac{2\varepsilon_{si}}{q}\right) \frac{N_A + N_D}{N_A N_D} (\phi_0 - V_D)}$$

$$q = 1.6 \times 10^{-19} C, \quad V_D = -V_S = -3.3V$$

$$\varepsilon_{si} = 11.7 \varepsilon_0 = 1.035 \times 10^{-12} \text{ F/cm}$$

$$N_A = 2.5 \times 10^{16} \text{ cm}^{-3}, \quad N_D = 5 \times 10^{15} \text{ cm}^{-3}$$

$$W_J = 110.75 \times 10^{-6} \text{ cm}$$

UNIT CHECK:

$$\sqrt{\frac{F/cm}{C} \cdot \frac{cm^{-3}}{cm^{-6}} \cdot V} = \sqrt{\frac{F \cdot V}{C} \cdot cm^2} = cm$$

since 1 coulomb = 1 farad · 1 volt.

d)

$$C_J = \frac{\varepsilon_{si} \cdot A_D}{W_J}$$

with $A_D = 120 \times 10^{-9} \text{ cm}^2$:

$$C_J = 1.12 \text{ fF}$$
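As a numerical cross-check of parts c) and d), here is a short sketch of the depletion-width and parallel-plate formulas. Two inputs are assumptions, not values printed in this solution: the built-in potential $\phi_0 = 0.65$ V and the donor concentration $N_D = 5 \times 10^{15}\ \text{cm}^{-3}$. They were chosen because together they reproduce the quoted $W_J$. The function names are illustrative.

```python
import math

Q = 1.6e-19            # electron charge, C
EPS_SI = 1.035e-12     # permittivity of silicon, F/cm

def depletion_width(n_a, n_d, phi_0, v_d):
    """Abrupt-junction depletion width in cm; v_d is negative under reverse bias."""
    return math.sqrt((2.0 * EPS_SI / Q) * (n_a + n_d) / (n_a * n_d) * (phi_0 - v_d))

def parallel_plate_cap(area_cm2, width_cm):
    """Junction capacitance in F from the parallel-plate model."""
    return EPS_SI * area_cm2 / width_cm

# ASSUMED inputs: phi_0 = 0.65 V and N_D = 5e15 cm^-3 (see the lead-in).
w_j = depletion_width(n_a=2.5e16, n_d=5e15, phi_0=0.65, v_d=-3.3)
c_j = parallel_plate_cap(120e-9, w_j)
print(f"W_J = {w_j * 1e6:.2f}e-6 cm, C_J = {c_j * 1e15:.2f} fF")

# Part e): a smaller reverse bias narrows the depletion region and raises C_J.
w_j_low = depletion_width(n_a=2.5e16, n_d=5e15, phi_0=0.65, v_d=-1.5)
c_j_low = parallel_plate_cap(120e-9, w_j_low)
```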

e) $V_{S,new} = 1.5V < V_{S,old} = 3.3V$

The new voltage reduces the reverse bias of the P-N junction, so the width of the depletion region, $W_J$, decreases. As with bringing the plates of a capacitor closer together, the capacitance increases.

[3] a) NMOS: $V_{GS} = 2.5V$, $V_{DS} = 2.5V$ (saturation)

$$I_D = \frac{k'}{2} \frac{W}{L} (V_{GS} - V_T)^2 (1 + \lambda V_{DS})$$

$$= \frac{115 \times 10^{-6}}{2} (2.5 - 0.43)^2 (1 + 0.06 \times 2.5)$$

$$= 283.3 \mu A$$

PMOS: $V_{GS} = -0.5V$, $V_{DS} = -1.25V$ (saturation); magnitudes are used in the expression.

$$I_D = \frac{k'}{2} \frac{W}{L} (V_{GS} - V_T)^2 (1 + \lambda V_{DS})$$

$$= \frac{30 \times 10^{-6}}{2} (0.5 - 0.4)^2 (1 + 0.1 \times 1.25)$$

$$= 0.17 \mu A$$

b) NMOS: $V_{GS} = 3.3V$, $V_{DS} = 2.2V$ (linear)

$$I_D = k' \frac{W}{L} \left( (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right)$$

$$= 115 \times 10^{-6} \left( (3.3 - 0.43) 2.2 - \frac{2.2^2}{2} \right)$$

$$= 447.8 \mu A$$

PMOS: $V_{GS} = -2.5V$, $V_{DS} = -1.8V$ (linear)

$$I_D = k' \frac{W}{L} \left( (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right)$$

$$= 30 \times 10^{-6} \left( (2.5 - 0.4) 1.8 - \frac{1.8^2}{2} \right)$$

$$= 64.8 \mu A$$

c) NMOS: $V_{GS} = 0.6V$, $V_{DS} = 0.1V$ (linear)

$$I_D = 115 \times 10^{-6} \left( (0.6 - 0.43) \times 0.1 - \frac{0.1^2}{2} \right)$$

$$= 1.38 \mu A$$

PMOS: $V_{GS} = -2.5V$, $V_{DS} = -0.7V$ (linear)

$$I_D = 30 \times 10^{-6} \left( (2.5 - 0.4) \times 0.7 - \frac{0.7^2}{2} \right)$$

$$= 36.75 \mu A$$
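The six hand calculations above follow two formulas, so they are easy to re-evaluate in a few lines. This sketch assumes $W/L = 1$, which is what the numeric substitutions imply, and works with magnitudes for the PMOS device; the helper names are illustrative.

```python
def i_saturation(k_prime, w_over_l, v_gt, v_ds, lam):
    """Long-channel saturation current with channel-length modulation."""
    return 0.5 * k_prime * w_over_l * v_gt ** 2 * (1.0 + lam * v_ds)

def i_linear(k_prime, w_over_l, v_gt, v_ds):
    """Long-channel linear (triode) current."""
    return k_prime * w_over_l * (v_gt * v_ds - 0.5 * v_ds ** 2)

WL = 1.0                                   # ASSUMPTION: W/L = 1
KN, VTN, LAMN = 115e-6, 0.43, 0.06         # NMOS parameters used above
KP, VTP, LAMP = 30e-6, 0.40, 0.10          # PMOS parameters (magnitudes)

cases = {
    "a) NMOS sat": i_saturation(KN, WL, 2.5 - VTN, 2.5, LAMN),
    "a) PMOS sat": i_saturation(KP, WL, 0.5 - VTP, 1.25, LAMP),
    "b) NMOS lin": i_linear(KN, WL, 3.3 - VTN, 2.2),
    "b) PMOS lin": i_linear(KP, WL, 2.5 - VTP, 1.8),
    "c) NMOS lin": i_linear(KN, WL, 0.6 - VTN, 0.1),
    "c) PMOS lin": i_linear(KP, WL, 2.5 - VTP, 0.7),
}
for label, current in cases.items():
    print(f"{label}: {current * 1e6:.2f} uA")
```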

4-5

[Plots of the simulated drain currents I(M1.D), I(M2.D), I(M3.D), and I(M4.D)]

c) 1 & 3 are in velocity saturation; the effect can be seen from the linearly increasing  $I_D$ .

For a short-channel device:

$$I_D = k' \frac{W}{L} \left[ (V_{GS} - V_T) V_{min} - \frac{V_{min}^2}{2} \right] (1 + \lambda V_{DS})$$

$$V_{min} = \min[(V_{GS} - V_T), V_{DS}, V_{DSAT}]$$

To begin with, the operating regions need to be determined. For any of these data points to be in saturation rather than velocity saturation,  $V_T$  must satisfy  $V_{GS} - V_T < V_{DSAT}$:

$$2 - V_T < 0.6 \Rightarrow V_T > 1.4 \text{ V}$$

This is quite a high value for our process.  
Thus, we can assume that all data are taken in velocity saturation. We will check this assumption later.

In velocity sat:

$$I_D = k' \frac{W}{L} \left[ (V_{GS} - V_T) V_{DSAT} - \frac{V_{DSAT}^2}{2} \right] (1 + \lambda V_{DS})$$

using 1&2.

$$I_D = k' \frac{W}{L} \left[ (2.5 - V_T) 0.6 - \frac{0.6^2}{2} \right] (1 + \lambda 1.8) = 1812$$

$$I_D = k' \frac{W}{L} \left[ (2 - V_T) 0.6 - \frac{0.6^2}{2} \right] (1 + \lambda 1.8) = 1297$$

$$\frac{1812}{1297} = \frac{(2.5 - V_T) 0.6 - \frac{0.6^2}{2}}{(2 - V_T) 0.6 - \frac{0.6^2}{2}} \Rightarrow V_T = 0.44 \text{ V}$$

Since  $V_T = 0.44 \text{ V} < 1.4 \text{ V}$, measurements 1, 2, and 3 are in velocity saturation. Using 2 & 3:

$$\frac{1297}{1361} = \frac{1 + \lambda 1.8}{1 + \lambda 2.5} \Rightarrow \lambda = 0.08 \text{ V}^{-1}$$
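The two ratio equations above are linear in the unknown once the common factors cancel, so they can be solved in closed form. This is a small sketch under the same velocity-saturation assumption, using the three measurements as they appear in the equations (currents in the table's units, $V_{DSAT} = 0.6$ V); the function names are illustrative.

```python
def vt_from_velocity_sat_pair(i_a, vgs_a, i_b, vgs_b, v_dsat):
    """V_T from two velocity-saturated points sharing the same V_DS.

    In velocity saturation I_D is proportional to (V_GS - V_DSAT/2 - V_T),
    so the current ratio is linear in V_T.
    """
    r = i_a / i_b
    a = vgs_a - v_dsat / 2.0
    b = vgs_b - v_dsat / 2.0
    return (r * b - a) / (r - 1.0)

def lambda_from_pair(i_a, vds_a, i_b, vds_b):
    """lambda from two points sharing the same V_GS: r = (1+l*vds_a)/(1+l*vds_b)."""
    r = i_a / i_b
    return (r - 1.0) / (vds_a - r * vds_b)

v_t = vt_from_velocity_sat_pair(1812.0, 2.5, 1297.0, 2.0, v_dsat=0.6)   # points 1 & 2
lam = lambda_from_pair(1297.0, 1.8, 1361.0, 2.5)                         # points 2 & 3
print(f"V_T = {v_t:.2f} V, lambda = {lam:.2f} 1/V")
```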

Using 2 & 4:

$$V_T = 0.587 \text{ V} \quad ①$$

Using 2 & 5:

$$V_T = 0.691 \text{ V} \quad ②$$

Both of these values satisfy  $V_T < 1.4 \text{ V}$, so all the data in our table were taken in velocity saturation.

$$V_T = V_{T0} + \gamma \left( \sqrt{V_{SB} + |2\phi_F|} - \sqrt{|2\phi_F|} \right)$$

① and ② can be used along with  $V_{T0} = 0.44 \text{ V}$  to conclude

$$|2\phi_F| = 0.6 \text{ V}$$

$$\gamma = 0.3 \text{ V}^{1/2}$$

Also, using the 2<sup>nd</sup> set of data:

$$I_D = 1297 = k' \frac{W}{L} \left[ (V_{GS} - V_T) V_{DSAT} - \frac{V_{DSAT}^2}{2} \right]$$

$$\frac{W}{L} = 15$$

a) This is a PMOS device

b) using measurements 1&4

$$V_{TO} = 0.5 \text{ V}$$

$$c) \text{ Using 1&5: } \gamma = 0.538 \text{ V}^{1/2}$$

$$d) \text{ Using 1&6: } \lambda = 0.05 \text{ V}^{-1}$$

e) 1-Vel. Sat, 2-cut off, 3-saturation, 4-5-6-Vel. Sat., 7-linear

f) When  $R=10k$ ,  $V_D = V_{DD} - IR$

$$\Rightarrow V_D = 2.5 - 50 \times 10^{-6} \times 10^4 = 2.5 - 0.5 = 2 \text{ V}$$

Assume the device is in saturation (to be verified below):

$$I_D = \frac{k'}{2} \frac{W}{L} (V_{GS} - V_T)^2 = 50\ \mu\text{A}$$

$$\Rightarrow V_{GS} - V_T = 0.3 \text{ V} \Rightarrow V_{GS} = 0.3 + 0.4 = 0.7 \text{ V}$$

$$V_S = V_G - V_{GS} = 2 - 0.7 = 1.3 \text{ V}, \quad V_{DS} = V_D - V_S = 0.7 \text{ V}$$

$$V_{min} = \min(V_{GS} - V_T, V_{DSAT}, V_{DS}) = \min(0.3, 0.6, 0.7) = V_{GS} - V_T \Rightarrow \text{saturation verified.}$$

Result:  $V_D = 2 \text{ V}$ ,  $V_S = 1.3 \text{ V}$ , saturation operation.

b)

$$V_D = 2.5 - 30 \times 10^{3} \times 50 \times 10^{-6} = 2.5 - 1.5$$

$$V_D = 1 \text{ V}$$

Assume linear operation:

$$I_D = k' \frac{W}{L} \left[ (V_{GS} - V_T) V_{DS} - \frac{V_{DS}^2}{2} \right] = 50\ \mu\text{A}$$

$$110 \times 10^{-6} \times 10 \left[ (2 - V_S - 0.4)(1 - V_S) - \frac{(1 - V_S)^2}{2} \right] = 50\ \mu\text{A}$$

$$\Rightarrow V_S = 0.93 \text{ V}$$

$$\min(V_{GS} - V_T, V_{DS}, V_{DSAT}) = \min(2 - 0.93 - 0.4, 0.07, 0.6) = V_{DS} \Rightarrow \text{linear verified}$$
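The quadratic in $V_S$ for part b) can be solved directly. The sketch below rewrites it in terms of $V_{DS}$, using the fact that $(V_{GS} - V_T) - V_{DS} = V_G - V_T - V_D$ does not depend on $V_S$. It takes $V_G = 2$ V, $V_T = 0.4$ V, $k' = 110\ \mu A/V^2$ and $W/L = 10$ from the substituted equation above, and $V_{DSAT} = 0.6$ V from part a); the function name is illustrative.

```python
import math

def source_voltage_linear(i_d, k_prime, w_over_l, v_g, v_d, v_t):
    """Solve i_d = k'(W/L)[(V_GS - V_T)V_DS - V_DS^2/2] for V_S with V_G and V_D fixed."""
    c = v_g - v_t - v_d                 # (V_GS - V_T) - V_DS, independent of V_S
    k = k_prime * w_over_l
    # With x = V_DS the equation becomes x^2/2 + c*x = i_d/k; take the positive root.
    v_ds = -c + math.sqrt(c * c + 2.0 * i_d / k)
    return v_d - v_ds, v_ds

v_s, v_ds = source_voltage_linear(50e-6, 110e-6, 10.0, v_g=2.0, v_d=1.0, v_t=0.4)
v_gt = 2.0 - v_s - 0.4
region_is_linear = min(v_gt, v_ds, 0.6) == v_ds    # V_DSAT = 0.6 V as in part a)
print(f"V_S = {v_s:.2f} V, V_DS = {v_ds:.3f} V, linear: {region_is_linear}")
```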

c) $V_S$ increases.  $V_D$  is fixed by the constant current. The  $(1+\lambda V_{DS})$  term would try to raise the current above the available  $50\mu A$, so  $V_{DS}$  must shrink, which happens through an increase in  $V_S$ .

g) Device is always in saturation.

$$a) \frac{-V_X}{R} = \frac{k_p'}{2} \frac{W}{L} (V_X - V_{tP})^2$$



$$c) \frac{1V}{20k\Omega} = \frac{30 \times 10^{-6}}{2} \left( \frac{W}{L} \right) \times (1.5 - 0.4)^2$$

$$50\mu A = 15 \times 10^{-6} \left( \frac{W}{L} \right) 1.21$$

$$2.755 = \left( \frac{W}{L} \right) \Rightarrow W \approx 0.69 \mu m$$



10) a)  $I_D = \frac{k_n'}{2} \frac{W}{L} (V_i - V_o - V_t)^2$

$$\sqrt{\frac{2I_D}{k_n' \left( \frac{W}{L} \right)}} = V_i - V_o - V_t$$

neglecting body effect  $V_t = V_{to}$

$$V_i = \sqrt{\frac{2I_D}{k_n' \left( \frac{W}{L} \right)}} + V_{to} + V_o$$

Level Shift (LS)

$$LS = V_{to} + \sqrt{\frac{2I_D}{k_n' \left( \frac{W}{L} \right)}} = 0.43 + \sqrt{\frac{2 \times 35 \mu A}{115 \mu A/V^2 \cdot 3}} = 0.88 V$$

b)  $V_t = V_{to} + \gamma \left( \sqrt{|V_o| + 2\phi_F} - \sqrt{2\phi_F} \right)$

$$= 0.43 + 0.4 \left( \sqrt{V_o + 0.6} - \sqrt{0.6} \right)$$



$$c) V_o = V_i - \sqrt{\frac{2I_D}{k_n' \left( \frac{W}{L} \right)}} - V_t$$

Plot with 1)  $V_t = V_{to}$ , 2)  $V_t = V_t(V_o)$



11 a) In saturation

$$\frac{V_{DD} - V_{out}}{R} = \frac{k'}{2} \frac{W}{L} (V_{in} - V_t)^2$$

In triode:

$$\frac{V_{DD} - V_{out}}{R} = k' \frac{W}{L} \left[ (V_{in} - V_t) V_{out} - \frac{V_{out}^2}{2} \right]$$

$$V_{DD} = 2.5, R = 8k\Omega, k' = 115 \mu A/V^2$$

$$V_t = V_{to} = 0.43V$$

b) Results and spice follow



exercise 3.12  
.lib g25.lib TT

vdd vdd 0 dc 2.5

vin in 0 dc 2.5

r1 vdd out\_long 8k

m1 out\_long in s1 0 nmos\_t l=0.5u w=4u  
m2 s1 in 0 0 nmos\_t l=0.5u w=4u

r2 vdd out\_short 8k

m3 out\_short in 0 0 nmos\_t l=0.25u w=1u

.probe .dc v(in) v(out) v(out1)

.dc vin 0 2.5 0.1

.options post=2 csdf

.op

.end

\*\*\*\*\*

| Vin | Vout_long | Vout_short | Vout_hand |
|-----|-----------|------------|-----------|
| 0.0 | 2.500     | 2.50       | 2.50      |
| 0.5 | 2.470     | 2.48       | 2.49      |
| 1.0 | 1.537     | 1.86       | 1.90      |
| 1.5 | 0.442     | 0.92       | 0.85      |
| 2.0 | 0.301     | 0.46       | 0.41      |
| 2.5 | 0.244     | 0.35       | 0.30      |

c) The long device was modeled as two transistors in series. The equivalent transistor has a steeper transition.

12

V<sub>TO</sub>

This one should immediately signal you to look at curves that don't have body effect, which means V<sub>BS</sub> = 0V. Pick two points, each from a different curve that satisfies the no-body-effect condition. Make sure they're in the same operating region too!

| Point | V <sub>GS</sub> | V <sub>DS</sub> | I <sub>D</sub> | Operating Region |
|-------|-----------------|-----------------|----------------|------------------|
| A     | 2.5V            | 1.8V            | 300uA          | saturation       |
| B     | 2.0V            | 1.8V            | 160uA          | saturation       |

The reason why I chose points with the same V<sub>DS</sub> will be evident once I work through the math.

$$\frac{I_{D,A}}{I_{D,B}} = \frac{\frac{1}{2} k_p \left( \frac{W}{L} \right) (V_{GS,A} - V_{T0})^2 (1 + \lambda \cdot V_{DS,A})}{\frac{1}{2} k_p \left( \frac{W}{L} \right) (V_{GS,B} - V_{T0})^2 (1 + \lambda \cdot V_{DS,B})}$$

$$\frac{300}{160} = \frac{(2.5 - V_{T0})^2}{(2.0 - V_{T0})^2}$$

$$V_{T0} = 0.64V$$

As you can see, to isolate V<sub>TO</sub> I needed to cancel as many variables as possible so that the equation could be solved.

λ We can use the same methodology as above. This time, we want to keep V<sub>GS</sub> constant.

| Point | V <sub>GS</sub> | V <sub>DS</sub> | I <sub>D</sub> | Operating Region |
|-------|-----------------|-----------------|----------------|------------------|
| A     | 2.5V            | 2.4V            | 310uA          | saturation       |
| B     | 2.5V            | 1.8V            | 300uA          | saturation       |

$$\frac{I_{D,A}}{I_{D,B}} = \frac{\frac{1}{2} k_p \left( \frac{W}{L} \right) (V_{GS,A} - V_T)^2 (1 + \lambda \cdot V_{DS,A})}{\frac{1}{2} k_p \left( \frac{W}{L} \right) (V_{GS,B} - V_T)^2 (1 + \lambda \cdot V_{DS,B})}$$

$$\frac{310}{300} = \frac{(1 + \lambda \cdot 2.4)}{(1 + \lambda \cdot 1.8)}$$

$$\lambda = 0.0617V^{-1}$$

γ It shouldn't be a surprise, but that leaves us to keep almost everything constant except for V<sub>BS</sub>.

| Point | V <sub>BS</sub> | V <sub>GS</sub> | V <sub>DS</sub> | I <sub>D</sub> | Operating Region |
|-------|-----------------|-----------------|-----------------|----------------|------------------|
| A     | 1.0V            | 2.0V            | 1.2V            | 105uA          | saturation       |
| B     | 0.0V            | 2.0V            | 1.2V            | 150uA          | saturation       |

12 cont'd

$$I_{D,A} = \frac{1}{2} k_p \left( \frac{W}{L} \right) (V_{GS,A} - V_T)^2 (1 + \lambda \cdot V_{DS,A})$$

$$I_{D,B} = \frac{1}{2} k_p \left( \frac{W}{L} \right) (V_{GS,B} - V_{T0})^2 (1 + \lambda \cdot V_{DS,B})$$

$$\frac{105}{150} = \frac{(2.0 - V_T)^2}{(2.0 - 0.64)^2}$$

$V_T = 0.862V$

Now solve for  $\gamma$  using the following equation:

$$V_T - V_{T0} = \gamma \left( \sqrt{|V_{SB} - 2\Phi_F|} - \sqrt{|-2\Phi_F|} \right)$$

$$0.862 - 0.64 = \gamma \left( \sqrt{|1 + 0.6|} - \sqrt{0.6} \right)$$

$$\gamma = 0.453V^{1/2}$$
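The three ratio extractions above can be collected into a short script. This is only a sketch of the same hand method with illustrative function names; it assumes the long-channel saturation model used in the text and $|2\Phi_F| = 0.6$ V. The body-effect step uses the rounded $V_{T0} = 0.64$ V, as the text does.

```python
import math

def vt0_from_sat_pair(i_a, vgs_a, i_b, vgs_b):
    """V_T0 from two saturation points with equal V_DS and V_BS = 0."""
    s = math.sqrt(i_a / i_b)            # equals (vgs_a - V_T0) / (vgs_b - V_T0)
    return (s * vgs_b - vgs_a) / (s - 1.0)

def lambda_from_sat_pair(i_a, vds_a, i_b, vds_b):
    """lambda from two saturation points with equal V_GS."""
    r = i_a / i_b                       # equals (1 + l*vds_a) / (1 + l*vds_b)
    return (r - 1.0) / (vds_a - r * vds_b)

def gamma_from_body_pair(i_body, i_no_body, v_gs, v_t0, v_sb, two_phi_f=0.6):
    """V_T under body bias, then gamma, from two points differing only in V_SB."""
    v_t = v_gs - math.sqrt(i_body / i_no_body) * (v_gs - v_t0)
    gamma = (v_t - v_t0) / (math.sqrt(v_sb + two_phi_f) - math.sqrt(two_phi_f))
    return v_t, gamma

v_t0 = vt0_from_sat_pair(300e-6, 2.5, 160e-6, 2.0)
lam = lambda_from_sat_pair(310e-6, 2.4, 300e-6, 1.8)
v_t, gamma = gamma_from_body_pair(105e-6, 150e-6, v_gs=2.0, v_t0=0.64, v_sb=1.0)
print(f"V_T0 = {v_t0:.3f} V, lambda = {lam:.4f} 1/V, V_T = {v_t:.3f} V, gamma = {gamma:.3f}")
```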

12-1

a)  $I_{DS} = Q \cdot v$  ( $v$  = carrier velocity,  $Q$  = channel charge per unit length)  
 $Q = CV$  ( $V$  = voltage)

$$C = W \cdot C_{ox} \quad V = V_{GS} - V_t$$

$$I_{DS} = W C_{ox} (V_{GS} - V_t) v$$

b)  $v = \mu_n \cdot E$

$$E = \frac{(V_{GS} - V_t)}{2L}$$

$$I_{DS} = W C_{ox} (V_{GS} - V_t) \mu_n \frac{(V_{GS} - V_t)}{2L}$$

$$I_{DS} = \frac{\mu_n C_{ox}}{2} \left( \frac{W}{L} \right) (V_{GS} - V_t)^2$$

c)  $v = v_{max} = \text{constant}$

$$I_{DS} = W C_{ox} (V_{GS} - V_t) v_{max}$$

d)

1) $I_D \propto W$, not $\frac{W}{L}$

2) $I_D \propto (V_{GS} - V_t)^1$, not $(V_{GS} - V_t)^2$

15)

a)  $I_D = \frac{k'}{2} \frac{w}{L} (V_{GS}' - V_t)^2$

$$V_{GS}' = V_{GS} - I_D R_s$$

$$I_D = \frac{k'}{2} \frac{w}{L} \left[ (V_{GS} - V_t)^2 - 2(V_{GS} - V_t) I_D R_s + \underbrace{(I_D R_s)^2}_{\text{neglect}} \right]$$

$\frac{V_{GS}'}{R_s} \approx R_s$

13)  $V_{in} = 0.2 \Rightarrow$   
 $I_{DS} = 3 \times 10^{-8} A \quad ①$

or  
 $V_{in} = 0.2 \Rightarrow$   
 $I_{DS} = 5 \times 10^{-9} A \quad ②$

$$\Delta t = C \frac{\Delta V}{I}$$

$$\Delta t_1 = 1 \text{ pF} \times \frac{1}{3 \times 10^{-8}} = 33.3 \text{ } \mu\text{s}$$

$$\Delta t_2 = 1 \text{ pF} \times \frac{1}{5 \times 10^{-9}} = 200 \text{ } \mu\text{s}$$

15a cont'd

$$I_D \left[ 1 + K' \frac{w}{L} (V_{GS} - V_t) R_s \right] = \frac{K'}{2} \frac{w}{L} (V_{GS} - V_t)^2$$

$$\therefore I_D = \frac{1}{1 + K' \frac{w}{L} (V_{GS} - V_t) R_s} \cdot \frac{K'}{2} \frac{w}{L} (V_{GS} - V_t)^2$$

Comparing w/ given short channel equation reveals:

$$\frac{K' w}{L} (V_{GS} - V_t) R_s = \frac{(V_{GS} - V_t)}{E_{SAT} L}$$

$$R_s = \frac{1}{\mu_n C_{ox} W E_{SAT}}$$

b)  $E_{SAT} = 1.5 \text{ V}/\mu\text{m}$   $K' = 20 \mu\text{A}/\text{V}^2$

$$R_s = \frac{1}{w (1.5 \text{ V}/\mu\text{m}) (20 \mu\text{A}/\text{V}^2)}$$

$$R_s = \frac{33.33 \text{ k}\Omega}{w} \quad (\text{w in } \mu\text{m})$$

Independent of channel length

[16]

a) First let us write the resistance as a function of the output voltage

$$R(V) = \frac{V}{I(V)} = \frac{V}{k \cdot V \cdot e^{V/V_0}} = \frac{1}{k \cdot e^{V/V_0}}$$

Then, we need to average this resistance over the voltages of interest. A variant of the formula 3.42 in course notes can be written as

$$R_{eq} = \frac{1}{(V_2 - V_1)} \int_{V_1}^{V_2} R(v) dv$$

plugging the  $R(V)$  expression in and carrying out integral, we obtain

$$R_{eq} = \frac{1}{2V_0} \int_0^{2V_0} \frac{1}{k \cdot e^{v/V_0}} dv = \frac{1}{2V_0} \frac{-V_0}{k} (e^{-2V_0/V_0} - 1) = \frac{1}{2k} (1 - e^{-2}) = \frac{0.432}{k} \Omega$$

b) Again we should obtain  $R(v)$  by starting from the I-V relation.

Note that the device will be operating in velocity saturation régime. This can be seen by comparing  $V_{GS} - V_T = V_{DD} - V_T \approx 2.5 - 0.4 \approx 2.1$ ;  $V_{DSAT} \approx 0.6$  and  $V_{DS} > V_{DD}/2 = 1.25$ , where  $V_{DD} = 2.5 \text{ V}$  and  $V_T$  and  $V_{DSAT}$  from Table 3.2 were used.

In velocity saturation region :

$$I = k' W / L [(V_{DD} - V_T) V_{DSAT} - V_{DSAT}^2 / 2] (1 + \lambda V_{DS}) = I_{DSAT} (1 + \lambda V_{DS})$$

Where, we define  $I_{DSAT} = k' W / L [(V_{DD} - V_T) V_{DSAT} - V_{DSAT}^2 / 2]$ . Using this I-V<sub>DS</sub> relation we can write the integral.

$$R_{eq} = \frac{-2}{V_{DD} I_{DSAT}} \int_{V_{DD}}^{V_{DD}/2} \frac{V_{DS}}{1 + \lambda V_{DS}} dV_{DS}$$

Carrying out this integral we obtain

$$R_{eq} = 2 / (\lambda * V_{DD} * I_{DSAT}) * \{ V_{DD}/2 - 1/\lambda [\ln(1 + \lambda V_{DD}) - \ln(1 + \lambda V_{DD}/2)] \}$$

Now, we will replace the  $\ln(1+x)$ 's with their respective Taylor expansions.

$$\ln(1 + \lambda V_{DD}) \approx \{ \lambda V_{DD} - (\lambda V_{DD})^2/2 + (\lambda V_{DD})^3/3 \} \text{ and}$$

$$\ln(1 + \lambda V_{DD}/2) \approx \{ \lambda V_{DD}/2 - (\lambda V_{DD})^2/8 + (\lambda V_{DD})^3/24 \}$$

Subtracting these two expressions we get,  $\ln(1 + \lambda V_{DD}) - \ln(1 + \lambda V_{DD}/2) \approx \{ \lambda V_{DD}/2 - 3(\lambda V_{DD})^2/8 + 7(\lambda V_{DD})^3/24 \}$ .

Now let's insert this expression in the  $R_{eq}$  equation to get:

$$R_{eq} = 2 / (\lambda * V_{DD} * I_{DSAT}) * \{ V_{DD}/2 - V_{DD}/2 + 3\lambda V_{DD}^2/8 - 7\lambda^2 V_{DD}^3/24 \}$$

Bringing the expression  $3\lambda V_{DD}^2/8$  outside the curly brackets, we obtain

$$R_{eq} = 2 / (\lambda * V_{DD} * I_{DSAT}) * 3 \lambda V_{DD}^2/8 \{ 1 - 7\lambda V_{DD}/9 \} = (3/4) * (V_{DD}/I_{DSAT}) \{ 1 - 7\lambda V_{DD}/9 \}$$
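To see how good the Taylor-based result is, the closed-form integral, a direct numerical average, and the approximation can be compared. This sketch uses assumed illustrative values ($V_{DD} = 2.5$ V, $\lambda = 0.06\ V^{-1}$, and a normalized $I_{DSAT} = 1$); the function names are hypothetical. It also evaluates the constant from part a).

```python
import math

def req_exact(v_dd, i_dsat, lam):
    """Closed-form average of V/(I_DSAT*(1 + lam*V)) over V_DD/2 .. V_DD."""
    log_term = math.log(1.0 + lam * v_dd) - math.log(1.0 + lam * v_dd / 2.0)
    return 2.0 / (lam * v_dd * i_dsat) * (v_dd / 2.0 - log_term / lam)

def req_numeric(v_dd, i_dsat, lam, n=1000):
    """The same average by composite Simpson's rule (n must be even)."""
    a, b = v_dd / 2.0, v_dd
    h = (b - a) / n
    f = lambda v: v / (i_dsat * (1.0 + lam * v))
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0 / (b - a)

def req_approx(v_dd, i_dsat, lam):
    """Taylor-based result: (3/4)(V_DD/I_DSAT)(1 - 7*lam*V_DD/9)."""
    return 0.75 * v_dd / i_dsat * (1.0 - 7.0 * lam * v_dd / 9.0)

# ASSUMED illustrative values; I_DSAT is normalized to 1.
V_DD, I_DSAT, LAM = 2.5, 1.0, 0.06
exact = req_exact(V_DD, I_DSAT, LAM)
numeric = req_numeric(V_DD, I_DSAT, LAM)
approx = req_approx(V_DD, I_DSAT, LAM)
part_a_factor = (1.0 - math.exp(-2.0)) / 2.0      # coefficient of 1/k in part a)
print(f"exact={exact:.4f} numeric={numeric:.4f} approx={approx:.4f}")
```

For these values the approximation stays within a few percent of the exact average; the gap grows with $\lambda V_{DD}$ because the series is truncated.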

17

$$C_{ox} = 6 \text{ fF}/\mu\text{m}^2 \quad L_D = 0.5 \mu\text{m} \quad W_D = 1 \mu\text{m}$$

Cut-off: $C_g = C_{ox}WL + 2C_0W$  
Linear: $C_g = C_{ox}WL + 2C_0W$  
Sat. / Vel. sat.: $C_g = \frac{2}{3}C_{ox}WL + 2C_0W$

Diffusion Capacitance ( $C_d$ )

$$C_d = C_j L_D W_D + C_{jsw} (2L_D + W_D) : \text{diffusion cap}$$

$$C_j = \frac{C_{j0}}{\left(1 + \frac{V_{DS}}{\phi_b}\right)^{M_j}} \quad C_{jsw} = \frac{C_{jsw0}}{\left(1 + \frac{V_{DS}}{\phi_b}\right)^{M_{jsw}}}$$

$$a) \rightarrow V_{in} = 2.5V, V_{out} = 2.5V$$

Vel. saturation.

$$C_g = 1.62 \text{ fF} \quad Q = 4.05 \text{ fC} = 4.05 \times 10^{-15} \text{ C}$$

$$C_d = 0.827 \text{ fF}$$

$$\rightarrow V_{out} = 0.5V$$

Linear region

$$C_g = 2.12 \text{ fF} \quad Q = 5.3 \text{ fC} = 5.3 \times 10^{-15} \text{ C}$$

$$C_d = 1.263 \text{ fF}$$

$$\rightarrow V_{out} = 0V$$

Linear region

$$C_g = 2.12 \text{ fF} \quad Q = 5.3 \text{ fC} = 5.3 \times 10^{-15} \text{ C}$$

$$C_d = 1.56 \text{ fF}$$

$$b) V_{in} = 0 \Rightarrow \text{Cut off.}$$

Regardless of  $V_{DS}$ ,

$$C_g = C_{ox}WL + 2C_0W$$

$$C_g = 2.12 \text{ fF}, Q = 0$$

The  $C_d$  values are the same as in part a.

$$V_{out} = 2.5 \Rightarrow C_d = 0.827 \text{ fF}$$

$$V_{out} = 0.5 \Rightarrow C_d = 1.263 \text{ fF}$$

$$V_{out} = 0 \Rightarrow C_d = 1.56 \text{ fF}$$

18) a) The value of  $C_T$  can change after  $V_g = V_T$ 

$$\text{for } V_g: 0 \rightarrow V_T \Rightarrow C_T = C_T(1)$$

$$V_g: V_T \rightarrow 2V_T \Rightarrow C_T = C_T(2)$$

$$\text{then. } t_1 = C_T(1) \frac{V_T}{I_{in}}$$

$$t_2 = C_T(2) \frac{(2V_T - V_T)}{I_{in}}$$

$$t = [C_T(1) + C_T(2)] \frac{V_T}{I_{in}}$$

b)  $C_{sb}$  and  $C_{db}$  do not contribute to the total gate capacitance. Until  $V_g = V_T$  the device is off and only  $C_{gb}$, plus the overlap capacitances, makes up  $C_T$. For  $V_T < V_g < 2V_T$,  $C_{gb}$  falls to zero and, with the device in velocity saturation,  $C_{gd}$  and  $C_{gs}$  add up to  $C_T$. Thus,

$$0 < V_g < V_T \Rightarrow C_T = C_T(1) = C_{ox}WL + W(C_{GDO} + C_{GSO})$$

$$V_T < V_g < 2V_T \Rightarrow C_T = C_T(2) = \frac{2}{3}C_{ox}WL + W(C_{GDO} + C_{GSO})$$

c) This time the device is completely off at all times.  $C_{gs}$,  $C_{gb}$, and  $C_{sb}$  have no connection to the drain node, so they do not contribute to  $C_T$.

Only the overlap component of  $C_{gd}$  and a varying  $C_{db}$  make up  $C_T$  in this case:

$$C_T = WC_{GDO} + K_{eq} C_{j0} + K_{eqsw} C_{jsw0}$$

$$C_{j0} = C_j AD$$

$$C_{jsw0} = C_{jsw} PD$$

$$C_T = WC_{GDO} + K_{eq} C_j AD + K_{eqsw} C_{jsw} PD$$

$$K_{eq} = \frac{-P_B^{M_j}}{2V_T(1-M_j)} \left[ (P_B - 2V_T)^{(1-M_j)} - (P_B)^{(1-M_j)} \right]$$

$$K_{eqsw} = \frac{-P_B^{M_{jsw}}}{2V_T(1-M_{jsw})} \left[ (P_B - 2V_T)^{(1-M_{jsw})} - (P_B)^{(1-M_{jsw})} \right]$$

19

[Plots of V(D) and V(S)]

20 a) Minimum:  $K_n' = 16.66 \mu A/V^2$   
 $V_t = 0.765V \quad (W/L_{eff}) = (19.7/5.0)$

Nominal:

$$K_n' = 19.60 \mu A/V^2 \quad V_t = 0.740V$$

$$(W/L_{eff}) = (20/4.7)$$

Maximum:

$$K_n' = 22.54 \mu A/V^2 \quad V_t = 0.715V$$

$$(W/L_{eff}) = (20.3/4.4)$$

$$V_{gs} = 0$$

$$I_{min} = I_{nom} = I_{max} = 0$$

$$V_{gs} = 2.5V \quad (\text{sat})$$

$$I_{min} = 98.7 \mu A$$

$$I_{nom} = 129 \mu A$$

$$I_{max} = 165 \mu A$$

$$V_{gs} = 5.0V \quad (\text{triode})$$

$$I_{min} = 398 \mu A$$

$$I_{nom} = 438 \mu A$$

$$I_{max} = 471 \mu A$$

b) $V_{out,min}$ corresponds to $I_{max}$ with $R = 8800 \Omega$; $V_{out,max}$ corresponds to $I_{min}$ with $R = 7200 \Omega$.

$$V_{gs} = 0 \rightarrow V_{out} = 5V$$

(sat)  $V_{gs} = 2.5V; V_{out} = V_{DD} - IR$

$$I_{min} = 98 \mu A, R_{min} \Rightarrow V_{out,max} = 4.29V$$

$$I_{nom} = 129 \mu A, R_{nom} \Rightarrow V_{out,nom} = 3.97V$$

$$I_{max} = 165 \mu A, R_{max} \Rightarrow V_{out,min} = 3.55V$$

$$V_{gs} = 5V \quad (\text{triode})$$

$$R_{min} \Rightarrow V_{out,max} = 1.97V$$

$$R_{nom} \Rightarrow V_{out,nom} = 1.50V$$

$$R_{max} \Rightarrow V_{out,min} = 1.14V$$
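The corner calculation can be reproduced with the long-channel equations. The sketch below assumes $V_{DD} = 5$ V and no channel-length modulation, and pairs each device corner with the resistor value that pushes $V_{out}$ in the same direction (weak device with 7200 Ω, strong device with 8800 Ω), as in part b). The dictionary layout and function names are illustrative only.

```python
import math

V_DD = 5.0
CORNERS = {
    # name: (k' in A/V^2, V_t in V, W/L_eff, R in ohm)
    "min": (16.66e-6, 0.765, 19.7 / 5.0, 7200.0),
    "nom": (19.60e-6, 0.740, 20.0 / 4.7, 8000.0),
    "max": (22.54e-6, 0.715, 20.3 / 4.4, 8800.0),
}

def i_sat(k, v_t, wl, v_gs):
    """Saturation current without channel-length modulation."""
    return 0.5 * k * wl * (v_gs - v_t) ** 2

def v_out_triode(k, v_t, wl, r, v_gs):
    """Solve (V_DD - V)/r = k*wl*((v_gs - v_t)*V - V^2/2) for the smaller root V."""
    beta = k * wl * r
    a = 0.5 * beta
    b = -(beta * (v_gs - v_t) + 1.0)
    c = V_DD
    return (-b - math.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

results = {}
for name, (k, v_t, wl, r) in CORNERS.items():
    current = i_sat(k, v_t, wl, 2.5)
    results[name] = (current, V_DD - current * r, v_out_triode(k, v_t, wl, r, 5.0))
    print(f"{name}: I_sat={current * 1e6:.1f} uA, "
          f"Vout(2.5V)={results[name][1]:.2f} V, Vout(5V)={results[name][2]:.2f} V")
```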

\* PROBLEM 2.15

```

.par w1 = 20u
.par l1 = 5u
vdd vdd 0 dc 5
R vdd out R1
m1 out in 0 0 nmos w=w1 l=l1
vin in 0 dc 0
.DATA d1
w1 l1 kpa vt0 R1
+19.7u 5.3u 9.20068E-05 0.768469 7200
+20.0u 5.0u 8.00059E-05 0.743469 8000
+20.3u 4.7u 6.80050E-05 0.718469 8800
.ENDDATA
* SPICE LEVEL 2 Model for MOSIS 1.2 mu Process
.MODEL NMOS NMOS LEVEL=2 LD=0.15U TOX=200.0E-10
+ NSUB=5.36726E+15 VTO=vt0 KP=kpa GAMMA=0.543
+ PHI=0.6 UO=655.881 UEXP=0.157282 UCRIT=31443.8
+ DELTA=2.39824 VMAX=55260.9 XJ=0.25U LAMBDA=0.0367072
+ NFS=1E+12 NEFF=1.001 NSS=1E+11 TPG=1.0 RSH=70.00
+ CGDO=4.3E-10 CGSO=4.3E-10 CJ=0.0003 MJ=0.6585
+ CJSW=8.0E-10 MJSW=0.2402 PB=0.58
* Weff = WDrawn - Delta_W
* The suggested Delta_W is 1.9970E-07
.dc vin 0 5 2.5 sweep data=d1
.print v(out) i(vdd) i(vdd2)
.option post nomod
.end

```

OUTPUT:

Data index#1 :

| volt    | voltage out | current    |
|---------|-------------|------------|
| 0.      | 5.0000      | -10.6299p  |
| 2.50000 | 2.5350      | -342.3599u |
| 5.00000 | 688.5851m   | -598.8076u |

Data index#2 (nominal):

| volt    | voltage out | current    |
|---------|-------------|------------|
| 0.      | 5.0000      | -10.9104p  |
| 2.50000 | 2.3750      | -328.1237u |
| 5.00000 | 658.0161m   | -542.7480u |

Data index#3:

| volt    | voltage out | current    |
|---------|-------------|------------|
| 0.      | 5.0000      | -11.3080p  |
| 2.50000 | 2.2784      | -309.2724u |
| 5.00000 | 646.0152m   | -494.7710u |

- (21) a)  $S = \frac{0.18\mu}{0.12\mu} = 1.5$   
Fixed voltage scaling  
 $A' = \frac{A}{(1.5)^2} = \frac{0.7\text{mm}^2}{2.25} = 0.311\text{mm}^2$   
 $P' = \frac{0.4\text{mW}}{1\text{MHz}} \times 100\text{MHz} = 40\text{mW}$   
 $\left(\frac{P}{A}\right)' = \frac{40\text{mW}}{0.311\text{mm}^2} = 128.6 \text{ mW/mm}^2$
- b) General scaling  
 $U = \frac{1.8}{1.5} = 1.2$   
 $P' = \frac{40\text{mW}}{(1.2)^2} = 27.78\text{mW}$   
 $\left(\frac{P}{A}\right)' = 89.29 \text{ mW/mm}^2$
- c)  $f' = S f = 100\text{MHz} \cdot 1.5 = 150\text{MHz}$   
Assuming dynamic power dominates  
 $P_{150}' = P_{100}' S = (27.78\text{mW}) (1.5)$   
 $= 41.67 \text{ mW}$
- $\left(\frac{P_{150}}{A}\right)' = \left(\frac{P_{100}}{A}\right)' S = 89.29 \times 1.5$   
 $= 133.9 \text{ mW/mm}^2$
- d)  $\left(\frac{P_{150}}{A}\right)' = \left(\frac{P_{100}}{U^2}\right) \left(\frac{S^2}{A}\right) S = \frac{P_{100}}{A}$  
 $\Rightarrow U^2 = S^3$  
 $U = S^{3/2} = (1.5)^{3/2} = 1.837$  
 $U = \frac{1.8}{V'} = 1.837 \Rightarrow V' \approx 1.0\text{V}$

- (22) a)  $S = 0.25/0.1 = 2.5$   
Speed scales inversely with  $t_p$,  
which scales as  $1/S^2 \Rightarrow$  speed scales with  $S^2$, so  $f = 625\text{MHz}$.
- Power scales  $\propto S \Rightarrow P = 25\text{W}$
- b) in full scaling speed scales with  $1/S^2$ . Power scales as  $1/S^2$  & thus  $P = 1.6\text{W}$
- c) We want to keep power constant  
 $\frac{S}{U^3} = 1 \Rightarrow U = S^{1/3} = (2.5)^{1/3} = 1.36$   
 $\Rightarrow$  Voltage becomes  $V = 1.842\text{V}$   
speed scales as  $S^2/U = 4.6$   
 $f = 460\text{MHz}$
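The constant-power case in part c) reduces to three one-line relations. The sketch below assumes a starting point of 2.5 V and 100 MHz, which is inferred from the quoted results rather than stated in this solution, and uses an illustrative function name.

```python
def constant_power_scaling(s, v_old, f_old):
    """General scaling with power held constant: S / U^3 = 1."""
    u = s ** (1.0 / 3.0)            # voltage scaling factor
    v_new = v_old / u               # scaled supply voltage
    f_new = f_old * s ** 2 / u      # speed scales as S^2 / U
    return u, v_new, f_new

# ASSUMED starting point: 2.5 V supply and 100 MHz clock.
u, v_new, f_new = constant_power_scaling(s=2.5, v_old=2.5, f_old=100e6)
print(f"U = {u:.2f}, V = {v_new:.3f} V, f = {f_new / 1e6:.0f} MHz")
```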

# CHAPTER



## THE CMOS INVERTER

*Quantification of integrity, performance, and energy metrics of an inverter  
Optimization of an inverter design*

- 5.1 Exercises and Design Problems
- 5.2 The Static CMOS Inverter — An Intuitive Perspective
- 5.3 Evaluating the Robustness of the CMOS Inverter: The Static Behavior
  - 5.3.1 Switching Threshold
  - 5.3.2 Noise Margins
  - 5.3.3 Robustness Revisited
- 5.4 Performance of CMOS Inverter: The Dynamic Behavior
  - 5.4.1 Computing the Capacitances
  - 5.4.2 Propagation Delay: First-Order Analysis
  - 5.4.3 Propagation Delay from a Design Perspective
- 5.5 Power, Energy, and Energy-Delay
  - 5.5.1 Dynamic Power Consumption
  - 5.5.2 Static Consumption
  - 5.5.3 Putting It All Together
  - 5.5.4 Analyzing Power Consumption Using SPICE
- 5.6 Perspective: Technology Scaling and its Impact on the Inverter Metrics

## 5.1 Exercises and Design Problems

1. [M, SPICE, 3.3.2] The layout of a static CMOS inverter is given in Figure 5.1. ( $\lambda = 0.125 \mu\text{m}$ ).

- a. Determine the sizes of the NMOS and PMOS transistors.

**Solution**

The sizes are  $w_n=1.0\mu\text{m}$ ,  $l_n=0.25\mu\text{m}$ ,  $w_p=0.5\mu\text{m}$ , and  $l_p=0.25\mu\text{m}$ .

- b. Plot the VTC (using HSPICE) and derive its parameters ( $V_{OH}$ ,  $V_{OL}$ ,  $V_m$ ,  $V_{IH}$ , and  $V_{IL}$ ).

**Solution**

The inverter VTC is shown below. For a static CMOS inverter with a supply voltage of 2.5 V,  $V_{OH}=2.5$  V and  $V_{OL}=0$  V. In order to calculate  $V_m$ , note from the VTC that the value is between 0.8 V and 0.9 V. Therefore, the NMOS is saturated and the PMOS is velocity saturated. Let  $V_{in}=V_{out}=V_m$  and set the currents equal to obtain the following equation:

$$(k_n/2)(V_{GS}-V_{TN})^2(1+\lambda V_{DS})=k_p V_{DSAT}[(V_{GS}-V_{TP})-(V_{DSAT}/2)](1+\lambda V_{DS})$$

Substitute the appropriate values and solve numerically to find  $V_m=0.883$  V.

Use the VTC data to solve for  $V_{IL}$  and  $V_{IH}$  numerically. The result is that  $V_{IH}=0.97$  V and  $V_{IL}=0.56$  V.



- c. Is the VTC affected when the output of the gates is connected to the inputs of 4 similar gates?

**Solution**

No. CMOS gates are a purely capacitive load so the DC circuit characteristics are not affected.



**Figure 5.1** CMOS inverter layout.

- d. Resize the inverter to achieve a switching threshold of approximately 0.75 V. Do not layout the new inverter, use HSPICE for your simulations. How are the noise margins affected by this modification?

**Solution**

Changing the NMOS sizing to  $w_n=2.0\mu\text{m}$  moves the switching threshold to 0.75 V. This increases  $NM_H$  and decreases  $NM_L$ .

2. Figure 5.2 shows a piecewise linear approximation for the VTC. The transition region is approximated by a straight line with a slope equal to the inverter gain at  $V_M$ . The intersection of this line with the  $V_{OH}$  and the  $V_{OL}$  lines defines  $V_{IH}$  and  $V_{IL}$ .
  - a. The noise margins of a CMOS inverter are highly dependent on the sizing ratio,  $r = k_p/k_n$ , of the NMOS and PMOS transistors. Use HSPICE with  $V_{Tn} = |V_{Tp}|$  to determine the value of  $r$  that results in equal noise margins. Give a qualitative explanation.

**Solution**

The TSMC 0.25μm models were used for simulation, and the threshold voltages of the NMOS and PMOS devices are nearly equal in this process. A value near  $r=1$  should result in equal noise margins, since the transistors will be closely matched. HSPICE showed that the resulting noise margins for this sizing were  $NM_H=0.97$  V and  $NM_L=1.1$  V. The mismatch is due to the fact that the PMOS threshold voltage is actually slightly lower, so the PMOS is stronger and the upper noise margin is reduced. The actual value that results in equal noise margins is  $r=0.83$ .

- b. Section 5.3.2 of the text uses this piecewise linear approximation to derive simplified expressions for  $NM_H$  and  $NM_L$  in terms of the inverter gain. The derivation of the gain is based on the assumption that both the NMOS and the PMOS devices are velocity saturated at  $V_M$ . For what range of  $r$  is this assumption valid? What is the resulting range of  $V_M$ ?

**Solution**

Using the equations for finding the region of operation, it can be shown that the PMOS and NMOS are both velocity saturated only while the switching threshold is between 1.06 V and 1.10 V. Since this range may be considered inclusive, we can assume that both devices are velocity saturated and set the currents equal with  $V_{IN}=V_{OUT}=V_M$  to find  $k_p/k_n$ . The result is that  $k_p/k_n$  must be between 0.34 and 0.41. This result can be checked by sizing the devices accordingly and testing the resulting  $V_M$  in HSPICE. The result gives a range of 1.04 V to 1.09 V. This makes sense, because the NMOS must be much stronger than the PMOS to achieve a switching threshold near 1 V.

- c. Derive expressions for the inverter gain at  $V_M$  for the cases when the sizing ratio is just above and just below the limits of the range where both devices are velocity saturated. What are the operating regions of the NMOS and the PMOS for each case? Consider the effect of channel-length modulation by using the following expression for the small-signal resistance in the saturation region:  $r_{o,sat} = 1/(\lambda I_D)$ .



**Figure 5.2** A different approach to derive  $V_{IL}$  and  $V_{IH}$ .

**Solution:**

When  $V_M$  is slightly larger than 1.1 V, the NMOS is velocity saturated and the PMOS is saturated. When  $V_M$  is slightly smaller than 1.06 V, the PMOS is velocity saturated and the NMOS is saturated. Section 5.3.2 of the text shows this derivation for the case when both devices are velocity saturated. These derivations can be completed by substituting the correct current equations and using the same method. The results are as follows:

For the case when the NMOS is saturated and the PMOS is velocity saturated:

$$\frac{dV_{out}}{dV_{in}} = -\frac{k_n(V_{in} - V_{tn})(1 + \lambda_n V_{out}) + k_p V_{DSATP}(1 + \lambda_p(V_{out} - V_{DD}))}{\frac{k_n \lambda_n}{2}(V_{in} - V_{tn})^2 + k_p V_{DSATP} \lambda_p (V_{in} - V_{DD} - V_{tp} - \frac{V_{DSATP}}{2})}$$

Dropping the second order terms in the numerator, substituting  $V_m$  for  $V_{in}$ , and simplifying the denominator leads to the following expression for the gain:

$$\frac{dV_{out}}{dV_{in}} = -\frac{k_n(V_m - V_{tn}) + k_p V_{DSATP}}{I_D(V_m)(\lambda_n - \lambda_p)}$$

For the case when the NMOS is velocity saturated and the PMOS is saturated:

$$\frac{dV_{out}}{dV_{in}} = -\frac{k_n V_{DSATN}(1 + \lambda_n V_{out}) + k_p(V_{in} - V_{DD} - V_{tp})(1 + \lambda_p(V_{out} - V_{DD}))}{k_n V_{DSATN} \lambda_n \left( V_{in} - V_{tn} - \frac{V_{DSATN}}{2} \right) + \frac{k_p \lambda_p}{2} (V_{in} - V_{DD} - V_{tp})^2}$$

Again, dropping the second order terms in the numerator, substituting  $V_m$  for  $V_{in}$ , and simplifying the denominator leads to the following expression for the gain:

$$\frac{dV_{out}}{dV_{in}} = -\frac{k_n V_{DSATN} + k_p (V_m - V_{DD} - V_{tp})}{I_D (V_m) (\lambda_n - \lambda_p)}$$

3. [M, SPICE, 3.3.2] Figure 5.3 shows an NMOS inverter with a resistive load.

- a. Qualitatively discuss why this circuit behaves as an inverter.

**Solution**

For  $V_{IN} < V_T$ , M1 is in the cutoff regime, thus  $I=0$  and  $V_{out}=2.5V$ . For  $V_{IN} > V_T$ , M1 is conducting and  $V_{out}=2.5V - (I \cdot R)$ . This in turn gives a low  $V_{out}$  and the input signal is inverted.

- b. Find  $V_{OH}$  and  $V_{OL}$ , and calculate  $V_{IH}$  and  $V_{IL}$ .

**Solution**

Assuming negligible leakage, when  $V_{in} < V_T$ , transistor M1 is off and  $V_{OH}=2.5V$ . For  $V_{in}=2.5V$ , assume M1 is in the linear region, and because  $V_{DS}$  is negligible in the linear region, channel-length modulation can be ignored. For the linear region,  $V_{min}=V_{DS}=V_{out}=V_{OL}=46.25\text{ mV}$ . Checking the assumption:  $V_{GT}=2.07V$ ,  $V_{DSat}=0.63V$ , and  $V_{DS}=46.25\text{ mV}$ ; thus, M1 was correctly assumed to be in the linear region.

To find  $V_M$ , set the resistor current equal to the NMOS current, with an input and output voltage of  $V_M$ :

$$\frac{2.5 - V_M}{75k} = k_n \frac{(V_M - 0.43)^2}{2} (1 + 0.06 V_M)$$

Thus,  $V_M = 0.79V$ .

To find  $V_{IL}$  and  $V_{IH}$ , the slope of the VTC, at  $V_M$ , is derived and the line is extrapolated out to  $V_{OH}$  and  $V_{OL}$  respectively. Ignoring the effects of channel length modulation, the slope is given by the following:

$$\frac{dV_o}{dV_{in}} = -\frac{R_L k_n W}{2L} (2V_{in} - 0.86)$$

Plugging $V_M = 0.79V$ into the slope equation above gives a slope of $-9.32$. Extrapolating the line back to $V_{OH}$ gives $V_{IL}=0.607V$, and extrapolating it to $V_{OL}$ gives $V_{IH}=0.87V$.
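The extrapolation in part (b) can be checked numerically. The sketch below is illustrative only: it assumes $k_n' W/L = 345\mu A/V^2$, a value that is not stated in this section but reproduces the quoted slope magnitude of about 9.32, and it takes $V_M$, $V_{OH}$ and $V_{OL}$ from the solution above.

```python
# Linear extrapolation of the VTC around V_M (Problem 3b).
# ASSUMPTION: K = k_n' * W/L = 345 uA/V^2 is not given in this section;
# it is chosen because it reproduces the quoted slope magnitude (~9.32).
R_L = 75e3                   # load resistance (ohm)
K = 345e-6                   # assumed k_n' * W/L (A/V^2)
V_T = 0.43                   # NMOS threshold (V)
V_OH, V_OL = 2.5, 46.25e-3   # output levels from the solution (V)
V_M = 0.79                   # switching threshold from the solution (V)

# dVo/dVin = -(R_L * K / 2) * (2*Vin - 2*V_T), evaluated at Vin = V_M
gain = -(R_L * K / 2.0) * (2.0 * V_M - 2.0 * V_T)

# Extend the tangent line at V_M until it reaches V_OH and V_OL
V_IL = V_M - (V_OH - V_M) / abs(gain)
V_IH = V_M + (V_M - V_OL) / abs(gain)

print(f"gain = {gain:.2f}, V_IL = {V_IL:.3f} V, V_IH = {V_IH:.3f} V")
```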

- c. Find  $NM_L$  and  $NM_H$ , and plot the VTC using HSPICE.

**Solution**

$$NM_L = V_{IL} - V_{OL} = 0.607V - 0.046V = 0.561V \text{ and } NM_H = V_{OH} - V_{IH} = 2.5V - 0.87V = 1.63V$$



- d. Compute the average power dissipation for: (i)  $V_{in} = 0 \text{ V}$  and (ii)  $V_{in} = 2.5 \text{ V}$



**Figure 5.3** Resistive-load inverter

**Solution**

- (i) $V_{in}=0$ means M1 is cut off; therefore, $I_{VDD}=0$ and consequently $P_{VDD}=0$.
- (ii) $V_{in}=2.5 \text{ V}$, $V_{out}=V_{OL}=46.25 \text{ mV}$,

$$I_{VDD} = \frac{\Delta V}{R} = \frac{2.5 - 46.25 \text{ m}}{75 \text{ k}} = 32.7 \mu\text{A}$$

$$P = V_{DD} \cdot I_{VDD} = 2.5 \text{ V} \cdot 32.7 \mu\text{A} = 81.75 \mu\text{W}$$

- e. Use HSPICE to sketch the VTCs for  $R_L = 37 \text{ k}$ ,  $75 \text{ k}$ , and  $150 \text{ k}$  on a single graph.

**Solution**



- f. Comment on the relationship between the critical VTC voltages (i.e.,  $V_{OL}$ ,  $V_{OH}$ ,  $V_{IL}$ ,  $V_{IH}$ ) and the load resistance,  $R_L$ .

**Solution**

As  $R_L$  increases, the VTC curve becomes more ideal for the following reasons:  $V_{OL}$  decreases,  $NM_L$  increases,  $V_{IH}$  decreases, and  $NM_H$  increases. However, these come as tradeoffs because, as  $R_L$  increases,  $V_{IL}$  decreases, which is less ideal, and  $V_{OH}$  remains unchanged.

- g. Do high or low impedance loads seem to produce more ideal inverter characteristics?

**Solution**

As the load impedance increases, there is a tradeoff: the inverter VTC becomes more ideal, with a higher gain and thus better noise margins. However, the VTC is shifted in favor of M1, and the switching threshold is lowered as the VTC moves to the left.

4. [E, None, 3.3.3] For the inverter of Figure 5.3 and an output load of 3 pF:

- a. Calculate  $t_{plh}$ ,  $t_{phl}$ , and  $t_p$ .

**Solution**

$$t_{pLH}=0.69R_LC_L=155 \text{ nsec.}$$

For $t_{pHL}$: First calculate $R_{on}$ for $V_{out}=2.5V$ and $1.25V$. At $V_{out}=2.5V$, $I_{DVsat}=0.439\text{mA}$, giving $R_{on}=5695\Omega$; at $V_{out}=1.25V$, $I_{DVsat}=0.41\text{mA}$, giving $R_{on}=3049\Omega$.

Thus, the average resistance between $V_{out}=2.5\text{V}$ and $V_{out}=1.25\text{V}$ is $R_{average}=4.372\text{k}\Omega$.  
 $t_{pHL}=0.69R_{average}C_L=9.05\text{nsec.}$

$$t_p=\text{av}\{t_{pLH}, t_{pHL}\}=82.0\text{nsec}$$
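As a quick arithmetic check of part (a), the following sketch recomputes both delays from the resistor value, the load capacitance and the two drain currents quoted above. The variable names are illustrative.

```python
# Propagation delays of the resistive-load inverter (Problem 4a).
R_L = 75e3      # pull-up resistor (ohm)
C_L = 3e-12     # load capacitance (F)

# Rising output: the capacitor charges through R_L
t_pLH = 0.69 * R_L * C_L

# Falling output: equivalent on-resistance of M1 at the start (2.5 V)
# and at the midpoint (1.25 V), from the drain currents in the solution
R_start = 2.5 / 0.439e-3
R_mid = 1.25 / 0.41e-3
R_avg = (R_start + R_mid) / 2.0

t_pHL = 0.69 * R_avg * C_L
t_p = (t_pLH + t_pHL) / 2.0

print(f"t_pLH = {t_pLH * 1e9:.1f} ns, t_pHL = {t_pHL * 1e9:.2f} ns, "
      f"t_p = {t_p * 1e9:.1f} ns")
```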

- b. Are the rising and falling delays equal? Why or why not?

**Solution**

$t_{pLH} \gg t_{pHL}$ because $R_L=75\text{k}\Omega$ is much larger than the effective linearized on-resistance of M1.

- c. Compute the static and dynamic power dissipation assuming the gate is clocked as fast as possible.

**Solution**

Static Power:

$V_{IN}=V_{OL}$  gives  $V_{out}=V_{OH}=2.5V$ , thus  $I_{VDD}=0A$  so  $P_{VDD}=0W$ .  
 $V_{IN}=V_{OH}$  gives  $V_{out}=V_{OL}=46.3mV$ , which is in the linear region.

Calculating the current through M1 gives $I_{VDD}=32.8\mu A \rightarrow P_{VDD}=82\mu W$

Dynamic Power:

$$P_{dyn}=C_L \Delta V * V_{dd} * f_{max} = 3pF * (2.5V - 46.3mV) * 2.5V * 12.2MHz = 0.225mW$$

5. The next figure shows two implementations of MOS inverters. The first inverter uses only NMOS transistors.

- a. Calculate  $V_{OH}$ ,  $V_{OL}$ ,  $V_M$  for each case.



Figure 5.4 Inverter Implementations

**Solution**

Circuit A.

$V_{OH}$ : We calculate  $V_{OH}$ , when M1 is off. The threshold for M2 is:

$$V_T = V_{T0} + \gamma \cdot (\sqrt{|-2\phi_F + V_{SB}|} - \sqrt{|-2\phi_F|}), \quad V_{SB} = V_{OUT}, \quad |-2\phi_F| = 0.6V$$

and M2 will be off when:  $V_{GS} - V_T = V_{DD} - V_{OUT} - V_T = 0$ ,  
 Substitute  $V_T$  in the last equation and solve for  $V_{OUT}$ .

$$V_{DD} - V_{OUT} - V_T = 2.5 - V_{OUT} - (0.43 + 0.4 \cdot (\sqrt{0.6 + V_{OUT}} - \sqrt{0.6})) = 0$$

We get  $V_{OUT} = V_{OH} = 1.765V$
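The implicit equation for $V_{OH}$ has no closed-form shortcut worth using by hand, so a numerical root finder is convenient. The sketch below uses plain bisection with the parameter values from the equation above; the function names are illustrative.

```python
import math

def residual(v_out, v_dd=2.5, v_t0=0.43, gamma=0.4, two_phi_f=0.6):
    """V_DD - V_OUT - V_T(V_OUT); zero when M2 just turns off."""
    v_t = v_t0 + gamma * (math.sqrt(two_phi_f + v_out) - math.sqrt(two_phi_f))
    return v_dd - v_out - v_t

def solve_v_oh(lo=0.0, hi=2.5, tol=1e-9):
    """Bisection: the residual is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

V_OH = solve_v_oh()
print(f"V_OH = {V_OH:.3f} V")
```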

$V_{OL}$ : To calculate  $V_{OL}$ , we set  $V_{IN} = V_{DD} = 2.5V$ .

We expect  $V_{OUT}$  to be low, so we can make the assumption that M2 will be velocity saturated and M1 will be in the linear region.

$$\text{For M2: } I_{D2} = k_n \cdot \frac{W_2}{L_2} \cdot \left( (V_{GS} - V_T) \cdot V_{DSAT} - \frac{V_{DSAT}^2}{2} \right) \cdot (1 + \lambda V_{DS}) \text{ and}$$

$$\text{for M1: } I_{D1} = k_n \cdot \frac{W_1}{L_1} \cdot \left( (V_{GS} - V_{T0}) \cdot V_{DS} - \frac{V_{DS}^2}{2} \right)$$

Setting  $I_{D1} = I_{D2}$ , we get an equation and we solve for  $V_{\text{OUT}}$ . We get:  $V_{\text{OUT}} = V_{\text{OL}} = 0.263\text{V}$ , so our assumption holds.

**$V_M$ :** To calculate  $V_M$  we set  $V_M = V_{\text{IN}} = V_{\text{OUT}}$ .

Assuming that both transistors are velocity saturated, then we have the next pair of equations:

$$I_{D1} = k_n \cdot \frac{W_1}{L_1} \cdot \left( (V_M - V_{T0}) \cdot V_{DSAT} - \frac{V_{DSAT}^2}{2} \right) \cdot (1 + \lambda V_M)$$

$$I_{D2} = k_n \cdot \frac{W_2}{L_2} \cdot \left( (V_{DD} - V_M - V_T) \cdot V_{DSAT} - \frac{V_{DSAT}^2}{2} \right) \cdot (1 + \lambda(V_{DD} - V_M))$$

Setting $I_{D1} = I_{D2}$, we get $V_M = 1.269\text{V}$.

#### Circuit B.

When  $V_{\text{IN}} = 0\text{V}$ , the NMOS transistor is off and the PMOS transistor is on and pulls  $V_{\text{OUT}}$  up to  $V_{\text{DD}}$ , so  $V_{\text{OH}} = 2.5$ . Similarly, when  $V_{\text{IN}} = 2.5\text{V}$ , the PMOS transistor is off and the NMOS transistor pulls  $V_{\text{OUT}}$  all the way down to ground, so  $V_{\text{OL}} = 0\text{V}$ .

To calculate  $V_M$  we set  $V_M = V_{\text{IN}} = V_{\text{OUT}}$ .

We assume that both transistors are velocity saturated. We get the following pair of equations.

$$I_{D4} = k_p \cdot \frac{W_4}{L_4} \cdot \left( (V_M - V_{DD} - V_{T0p}) \cdot V_{DSATp} - \frac{V_{DSATp}^2}{2} \right) \cdot (1 + \lambda_p (V_M - V_{DD}))$$

$$I_{D3} = k_n \cdot \frac{W_3}{L_3} \cdot \left( (V_M - V_{T0n}) \cdot V_{DSATn} - \frac{V_{DSATn}^2}{2} \right) \cdot (1 + \lambda_n V_M)$$

Setting $I_{D3} + I_{D4} = 0$, we get $V_M = 1.095\text{V}$.

So the assumption that both transistors were velocity saturated holds.

- b. Use HSPICE to obtain the two VTCs. You must assume certain values for the source/drain areas and perimeters since there is no layout. For our scalable CMOS process,  $\lambda = 0.125 \mu\text{m}$ , and the source/drain extensions are  $5\lambda$  for the PMOS; for the NMOS the source/drain contact regions are  $5\lambda \times 5\lambda$ .

#### **Solution**

The two VTCs are shown below.



Depletion Load Inverter



CMOS Inverter

- c. Find  $V_{IH}$ ,  $V_{IL}$ ,  $NM_L$  and  $NM_H$  for each inverter and comment on the results. How can you increase the noise margins and reduce the undefined region?

**Solution**

Circuit A

$$V_{IL} = 0.503V \Rightarrow V_{OUT1} = 1.65V, V_{IH} = 1.35V \Rightarrow V_{OUT2} = 0.588V$$

$$NM_H = V_{OH} - V_{OUT1} = 1.765 - 1.65 = 0.115V, NM_L = V_{OUT2} - V_{OL} = 0.588 - 0.23 = 0.358V$$

Circuit B

$$V_{IL} = 0.861V \Rightarrow V_{OUT1} = 2.33V, V_{IH} = 1.22V \Rightarrow V_{OUT2} = 0.219V$$

$$NM_H = V_{OH} - V_{IH} = 2.5V - 1.22V = 1.28V, NM_L = V_{IL} - V_{OL} = 0.861V - 0V = 0.861V$$

We can increase the noise margins by moving  $V_M$  closer to the middle of the output voltage swing.

- d. Comment on the differences in the VTCs, robustness and regeneration of each inverter.

**Solution**

It is clear from the two VTCs that the CMOS inverter is more robust, since its low and high noise margins are higher than those of the first inverter. The regeneration in the second inverter is also greater, since it provides rail-to-rail output and the gain of the inverter is much greater.

6. Consider the following NMOS inverter. Assume that the bulk terminals of all NMOS devices are connected to GND. Assume that the input IN has a 0V to 2.5V swing.



- a. Set up the equation(s) to compute the voltage on node  $x$ . Assume  $\gamma=0.5$ .

**Solution**

The voltage on node  $x$  is set to one threshold value  $V_T$  below  $V_{DD}$ . So:

$$\begin{aligned} V_X &= V_{DD} - V_T \\ V_X &= V_{DD} - [V_{T0} + \gamma(\sqrt{V_{SB} + |-2\phi_F|} - \sqrt{|-2\phi_F|})] \\ V_X &= 2.5 - [0.43 + 0.5(\sqrt{V_X + 0.6} - \sqrt{0.6})] \\ V_X &= 2.07 + 0.39 - 0.5\sqrt{V_X + 0.6} \\ V_X &= 2.46 - 0.5\sqrt{V_X + 0.6} \end{aligned}$$

which gives  $V_X = 1.7014V$ .
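Because $V_X$ appears on both sides, the equation is solved by iteration. The sketch below uses a simple fixed-point loop with the values from part (a). It keeps $0.5\sqrt{0.6}$ at full precision rather than rounding the constant to 2.46, so its result differs slightly from the hand value in the third decimal place.

```python
import math

def body_effect_vt(v_sb, v_t0=0.43, gamma=0.5, two_phi_f=0.6):
    """Threshold voltage including the body effect."""
    return v_t0 + gamma * (math.sqrt(v_sb + two_phi_f) - math.sqrt(two_phi_f))

def solve_vx(v_dd=2.5, guess=2.0, tol=1e-9, max_iter=100):
    """Iterate V_X = V_DD - V_T(V_X) until successive values agree."""
    v_x = guess
    for _ in range(max_iter):
        v_next = v_dd - body_effect_vt(v_x)
        if abs(v_next - v_x) < tol:
            return v_next
        v_x = v_next
    raise RuntimeError("fixed-point iteration did not converge")

V_X = solve_vx()
print(f"V_X = {V_X:.4f} V, V_T = {body_effect_vt(V_X):.4f} V")
```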

- b. What are the modes of operation of device M2? Assume  $\gamma=0$ .

**Solution**

$$\begin{aligned} V_X &= V_{DD} - V_T \\ V_{DS2} &= V_{DD} - V_{OUT} \\ V_{GS2} - V_T &= V_{DD} - V_T - V_{OUT} - V_T = V_{DD} - V_{OUT} - 2V_T \end{aligned}$$

This means that  $V_{DS2} > V_{GS2} - V_T$ , so M2 is either saturated (or vel. saturated) or cut off.

- c. What is the value on the output node  $OUT$  for the case when  $IN=0V$ ? Assume  $\gamma=0$ .

**Solution**

When  $IN=0$  then M1 is off and OUT will charge up to:

$$\begin{aligned} V_{out(max)} &= V_X - V_T \\ V_{out(max)} &= V_{DD} - V_T - V_T \\ V_{out(max)} &= V_{DD} - 2V_T \end{aligned}$$

- d. Assuming  $\gamma=0$ , derive an expression for the switching threshold ( $V_M$ ) of the inverter. Recall that the switching threshold is the point where  $V_{IN}=V_{OUT}$ . Assume that the device sizes for M1, M2 and M3 are  $(W/L)_1$ ,  $(W/L)_2$ , and  $(W/L)_3$  respectively. What are the limits on the switching threshold?

For this, consider two cases:

- i)  $(W/L)_1 \gg (W/L)_2$
- ii)  $(W/L)_2 \gg (W/L)_1$

**Solution**

Assuming that both devices are velocity saturated we can equate the currents when  $V_{IN}=V_{OUT}=V_M$ . This gives

$$k'_n \left( \frac{W}{L} \right)_1 \left( V_{GS1} - V_T - \frac{V_{DSAT}}{2} \right) = k'_n \left( \frac{W}{L} \right)_2 \left( V_{GS2} - V_T - \frac{V_{DSAT}}{2} \right)$$

$$\left( \frac{W}{L} \right)_1 \left( V_M - V_T - \frac{V_{DSAT}}{2} \right) = \left( \frac{W}{L} \right)_2 \left( V_{DD} - V_T - V_M - V_T - \frac{V_{DSAT}}{2} \right)$$

Solving for  $V_M$  and substituting  $r = \frac{(W/L)_2}{(W/L)_1}$  we get:

$$\left( V_M - V_T - \frac{V_{DSAT}}{2} \right) = r \left( V_{DD} - 2V_T - V_M - \frac{V_{DSAT}}{2} \right)$$

$$V_M = \frac{r \left( V_{DD} - 2V_T - \frac{V_{DSAT}}{2} \right) + V_T + \frac{V_{DSAT}}{2}}{1 + r}$$

To find the limits for  $V_M$  we check the two cases:

- i) When  $(W/L)_1 \gg (W/L)_2$ ,  $V_M = V_T + V_{DSAT}/2 = 0.43 + 0.63/2 = 0.745$
- ii) When  $(W/L)_2 \gg (W/L)_1$ ,  $V_M = V_{DD} - 2V_T - V_{DSAT}/2 = 1.325$

For both cases the assumptions for M1 and M2 are valid.
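The closed-form expression for $V_M$ can be evaluated for a few size ratios to see how it moves between the two limits. This is a small sketch using $V_{DD}=2.5V$, $V_T=0.43V$ and $V_{DSAT}=0.63V$ as in the limits above; the function name is illustrative.

```python
def switching_threshold(r, v_dd=2.5, v_t=0.43, v_dsat=0.63):
    """V_M for r = (W/L)_2 / (W/L)_1, both devices velocity saturated."""
    upper = v_dd - 2.0 * v_t - v_dsat / 2.0   # limit for r -> infinity
    lower = v_t + v_dsat / 2.0                # limit for r -> 0
    return (r * upper + lower) / (1.0 + r)

for r in (1e-6, 0.5, 1.0, 2.0, 1e6):
    print(f"r = {r:g}: V_M = {switching_threshold(r):.3f} V")
```

Very small and very large ratios approach the two limiting cases (i) and (ii), and every intermediate ratio gives a value between them.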

7. Consider the circuit in Figure 5.5. Device M1 is a standard NMOS device. Device M2 has all the same properties as M1, except that its device threshold voltage is *negative* and has a value of -0.4V. Assume that all the current equations and inequality equations (to determine the mode of operation) for the depletion device M2 are the same as a regular NMOS. Assume that the input  $IN$  has a 0V to 2.5V swing.



Figure 5.5 A depletion load NMOS inverter

- a. Device M2 has its gate terminal connected to its source terminal. If  $V_{IN} = 0V$ , what is the output voltage? In steady state, what is the mode of operation of device M2 for this input?

**Solution**

When  $V_{IN} = 0V$  then M1 is off. M2 is on since  $V_{GS} = 0 > V_{Tn2}$ . Since there is no current through M2, the drain to source voltage of M2 is 0 (linear mode). This means that  $V_{OUT} = 2.5V$ .

- b. Compute the output voltage for  $V_{IN} = 2.5V$ . You may assume that  $V_{OUT}$  is small to simplify your calculation. In steady state, what is the mode of operation of device M2 for this input?

**Solution**

We assume that M1 is in the linear mode and M2 is velocity saturated. This means:

$$k_{n1} \left[ (2.5 - 0.4) V_{out} - \frac{V_{out}^2}{2} \right] = k_{n2} \left[ (0 - (-0.4)) V_{Dsat} - \frac{V_{Dsat}^2}{2} \right]$$

Since  $V_{out}$  is small we can neglect the  $V_{out}^2/2$  term and the previous equation becomes

$$V_{out} = \frac{k_{n2} 0.05355}{k_{n1} 2.1}, \text{ which gives } V_{out} \approx 12mV$$

So our assumptions are valid.

- c. Assuming  $P_{IN=0}=0.3$ , what is the static power dissipation of this circuit?

**Solution**

There is static power dissipation when both transistors are on. This happens when the input is a logic 1 ($V_{IN}=2.5V$). The static power dissipation is then given by:

$$\begin{aligned} P_{static} &= P_{in=1} V_{DD} I_D \\ P_{static} &= (1 - 0.3) 2.5 \left( \frac{115 \mu A}{V^2} \frac{2}{1} \left( 0.4 \cdot 0.63 - \frac{0.63^2}{2} \right) \right) \\ P_{static} &= 21.55 \mu W \end{aligned}$$
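The static power figure follows directly from the velocity-saturated current of M2 weighted by the probability that the input is high. A short sketch of that arithmetic, using the numbers in the expression above:

```python
# Static power of the depletion-load inverter (Problem 7c).
K_PRIME = 115e-6     # process transconductance (A/V^2)
W_OVER_L_M2 = 2.0    # M2 aspect ratio used in the solution
V_T2 = -0.4          # depletion-device threshold (V)
V_DSAT = 0.63        # velocity-saturation voltage (V)
V_DD = 2.5           # supply (V)
P_IN_LOW = 0.3       # probability that the input is 0

# M2 has V_GS = 0, so its gate overdrive is simply -V_T2
v_gt2 = 0.0 - V_T2
i_d = K_PRIME * W_OVER_L_M2 * (v_gt2 * V_DSAT - V_DSAT**2 / 2.0)

# Current only flows while the input is high
p_static = (1.0 - P_IN_LOW) * V_DD * i_d
print(f"I_D = {i_d * 1e6:.2f} uA, P_static = {p_static * 1e6:.2f} uW")
```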

8. [M, None, 3.3.3] An NMOS transistor is used to charge a large capacitor, as shown in Figure 5.6.

- a. Determine the  $t_{pLH}$  of this circuit, assuming an ideal step from 0 to 2.5V at the input node.

**Solution**

To determine the rise time, an average current has to be calculated between the start of the transition with  $V_O=0V$  and midpoint of the transition.

At the start of the transition:  $V_O=V_{OL}=0V$ , M1 is velocity saturated and  $I_{Dsat}=1.46mA$ . To find the voltage swing,  $V_{OH}$  must be calculated using the body effect:

$$V_{gs} = 2.5V - V_{OH} = V_{tn} + \gamma(\sqrt{0.6 + V_{OH}} - \sqrt{0.6})$$

$V_{OH}=1.76V$ . The midpoint is thus,

$$\frac{V_{OH} - V_{OL}}{2} = 0.88V$$

and the threshold voltage at the midpoint is:  $V_T(V_{sb}=0.88V)=0.607V$ .

Using this threshold voltage, $V_{GT}=1.013V$, $V_{DS}=1.62V$, and $V_{DSAT}=0.63V$; thus, transistor M1 is still velocity saturated, giving $I_{Dsat}=49.17\mu A$.

Finding the average current between $V_O=0V$ and $V_O=0.88V$ gives: $I_{average}=0.756mA$.

$$t_p = \frac{C_L \Delta V}{I_{average}} = \frac{5pF \times 0.88V}{0.756mA} = 5.82nsec$$

- b. Assume that a resistor  $R_S$  of  $5\text{ k}\Omega$  is used to discharge the capacitance to ground. Determine  $t_{pHL}$ .

**Solution**

$$t_{pHL}=0.69 \cdot R_S C_L = 0.69 \cdot 5\text{ k}\Omega \cdot 5\text{ pF} = 17.25\text{ ns}$$

**Figure 5.6** Circuit diagram with annotated  $W/L$  ratios

- c. Determine how much energy is taken from the supply during the charging of the capacitor. How much of this is dissipated in *M1*. How much is dissipated in the pull-down resistance during discharge? How does this change when  $R_s$  is reduced to  $1 \text{ k}\Omega$ .

**Solution**

$$\Delta Q_{VDD} = C_L \Delta V = 5 \text{ pF} * 1.76 \text{ V} = 8.8 \text{ pC}$$

$$\Delta E_{VDD} = \Delta Q_{VDD} * V_{dd} = 8.8 \text{ pC} * 2.5 \text{ V} = 22 \text{ pJ}$$

Half the energy is dissipated in the transistor *M1*, while the other half is dissipated in the resistor  $R_s$ . The energy dissipated is independent of  $R_s$ .

- d. The NMOS transistor is replaced by a PMOS device, sized so that  $k_p$  is equal to the  $k_n$  of the original NMOS. Will the resulting structure be faster? Explain why or why not.

**Solution**

If a PMOS device replaces the NMOS device, body effect will not exist and the PMOS device will be faster.

9. The circuit in Figure 5.7 is known as the *source follower* configuration. It achieves a DC level shift between the input and the output. The value of this shift is determined by the current  $I_o$ . Assume  $x_d=0$ ,  $\gamma=0.4$ ,  $2|\phi_f|=0.6\text{V}$ ,  $V_{T0}=0.43\text{V}$ ,  $k_n'=115\mu\text{A/V}^2$  and  $\lambda=0$ .

**Figure 5.7** NMOS source follower configuration

- a. Suppose we want the nominal level shift between  $V_i$  and  $V_o$  to be 0.6V in the circuit in Figure 5.7 (a). Neglecting the backgate effect, calculate the width of M2 to provide this level shift (Hint: first relate  $V_i$  to  $V_o$  in terms of  $I_o$ ).

**Solution**

The level shift of 0.6V tells us that  $V_{GS1}=0.6V$  so  $V_{GT1}=0.17V$ . This means that **M1** must be in the saturation region (not velocity saturated). Thus,

$$\frac{k_n \cdot \frac{W}{L}}{2} \cdot (V_{GS} - V_T)^2 = I_D, \text{ and } I_D=6.647\mu A.$$

For **M2**,  $V_{GT}=0.12$ , so **M2** is also in the saturation region (not velocity saturated). Using the same equation as above and solving for  $W/L$  gives  $W/L = 8$ .
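The sizing of M2 is a one-line inversion of the long-channel saturation equation. The sketch below takes $I_D = 6.647\mu A$ and $V_{GT} = 0.12V$ from the solution and solves for $W/L$; the variable names are illustrative.

```python
# Aspect ratio of M2 in the source follower (Problem 9a).
K_PRIME = 115e-6   # k_n' (A/V^2)
I_D = 6.647e-6     # bias current set by M1 (A)
V_GT2 = 0.12       # gate overdrive of M2 (V)

# Saturation: I_D = (k'/2) * (W/L) * V_GT^2  ->  W/L = 2*I_D / (k' * V_GT^2)
w_over_l_m2 = 2.0 * I_D / (K_PRIME * V_GT2**2)
print(f"(W/L)_2 = {w_over_l_m2:.2f}")
```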

- b. Now assume that an ideal current source replaces M2 (Figure 5.7 (b)). The NMOS transistor M1 experiences a shift in  $V_T$  due to the backgate effect. Find  $V_T$  as a function of  $V_o$  for  $V_o$  ranging from 0 to 2.5V with 0.5V intervals. Plot  $V_T$  vs.  $V_o$

**Solution**

The threshold voltage equation provides the relation that we need:

$$V_T = V_{T0} + \gamma \cdot (\sqrt{|2\phi_F| + V_{SB}} - \sqrt{|2\phi_F|}) = V_{T0} + \gamma \cdot (\sqrt{|2\phi_F| + V_o} - \sqrt{|2\phi_F|}).$$

See the graph at the end of this problem.

- c. Plot  $V_o$  vs.  $V_i$  as  $V_o$  varies from 0 to 2.5V with 0.5 V intervals. Plot two curves: one neglecting the body effect and one accounting for it. How does the body effect influence the operation of the level converter?

**Solution**

To plot  $V_o$  versus  $V_i$ , we need to relate  $V_o$  to  $V_i$ . We can do this by solving the current equation (**M1** should remain in the same region to first order because  $V_{GT}$  will remain roughly constant to maintain the correct drain current) for  $V_i$ :

$$V_i = V_o + V_T + \sqrt{\frac{2I_D}{k_n \cdot \frac{W}{L}}}.$$

- d. At  $V_o$ (with body effect) = 2.5V, find  $V_o$ (ideal) and thus determine the maximum error introduced by the body effect.

**Solution**

The maximum error occurs at the highest  $V_{SB}$ . At  $V_o = 2.5$ , the error is  $3.4944 - 3.1 = 0.3944$  V.
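Parts (b) through (d) amount to tabulating the threshold voltage and the required input for each output level. The sketch below does this with the parameters given in the problem, taking the 0.17 V overdrive from part (a) as the square-root term; the function names are illustrative.

```python
import math

V_T0, GAMMA, TWO_PHI_F = 0.43, 0.4, 0.6
V_GT = 0.17   # sqrt(2*I_D / (k_n * W/L)) for M1, from part (a)

def v_t(v_o):
    """Threshold of M1 with V_SB = V_o."""
    return V_T0 + GAMMA * (math.sqrt(TWO_PHI_F + v_o) - math.sqrt(TWO_PHI_F))

def v_in(v_o, body_effect=True):
    """Input needed for a given output, with or without the body effect."""
    vt = v_t(v_o) if body_effect else V_T0
    return v_o + vt + V_GT

for i in range(6):
    v_o = 0.5 * i
    print(f"V_o = {v_o:.1f} V: V_T = {v_t(v_o):.4f} V, "
          f"V_i = {v_in(v_o):.4f} V, ideal V_i = {v_in(v_o, False):.4f} V")

max_error = v_in(2.5) - v_in(2.5, body_effect=False)
print(f"maximum error = {max_error:.4f} V")
```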



Figure for part (b)



Figure for part (c)

10. For this problem assume:

$V_{DD} = 2.5V$, $W_p/L = 1.25/0.25$, $W_n/L = 0.375/0.25$, $L=L_{eff}=0.25\mu m$ (i.e. $x_d=0\mu m$), $C_L=C_{inv-gate}$, $k_n'=115\mu A/V^2$, $k_p'=-30\mu A/V^2$, $V_{tn0}=|V_{tp0}|=0.4V$, $\lambda=0V^{-1}$, $\gamma=0.4$, $2|\phi_f|=0.6V$, and $t_{ox}=58$ Å. Use the HSPICE model parameters for parasitic capacitance given below (i.e. $C_{gd0}$, $C_j$, $C_{jsw}$), and assume that $V_{SB}=0V$ for all problems except part (f).



Figure 5.8 CMOS inverter with capacitive

## Parasitic Capacitance Parameters (F/m)##

NMOS: CGDO=3.11x10<sup>-10</sup>, CGSO=3.11x10<sup>-10</sup>, CJ=2.02x10<sup>-3</sup>, CJSW=2.75x10<sup>-10</sup>  
PMOS: CGDO=2.68x10<sup>-10</sup>, CGSO=2.68x10<sup>-10</sup>, CJ=1.93x10<sup>-3</sup>, CJSW=2.23x10<sup>-10</sup>

- a. What is the  $V_m$  for this inverter?

**Solution**

Assume that  $V_m$  is around midrail (1.25V). That means that the NMOS is velocity saturated and the PMOS is saturated. To find  $V_m$ , we set the sum of the currents at  $V_{out}$  equal to 0 using the correct equation for each device:

$$k_n \cdot V_{DSATn} \cdot \left( V_M - V_{Tn} - \frac{V_{DSATn}}{2} \right) + k_p \cdot 0.5 \cdot (V_M - V_{DD} - V_{Tp})^2 = 0.$$

Plug in numbers:

$$172.5 \cdot 0.6 \cdot (V_M - 0.4 - 0.315) + (-150) \cdot 0.5 \cdot (V_M - 2.5 - (-0.4))^2 = 0$$

$$103.5V_M - 74 - 75 \cdot \left( V_M^2 - 4.2V_M + 4.41 \right) = 0.$$

Solving this quadratic gives  $V_M = 1.245 \text{ V}$ .
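The quadratic can be solved mechanically. The sketch below uses the same numerical coefficients as the "plug in numbers" step (currents in $\mu A$, voltages in V) and keeps the root that lies between the rails; the constant names are illustrative.

```python
import math

V_DD = 2.5
LIN = 172.5 * 0.6        # k_n * V_DSATn as used in the solution (uA/V)
OFFSET = 0.4 + 0.315     # V_Tn + V_DSATn/2 (V)
HALF_KP = 0.5 * -150.0   # k_p / 2 (uA/V^2)
V_P = V_DD - 0.4         # V_DD + V_Tp with V_Tp = -0.4 V

# LIN*(V - OFFSET) + HALF_KP*(V - V_P)^2 = 0  ->  a*V^2 + b*V + c = 0
a = HALF_KP
b = LIN - 2.0 * HALF_KP * V_P
c = HALF_KP * V_P**2 - LIN * OFFSET

disc = b * b - 4.0 * a * c
roots = [(-b + s * math.sqrt(disc)) / (2.0 * a) for s in (1.0, -1.0)]

# Only one root lies between ground and V_DD
V_M = next(r for r in roots if 0.0 < r < V_DD)
print(f"V_M = {V_M:.3f} V")
```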

- b. What is the effective load capacitance $C_{Leff}$ of this inverter? (Include parasitic capacitance, refer to the text for $K_{eq}$ and $m$.) **Hint:** You must assume certain values for the source/drain areas and perimeters since there is no layout. For our scalable CMOS process, $\lambda = 0.125 \mu\text{m}$, and the source/drain extensions are $5\lambda$ for the PMOS; for the NMOS the source/drain contact regions are $5\lambda \times 5\lambda$.

#### Solution

The calculation of the lumped load capacitance follows the format presented in the lecture notes. The only difference is the dimensions of the devices.

$$C_{Leff} = C_L + C_{parasitic} = C_{g3} + C_{g4} + C_{db1} + C_{db2} + C_{gd1} + C_{gd2}.$$

$$C_{g3} = (C_{GD0n} + C_{GS0n})W_n + C_{ox}W_nL = 2(3.11e-10)(0.375e-6) + 6e-15(0.375)(0.25) = 0.796 \text{ fF}$$

$$C_{g4} = (C_{GD0p} + C_{GS0p})W_p + C_{ox}W_pL = 2(2.68e-10)(1.25e-6) + 6e-15(1.25)(0.25) = 2.545 \text{ fF}$$

$$C_{db1} = K_{eqn}(AD_n)C_j + K_{eqswn}(PD_n)C_{jsw}.$$

We need to do this calculation for both transitions and average the results. The $K_{eq}$ values are already calculated in the text.

$$AD_p = AS_p = 1.25\mu\text{m} \times 0.625\mu\text{m} = 0.78125\mu\text{m}^2 \text{ and}$$

$$AD_n = AS_n = 0.125 \times 0.375 + 0.625^2 = 0.4375\mu\text{m}^2.$$

$$PD_p = PS_p = 2 \times 0.625\mu\text{m} + 1.25\mu\text{m} = 2.5\mu\text{m} \text{ and}$$

$$PD_n = PS_n = 5 \times 0.125\mu\text{m} \times 3 + (2+1+1) \times 0.125\mu\text{m} = 2.375\mu\text{m}.$$

$$(0.57 \times 0.4375 \times 2 + 0.61 \times 2.375 \times 0.28) = 0.904 \text{ fF for HL transition}$$

$$(0.79 \times 0.4375 \times 2 + 0.81 \times 2.375 \times 0.28) = 1.23 \text{ fF for LH. Average } C_{db1} = 1.067 \text{ fF.}$$

$$C_{db2} = K_{eqp}(AD_p)C_j + K_{eqswp}(PD_p)C_{jsw}.$$

$$(0.79 \times 0.78125 \times 1.9 + 0.86 \times 2.5 \times 0.22) = 1.65 \text{ fF for HL transition}$$

$$(0.59 \times 0.78125 \times 1.9 + 0.7 \times 2.5 \times 0.22) = 1.26 \text{ fF for LH. Average } C_{db2} = 1.455 \text{ fF.}$$

$$C_{gd1} = 2C_{GD0n}W_n = 2 \times 3.11e-10 \times 0.375e-6 = 0.233 \text{ fF.}$$

$$C_{gd2} = 2C_{GD0p}W_p = 2 \times 2.68e-10 \times 1.25e-6 = 0.67 \text{ fF.}$$

$C_L = \text{sum} = 6.767 \text{ fF}$ . Note - since the problem states that  $x_d=0$ , it is ok if you neglected the last two parasitic capacitances. We intended for them to be included, though.
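Since the total is a sum of many small terms, it is easy to slip on one of them. The sketch below recomputes every component in fF, with dimensions in $\mu m$ and the overlap capacitances converted to fF/$\mu m$. The junction areas enter linearly (in $\mu m^2$), and the $K_{eq}$, $C_j$ and $C_{jsw}$ values are the ones used in the solution above. Helper names are illustrative, and small differences in the last digit come from rounding of intermediate results in the hand calculation.

```python
# Lumped load capacitance for Problem 10(b); all results in fF.
W_N, W_P, L = 0.375, 1.25, 0.25     # device dimensions (um)
C_OX = 6.0                          # gate oxide capacitance (fF/um^2)
CGDO_N, CGDO_P = 0.311, 0.268       # overlap capacitance (fF/um)
AD_N, PD_N = 0.4375, 2.375          # NMOS drain area (um^2), perimeter (um)
AD_P, PD_P = 0.78125, 2.5           # PMOS drain area (um^2), perimeter (um)

def gate_cap(cgdo, w):
    """Overlap (drain and source side) plus channel capacitance of a fanout gate."""
    return 2.0 * cgdo * w + C_OX * w * L

def junction_cap(k_eq, area, cj, k_eqsw, perim, cjsw):
    """Linearized bottom-plate plus sidewall junction capacitance."""
    return k_eq * area * cj + k_eqsw * perim * cjsw

c_g3 = gate_cap(CGDO_N, W_N)
c_g4 = gate_cap(CGDO_P, W_P)

# Average the high-to-low and low-to-high values for each drain junction
c_db1 = 0.5 * (junction_cap(0.57, AD_N, 2.0, 0.61, PD_N, 0.28)
               + junction_cap(0.79, AD_N, 2.0, 0.81, PD_N, 0.28))
c_db2 = 0.5 * (junction_cap(0.79, AD_P, 1.9, 0.86, PD_P, 0.22)
               + junction_cap(0.59, AD_P, 1.9, 0.70, PD_P, 0.22))

# Miller-doubled gate-drain overlap of the driving inverter
c_gd1 = 2.0 * CGDO_N * W_N
c_gd2 = 2.0 * CGDO_P * W_P

c_leff = c_g3 + c_g4 + c_db1 + c_db2 + c_gd1 + c_gd2
print(f"C_Leff = {c_leff:.3f} fF")
```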

- c. Calculate $t_{PHL}$, $t_{PLH}$ assuming the result of (b) is '$C_{Leff} = 6.5 \text{ fF}$'. (Assume an ideal step input, i.e. $t_{rise}=t_{fall}=0$. Do this part by computing the average current used to charge/discharge $C_{Leff}$.)

#### Solution

We can estimate the propagation delay using the approximation $\Delta t = \Delta Q/I$, where $\Delta Q = C_{Leff}V_{DD}/2$ is the charge moved while the output crosses half of the supply range and $I$ is the average current used to charge/discharge $C_{Leff}$. During the high-to-low transition $C_{Leff}$ is discharged through the NMOS transistor, so $I = I_{avgN}$. During the low-to-high transition $C_{Leff}$ is charged through the PMOS transistor, so $I = I_{avgP}$. In summary:

$$t_{delay} \cong \frac{V_{DD} \cdot C_{Leff}}{2 \cdot I_{avg}}, \text{ where}$$

$$I_{avgN} = \frac{I_{ds}(V_o = V_{DD}) + I_{ds}\left(V_o = \frac{V_{DD}}{2}\right)}{2}, I_{avgP} = \frac{I_{ds}(V_o = 0) + I_{ds}\left(V_o = \frac{V_{DD}}{2}\right)}{2}$$

Table 1 shows corresponding values for $I_{avgN}$, $I_{avgP}$, $t_{PLH}$, and $t_{PHL}$. NOTE: This solution included channel length modulation, but it is ok if your solution did not (see problem assumptions).

|               | $V_o$ (V) | Operation Mode | $I_{ds}$ (mA) | $I_{avg}$ (mA) | Prop Delay (ps) |
|---------------|-----------|----------------|---------------|----------------|-----------------|
| for $t_{PLH}$ | 0         | PMOS vel sat.  | 0.300         | 0.285          | 28.5            |
|               | 1.25      | PMOS vel sat.  | 0.270         |                |                 |
| for $t_{PHL}$ | 2.5       | NMOS vel sat.  | 0.209         | 0.202          | 40.0            |
|               | 1.25      | NMOS vel sat.  | 0.195         |                |                 |

Table 1: Average currents and propagation delays for Problem 10(c).
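The delay column of Table 1 can be reproduced from the tabulated currents with the average-current formula. A minimal sketch, using $C_{Leff} = 6.5$ fF as instructed in the problem statement:

```python
# Average-current delay estimate for Problem 10(c).
V_DD = 2.5        # supply (V)
C_LEFF = 6.5e-15  # effective load capacitance (F)

def avg_current_delay(i_start, i_mid):
    """t = V_DD * C / (2 * I_avg), with I_avg the mean of the two currents."""
    i_avg = 0.5 * (i_start + i_mid)
    return V_DD * C_LEFF / (2.0 * i_avg)

t_plh = avg_current_delay(0.300e-3, 0.270e-3)   # PMOS charging the load
t_phl = avg_current_delay(0.209e-3, 0.195e-3)   # NMOS discharging the load
print(f"t_PLH = {t_plh * 1e12:.1f} ps, t_PHL = {t_phl * 1e12:.1f} ps")
```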

- d. Find ( $W_p/W_n$ ) such that  $t_{PHL} = t_{PLH}$ .

**Solution**

One way to do this is to solve the current average equations for  $W_p/W_n$  after setting the propagation delays equal to one another. A much easier method is to sweep the widths in HSPICE. The HSPICE sim shows that  $W_p/W_n = 2.6$  gives equal rise and fall times.

- e. Suppose we increase the width of the transistors to reduce the  $t_{PHL}$ ,  $t_{PLH}$ . Do we get a proportional decrease in the delay times? Justify your answer.

**Solution**

The propagation delays DO NOT decrease in proportion to the widths because of self-loading effects. As the device size increases, its parasitic capacitances increase as well. In this problem, increasing device size increases both average current and  $C_{Leff}$ .

- f. Suppose $V_{SB} = 1V$. What are the values of $V_{tn}$, $V_{tp}$, and $V_M$? How does this qualitatively affect $C_{Leff}$?

**Solution**

$$V_{tp} = V_{tp0} = -0.4V.$$

$$V_{tn} = 0.4 + \gamma \cdot (\sqrt{2\phi_F + 1} - \sqrt{|2\phi_F|}) = 0.596 \text{ V.}$$

Using the equation for part a) and plugging in the new value of  $V_{tn}$  gives:  $V_M = 1.35 \text{ V}$ . The increased  $V_{sb}$  will increase the depletion region and lower the junction capacitance, lowering  $C_{Leff}$ .

11. Using Hspice answer the following questions.

- a. Simulate the circuit in Problem 10 and measure  $t_p$  and the average power for input  $V_{in}$ : pulse(0  $V_{DD}$  5n 0.1n 0.1n 9n 20n), as  $V_{DD}$  varies from 1V - 2.5V with a 0.25V interval. [ $t_p = (t_{PHL} + t_{PLH}) / 2$ ]. Using this data, plot ' $t_p$  vs.  $V_{DD}$ ', and 'Power vs.  $V_{DD}$ '.

Specify AS, AD, PS, PD in your spice deck, and manually add  $C_L = 6.5 \text{ fF}$ . Set  $V_{SB} = 0V$  for this problem.

**Solution**

Delay vs VDD (part a)



Power vs VDD (part a)

- b. For Vdd equal to 2.5V determine the maximum fan-out of identical inverters this gate can drive before its delay becomes larger than 2 ns.

**Solution**

The maximum number of identical inverters that this gate can drive before the propagation delay exceeds 2ns is 115 inverters.

- c. Simulate the same circuit for a set of ‘pulse’ inputs with rise and fall times of  $t_{in\_rise,fall} = 1\text{ns}, 2\text{ns}, 5\text{ns}, 10\text{ns}, 20\text{ns}$ . For each input, measure (1) the rise and fall times  $t_{out\_rise}$  and  $t_{out\_fall}$  of the inverter output, (2) the total energy lost  $E_{total}$ , and (3) the energy lost due to short circuit current  $E_{short}$ .

Using this data, prepare a plot of (1)  $(t_{out\_rise} + t_{out\_fall})/2$  vs.  $t_{in\_rise,fall}$ , (2)  $E_{total}$  vs.  $t_{in\_rise,fall}$ , (3)  $E_{short}$  vs.  $t_{in\_rise,fall}$  and (4)  $E_{short}/E_{total}$  vs.  $t_{in\_rise,fall}$ .

**Solution**



- d. Provide simple explanations for:

- (i) Why the slope for (1) is less than 1?
- (ii) Why  $E_{short}$  increases with  $t_{in\_rise,fall}$ ?
- (iii) Why  $E_{total}$  increases with  $t_{in\_rise,fall}$ ?

**Solution**

- i) The slope is less than 1 because of the regenerative property of the inverter. The high gain around the switching point causes the output to change faster than the inputs.
- ii) The amount of time for which both devices are on simultaneously increases.
- iii) Total energy increases because the short-circuit energy begins to dominate, and the short-circuit energy increases as the rise/fall time increases.

12. Consider the low swing driver of Figure 5.9:



Figure 5.9 Low Swing Driver

- a. What is the voltage swing on the output node ( $V_{out}$ )? Assume  $\gamma=0$ .

**Solution**

The range will be from 0.4 V to 2.07 V, since the PMOS is a weak pull down device and the NMOS is a weak pull up device.

- b. Estimate (i) the energy drawn from the supply and (ii) energy dissipated for a 0V to 2.5V transition at the input. Assume that the rise and fall times at the input are 0. Repeat the analysis for a 2.5V to 0V transition at the input.

**Solution**

For a 0 V to 2.5 V transition on the input, the energy drawn from the power supply is:

$$E_{SUPPLY} = \int i_{DD} V_{DD} dt = V_{DD} \Delta Q = CV_{DD}((V_{DD} - V_{tn}) - |V_{tp}|)$$

The PMOS will be in cutoff and the energy dissipated in the NMOS will be:

$$E_{DISSIPATED} = E_{SUPPLY} - \Delta E_{CAP}$$

$$E_{DISSIPATED} = CV_{DD}((V_{DD} - V_{tn}) - |V_{tp}|) - C \left[ \frac{(V_{DD} - V_{tn})^2}{2} - \frac{|V_{tp}|^2}{2} \right]$$

For a 2.5 V to 0 V transition on the input, the NMOS will be in cutoff and no energy will be drawn from the power supply. The energy dissipated in the PMOS device equals the energy released by the capacitor:

$$E = C \left[ \frac{(V_{DD} - V_{tn})^2}{2} - \frac{|V_{tp}|^2}{2} \right]$$

- c. Compute  $t_{pLH}$  (i.e. the time to transition from  $V_{OL}$  to  $(V_{OH} + V_{OL})/2$ ). Assume the input rise time to be 0.  $V_{OL}$  is the output voltage with the input at 0V and  $V_{OH}$  is the output voltage with the input at 2.5V.

**Solution**

When the input is high and the capacitor charges, the PMOS device is in cutoff and the NMOS is velocity saturated for the duration of the charging. The total voltage range is 0.4 V to 2.07 V, so the midpoint is 1.24 V. We can use the average current method to approximate  $t_{pLH}$ . For the velocity saturated NMOS:

$$I = \left( \frac{\mu_n C_{ox} W}{L} \right) V_{DSATN} \left( V_{GS} - V_{tn} - \frac{V_{DSATN}}{2} \right) (1 + \lambda V_{DS})$$

Solving for the current at  $V=0.4$  V and  $V=1.24$  V and averaging yields an average current of 404 uA. Then:

$$t_{pLH} = \frac{C \Delta V}{I_{avg}} = \frac{(100fF)(1.24V - 0.4V)}{404\mu A} = 208ps$$

- d. Compute  $V_{OH}$  taking into account body effect. Assume  $\gamma = 0.5V^{1/2}$  for both the NMOS and the PMOS.

**Solution**

The PMOS will be deep in cutoff when  $V_{out}$  approaches  $V_{OH}$ . Therefore, we consider only the NMOS. We can express the equation for threshold voltage numerically as follows:

$$V_{tn} = 0.43 + 0.5(\sqrt{(0.6 + 2.5 - V_{tn})} - \sqrt{0.6})$$

This is an equation in one variable, so it may be solved numerically to find that $V_{tn}=0.8$ V, which gives $V_{OH} = V_{DD} - V_{tn} = 1.7$ V.

13. Consider the following low swing driver consisting of NMOS devices M1 and M2. Assume an NWELL implementation. Assume that the inputs IN and  $\overline{IN}$  have a 0V to 2.5V swing and that  $V_{IN} = 0V$  when  $V_{\overline{IN}} = 2.5V$  and vice-versa. Also assume that there is no skew between IN and  $\overline{IN}$  (i.e., the inverter delay to derive  $\overline{IN}$  from IN is zero).



Figure 5.10 Low Swing Driver

- a. To what voltage is the bulk terminal of M2 connected?

**Solution**

In an NWELL process, the bulk terminal of an NMOS must be connected to ground.

- b. What is the voltage swing on the output node as the inputs swing from 0V to 2.5V. Show the low value and the high value.

**Solution**

Because the supply voltage is more than a threshold voltage lower than the gate drive voltage, the output range will not be limited. Therefore the low value is 0 V and the high value is 0.5 V.

- c. Assume that the inputs IN and  $\overline{IN}$  have zero rise and fall times. Assume a zero skew between IN and  $\overline{IN}$ . Determine the low to high propagation delay for charging the output node measured from the the 50% point of the input to the 50% point of the output. Assume that the total load capacitance is 1pF, including the transistor parasitics.

**Solution**

The lower NMOS will be off during the low to high transition and the upper NMOS will be in the linear region throughout the transition from 0.0V to 0.25V. We will assume that the body effect is negligible, since the maximum value of  $V_{SB}$  is 0.25V. Use the average current method to find  $t_{plh}$ . Using the current equation for the linear region, the current when the capacitor is at 0V, is 10.8mA. When the capacitor reaches 0.25V, the current is 4.58mA. Therefore, the average current is 7.7mA.

$$t_{plh} = \frac{C\Delta V}{I_{avg}} = \frac{(1pF)(0.25V)}{7.7mA} = 32.5ps$$

- d. Assume that, instead of the 1pF load, the low swing driver drives a non-linear capacitor, whose capacitance vs. voltage is plotted below. Compute the energy drawn from the low supply for charging up the load capacitor. Ignore the parasitic capacitance of the driver circuit itself.



### Solution

The capacitor charges only from 0 V to 0.5 V, so only the first segment of the graph should be considered. The total energy drawn from the supply is:

$$E = V_{DD} \int I(t)dt = Q_{total}V_{DD}$$

The total charge required to charge the capacitor is:

$$Q = \int_0^{0.5} C(V)dV = 1pF \int_0^{0.5} (1 + 2V)dV = 0.75pC$$

Therefore, since  $E=QV$ , the total energy drawn from the supply is 0.375 pJ.
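The charge and energy figures above can be checked numerically. The sketch below assumes the first segment of the plotted curve is $C(V) = 1\,\text{pF} \cdot (1 + 2V)$ on $0 \le V \le 0.5$ V, as used in the integral; the function names are illustrative only.

```python
# Charge and energy for the voltage-dependent load of Problem 13(d).
# Assumption: C(V) = 1 pF * (1 + 2 V) on 0 <= V <= 0.5 V, as in the integral above.

def capacitance(v):
    """Load capacitance in farads at voltage v (volts)."""
    return 1e-12 * (1.0 + 2.0 * v)

def charge(v_low, v_high, steps=1000):
    """Integrate C(V) dV from v_low to v_high with the midpoint rule."""
    dv = (v_high - v_low) / steps
    return sum(capacitance(v_low + (i + 0.5) * dv) for i in range(steps)) * dv

V_SUPPLY = 0.5                      # low supply voltage (V)
q_total = charge(0.0, V_SUPPLY)     # total charge delivered to the load (C)
energy = q_total * V_SUPPLY         # energy drawn from the supply (J)

print(f"Q = {q_total * 1e12:.3f} pC, E = {energy * 1e12:.3f} pJ")
```

Because the capacitance is linear in $V$, the midpoint rule reproduces the closed-form result without discretization error.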

14. The inverter below operates with  $V_{DD}=0.4V$  and is composed of  $|V_t| = 0.5V$  devices. The devices have identical  $I_0$  and  $n$ .

- a. Calculate the switching threshold ( $V_M$ ) of this inverter.

### Solution

The subthreshold I-V relation is given by  $I_D = I_o e^{(V_{GS}-V_t)/(nV_T)} (1 + \lambda V_{DS})$ , assuming  $V_{DS} > 50mV$ . To calculate the switching voltage, we need to find where  $V_{in}=V_{out}$  occurs. So equating the absolute values of the currents for the two transistors we get:

$$I_o e^{V_{in}/(nV_T)} (1 + \lambda_n V_{out}) = I_o e^{(V_{DD}-V_{in})/(nV_T)} (1 + \lambda_p (V_{DD} - V_{out}))$$

Considering  $V_{in}=V_{out}$  and doing some cancellations we get:

$$\ln[(1 + \lambda_n V_{in})/(1 + \lambda_p(V_{DD} - V_{in}))] = (V_{DD} - V_{in} - V_{in})/(n \cdot V_T)$$

after massaging the last equation we have:

$$V_{DD}/2 - nV_T/2 \cdot \ln[(1 + \lambda_n V_{in})/(1 + \lambda_p(V_{DD} - V_{in}))] = V_{in}$$

Iterating this expression with  $V_{DD}=0.4V$ ,  $V_T=26mV$ ,  $\lambda_n=0.06$ ,  $\lambda_p=0.1$  and  $n=1.5$  we get  $V_{in}=0.2V$ . So we have a switching threshold of  $V_{DD}/2=0.2V$ .
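The iteration can be sketched as a simple fixed-point loop on the expression for $V_{in}$ above. This is a minimal illustration, assuming the parameter values quoted in the solution; the function name `switching_threshold` is hypothetical.

```python
import math

def switching_threshold(vdd, n, phi_t, lam_n, lam_p, tol=1e-12, max_iter=100):
    """Fixed-point iteration on V = VDD/2 - (n*phi_t/2) * ln[(1+lam_n*V)/(1+lam_p*(VDD-V))]."""
    v = vdd / 2.0                       # start from the symmetric guess
    for _ in range(max_iter):
        ratio = (1.0 + lam_n * v) / (1.0 + lam_p * (vdd - v))
        v_next = vdd / 2.0 - 0.5 * n * phi_t * math.log(ratio)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    return v

v_m = switching_threshold(vdd=0.4, n=1.5, phi_t=0.026, lam_n=0.06, lam_p=0.1)
print(f"V_M = {v_m:.4f} V")
```

The logarithmic correction is tiny because $\lambda_n$ and $\lambda_p$ are small, so the result sits within a fraction of a millivolt of $V_{DD}/2$.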

**b.** Calculate  $V_{IL}$  and  $V_{IH}$  of the inverter.



Figure 5.11 Inverter in Weak Inversion Regime

### Solution

To calculate the noise margins we need to calculate the slope of the VTC at  $V_M=V_{DD}/2$ . Equating the currents we get:

$$I_o e^{V_{in}/(nV_T)} (1 + \lambda_n V_{out}) = I_o e^{(V_{DD} - V_{in})/(nV_T)} (1 + \lambda_p(V_{DD} - V_{out}))$$

and cancelling out  $I_o$  and differentiating both sides with respect to  $V_{in}$  we get:

$$\begin{aligned} \frac{\partial}{\partial V_{in}} \left( e^{V_{in}/(nV_T)} (1 + \lambda_n V_{out}) \right) &= \frac{\partial}{\partial V_{in}} \left( e^{(V_{DD} - V_{in})/(nV_T)} (1 + \lambda_p(V_{DD} - V_{out})) \right) \\ e^{V_{in}/(nV_T)} (1 + \lambda_n V_{out}) / (nV_T) + e^{V_{in}/(nV_T)} \lambda_n \frac{\partial V_{out}}{\partial V_{in}} &= - e^{(V_{DD} - V_{in})/(nV_T)} (1 + \lambda_p(V_{DD} - V_{out})) / (nV_T) - e^{(V_{DD} - V_{in})/(nV_T)} \lambda_p \frac{\partial V_{out}}{\partial V_{in}} \end{aligned}$$

manipulating this expression we get:

$$\begin{aligned} -\left(e^{(V_{DD} - V_{in})/(nV_T)} \lambda_p + e^{V_{in}/(nV_T)} \lambda_n\right) \frac{\partial V_{out}}{\partial V_{in}} &= \\ e^{V_{in}/(nV_T)} (1 + \lambda_n V_{out}) / (nV_T) + e^{(V_{DD} - V_{in})/(nV_T)} (1 + \lambda_p(V_{DD} - V_{out})) / (nV_T) & \end{aligned}$$

plugging in  $V_{out}=V_{in}=V_{DD}/2$  we reach:

$$-e^{(V_{DD}/2)/(nV_T)}(\lambda_p + \lambda_n) \frac{\partial V_{out}}{\partial V_{in}} = e^{(V_{DD}/2)/(nV_T)}(2 + (\lambda_p + \lambda_n)V_{DD}/2)/(nV_T)$$

Finally:

$$\left. \frac{\partial V_{out}}{\partial V_{in}} \right|_{V_{in} = V_{out} = V_{DD}/2} = -(2 + (\lambda_p + \lambda_n)V_{DD}/2)/(nV_T)/(\lambda_p + \lambda_n)$$

Using the values  $V_{DD}=0.4V$ ,  $V_T=26mV$ ,  $\lambda_n=0.06$ ,  $\lambda_p=0.1$  and  $n=1.5$  we obtain:

$$g = \frac{\partial V_{out}}{\partial V_{in}} = -325.6$$

This value is much more than we would expect from an MOS inverter (which has  $g \sim 30$ ). However we should keep in mind that in the subthreshold regime MOS devices behave essentially as bipolar devices and can yield such values of gain.

We know that  $V_{IL}=V_M+(V_{DD}-V_M)/g$  and  $V_{IH}=V_M-V_M/g$  from the text (eq 5.7). Using these equations and the results that we got we have:  $V_{IL}=0.1994V$  and  $V_{IH}=0.2006V$ .

Also  $NM_H=NM_L=0.1994V$
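As a quick numerical check of the gain and noise margins, the following sketch evaluates the closed-form gain at $V_M$ and the piecewise-linear estimates for $V_{IL}$ and $V_{IH}$. It assumes $V_M = V_{DD}/2$, $V_{OL} = 0$ and $V_{OH} = V_{DD}$.

```python
# Gain at the switching threshold and the resulting noise margins (Problem 14b).
vdd, n, phi_t = 0.4, 1.5, 0.026      # supply (V), slope factor, thermal voltage (V)
lam_n, lam_p = 0.06, 0.1             # channel-length modulation (1/V)

v_m = vdd / 2.0                      # assumed switching threshold
gain = -(2.0 + (lam_p + lam_n) * vdd / 2.0) / (n * phi_t) / (lam_p + lam_n)

# Piecewise-linear VTC approximation; gain is negative.
v_il = v_m + (vdd - v_m) / gain
v_ih = v_m - v_m / gain

nm_l = v_il - 0.0                    # assumes V_OL = 0
nm_h = vdd - v_ih                    # assumes V_OH = VDD

print(f"g = {gain:.1f}, V_IL = {v_il:.4f} V, V_IH = {v_ih:.4f} V")
print(f"NM_L = {nm_l:.4f} V, NM_H = {nm_h:.4f} V")
```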

**15. Sizing a chain of inverters.**

- a. In order to drive a large capacitance ( $C_L = 20 pF$ ) from a minimum size gate (with input capacitance  $C_i = 10fF$ ), you decide to introduce a two-staged buffer as shown in Figure 5.12. Assume that the propagation delay of a minimum size inverter is 70 ps. Also assume that the input capacitance of a gate is proportional to its size. Determine the sizing of the two additional buffer stages that will minimize the propagation delay.



**Figure 5.12** Buffer insertion for driving large loads.

**Solution**

Minimum delay occurs when the delay through each buffer is the same. This can be achieved by sizing the two buffers as  $f$  and  $f^2$  times the minimum size, respectively, where  $f = \sqrt[N]{F} = \sqrt[3]{2000} = 12.6$  with  $F = C_L/C_i = 2000$  and  $N = 3$ . Evaluating the delay with  $\gamma = 1$ :

$$t_p = Nt_{p0}(1+f/\gamma) = 3 \cdot 70ps \cdot (1+12.6) \approx 2.86ns$$

- b. If you could add any number of stages to achieve the minimum delay, how many stages would you insert? What is the propagation delay in this case?

**Solution**

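From the text, we know that the minimum delay occurs when  $f = e$ , which gives a non-integer ideal stage count. Rounding down to  $N = 7$  stages then fixes the actual fan-out per stage and the delay: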

$$N = \frac{\ln(2000)}{\ln(f)} = 7.6$$

$$f = e^{\frac{\ln(2000)}{7}} = 2.96$$

$$t_{delay} = 7 \times 3.96 \times 70\text{ps} = 1.9\text{ns}$$
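The numbers in parts (a) and (b) can be reproduced with a short script. This is a sketch assuming the delay model $t_p = N t_{p0}(1 + f/\gamma)$ with $\gamma = 1$, which is what the numerical evaluation in the solution corresponds to; `chain_delay` is an illustrative helper name.

```python
import math

def chain_delay(n_stages, total_fanout, tp0, gamma=1.0):
    """Return (per-stage fan-out f, total delay) for an n-stage inverter chain."""
    f = total_fanout ** (1.0 / n_stages)
    return f, n_stages * tp0 * (1.0 + f / gamma)

F_TOTAL = 2000.0        # C_L / C_i = 20 pF / 10 fF
TP0 = 70e-12            # delay of a minimum-size inverter (s)

f3, t3 = chain_delay(3, F_TOTAL, TP0)   # part (a): original gate plus two buffers
n_ideal = math.log(F_TOTAL)             # part (b): ideal stage count for f = e
f7, t7 = chain_delay(7, F_TOTAL, TP0)   # part (b): rounded down to 7 stages

print(f"N=3: f = {f3:.2f}, t_p = {t3 * 1e9:.2f} ns")
print(f"ideal N = {n_ideal:.1f}")
print(f"N=7: f = {f7:.2f}, t_p = {t7 * 1e9:.2f} ns")
```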

- c. Describe the advantages and disadvantages of the methods shown in (a) and (b).

**Solution**

Solution (b) is faster but it consumes much more area than (a).

- d. Determine a closed form expression for the power consumption in the circuit. Consider only gate capacitances in your analysis. What is the power consumption for a supply voltage of 2.5V and an activity factor of 1?

**Solution**

The power consumption is determined as follows

$$\begin{aligned} P &= C_{tot} V_{dd}^2 \frac{1}{T} \alpha \\ P &= C_i V_{dd}^2 \frac{1}{T} \alpha \sum_{k=0}^3 f^k = C_i V_{dd}^2 \frac{1}{T} \alpha \left( \frac{f^4 - 1}{f - 1} \right) = 136 \left( \frac{1}{T} \right) \text{ pWatts} \end{aligned}$$
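The geometric sum can be evaluated directly. The sketch below assumes the three-stage chain of part (a) with $f = \sqrt[3]{2000}$, counts only gate capacitances plus the load, and returns the energy per cycle; dividing by the clock period $T$ (in seconds) gives the power.

```python
# Switched capacitance of the three-inverter chain in part (a), gate capacitance only.
C_I = 10e-15                 # input capacitance of the minimum-size gate (F)
V_DD = 2.5                   # supply voltage (V)
ALPHA = 1.0                  # activity factor
F_TOTAL = 2000.0             # C_L / C_i
f = F_TOTAL ** (1.0 / 3.0)   # per-stage fan-out

# C_i, f*C_i, f^2*C_i and the load f^3*C_i all switch every cycle.
geometric_sum = sum(f ** k for k in range(4))
closed_form = (f ** 4 - 1.0) / (f - 1.0)

energy_per_cycle = ALPHA * C_I * V_DD ** 2 * geometric_sum   # joules

def power(period_s):
    """Average dynamic power in watts for a clock period in seconds."""
    return energy_per_cycle / period_s

print(f"sum = {geometric_sum:.1f}, energy per cycle = {energy_per_cycle * 1e12:.1f} pJ")
```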

16. [M, None, 3.3.5] Consider scaling a CMOS technology by  $S > 1$ . In order to maintain compatibility with existing system components, you decide to use constant voltage scaling.

- a. In traditional constant voltage scaling, transistor widths scale inversely with  $S$ ,  $W \propto 1/S$ .

To avoid the power increases associated with constant voltage scaling, however, you decide to change the scaling factor for  $W$ . What should this new scaling factor be to maintain approximately constant power. Assume long-channel devices (i.e., neglect velocity saturation).

**Solution**

We know that:  $P \propto CV_{DD}^2 f$  and  $f \propto \frac{1}{t_p} \propto \frac{I_{Dsat}}{CV_{DD}}$ , so

$$P \propto I_{Dsat} V \propto k' \frac{W}{L} (V - V_t)^2 V \propto S \cdot \frac{W}{\left(\frac{1}{S}\right)} = S^2 W$$

To keep power constant we need to scale  $W \propto \frac{1}{S^2}$ , which means redesigning gates with  $W$  an additional factor of  $1/S$  smaller than under traditional scaling.

- b. How does delay scale under this new methodology?

**Solution**

$$t_p \propto \frac{CV}{k' \frac{W}{L} V^2} \propto \frac{WL \frac{\epsilon}{t_{ox}}}{k' \frac{W}{L} V} \propto \frac{\left(\frac{1}{S^2}\right) \left(\frac{1}{S}\right) \frac{1}{1/S}}{S \frac{1/S^2}{1/S}}$$

so  $t_p \propto 1/S^2$ .

- c. Assuming short-channel devices (i.e., velocity saturation), how would transistor widths have to scale to maintain the constant power requirement?

**Solution**

$P \propto I_{SAT}V_{DD} \propto V_{DD}WC_{ox}(V_{gs} - V_t)v_{max} \propto W \cdot S$ , so  $W \propto \frac{1}{S}$ .  
This matches traditional constant voltage scaling, so no changes need to be made.

### DESIGN PROBLEM

Using the 0.25  $\mu\text{m}$  CMOS introduced in Chapter 2, design a static CMOS inverter that meets the following requirements:

1. Matched pull-up and pull-down times (i.e.,  $t_{pHL} = t_{pLH}$ ).
2.  $t_p = 5 \text{ nsec} (\pm 0.1 \text{ nsec})$ .

The load capacitance connected to the output is equal to 4 pF. Notice that this capacitance is substantially larger than the internal capacitances of the gate.

Determine the  $W$  and  $L$  of the transistors. To reduce the parasitics, use minimal lengths ( $L = 0.25 \mu\text{m}$ ) for all transistors. Verify and optimize the design using SPICE after proposing a first design using manual computations. Compute also the energy consumed per transition. If you have a layout editor (such as MAGIC) available, perform the physical design, extract the real circuit parameters, and compare the simulated results with the ones obtained earlier.

## Chapter 6 PROBLEMS

1. [E, None, 4.2] Implement the equation  $X = ((\bar{A} + \bar{B})(\bar{C} + \bar{D} + \bar{E}) + \bar{F})\bar{G}$  using complementary CMOS. Size the devices so that the output resistance is the same as that of an inverter with an NMOS  $W/L = 2$  and PMOS  $W/L = 6$ . Which input pattern(s) would give the worst and best equivalent pull-up or pull-down resistance?

**Solution**

Rewriting the output expression in the form  $X = ((\bar{A} + \bar{B})(\bar{C} + \bar{D} + \bar{E}) + \bar{F})\bar{G} = \overline{(AB + CDE)F + G}$  allows us to build the pulldown network by inspection (parallel devices implement an OR, and series devices implement an AND). The pullup network is the dual of the pulldown network.



The plot shows sizes that meet the requirement - in the worst case, the output resistance of the circuit matches the output resistance of an inverter with NMOS  $W/L=2$  and PMOS  $W/L=6$ .

The worst case pull-up resistance occurs whenever a single path exists from the output node to Vdd. Examples of vectors for the worst case are ABCDEFG=1111100 and 0101110. The best case pull-up resistance occurs when ABCDEFG=0000000.

The worst case pull-down resistance occurs whenever a single path exists from the output node to GND. Examples of vectors for the worst case are ABCDEFG=0000001 and 0011110.

The best case pull-down resistance occurs when ABCDEFG=1111111.

2. Implement the following expression in a full static CMOS logic fashion using no more than 10 transistors:

$$\bar{Y} = (A \cdot B) + (A \cdot C \cdot E) + (D \cdot E) + (D \cdot C \cdot B)$$

**Solution**

The circuit is given in the next figure.



3. Consider the circuit of Figure 6.1.



**Figure 6.1** CMOS combinational logic gate.

- a. What is the logic function implemented by the CMOS transistor network? Size the NMOS and PMOS devices so that the output resistance is the same as that of an inverter with an NMOS  $W/L = 4$  and PMOS  $W/L = 8$ .

**Solution**

The logic function is :  $Y = \overline{(A + B)CD}$ . The transistor sizes are given in the figure above.

- b. What are the input patterns that give the worst case  $t_{pHl}$  and  $t_{plh}$ ? State clearly what are the initial input patterns and which input(s) has to make a transition in order to achieve this maximum propagation delay. Consider the effect of the capacitances at the internal nodes.

**Solution**

The worst case  $t_{pHL}$  happens when the internal node capacitances ( $C_{x2}$  and  $C_{x3}$ ) are charged before the high to low transition. The initial states that can cause this are:  $ABCD=[1010, 1110, 0110]$ . The final state is one of:  $ABCD=[1011, 0111]$ .

The worst case  $t_{pLH}$  happens when  $C_{x1}$  is charged before the low to high transition. The input pattern that can cause this is: ABCD=[0111] => [0011].

- c. Verify part (b) with SPICE. Assume all transistors have minimum gate length (0.25 $\mu$ m).

**Solution**

The two cases are shown below.



**Figure 6.2** Best and worst  $t_{PHL}$ .



**Figure 6.3** Best and worst  $t_{PLH}$ .

- d. If  $P(A=1)=0.5$ ,  $P(B=1)=0.2$ ,  $P(C=1)=0.3$  and  $P(D=1)=1$ , determine the power dissipation in the logic gate. Assume  $V_{DD}=2.5V$ ,  $C_{out}=30fF$  and  $f_{clk}=250MHz$ .

**Solution**

Since D is always 1, the circuit implements the following function  $Y = \overline{(A + B)C}$ .

$$P_{(A+B)=0} = P_{A=0} \cdot P_{B=0} = 0.5 \cdot (1-0.2) = 0.4,$$

$$P_{(A+B)=1} = 1 - 0.4 = 0.6,$$

$$P_{Y=0} = P_{(A+B)=1} \cdot P_{C=1} = 0.6 \cdot 0.3 = 0.18$$

$$P_{Y=1} = 1 - 0.18 = 0.82$$

$$P_{Y=0 \Rightarrow 1} = 0.18 \cdot 0.82 = 0.1476$$

$$\text{So } P_{dyn} = P_{Y=0 \Rightarrow 1} C_{out} V_{DD}^2 f_{clk} = (0.1476)(30 \times 10^{-15})(2.5^2)(250 \times 10^6) = 6.92 \mu W.$$
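The probability chain can be written out as a short script. This sketch assumes independent inputs and the function $Y = \overline{(A+B)C}$ with $D = 1$; variable names are illustrative.

```python
# Switching activity and dynamic power for Problem 3(d), assuming independent inputs.
p_a, p_b, p_c = 0.5, 0.2, 0.3          # P(A=1), P(B=1), P(C=1); D is always 1

p_or_zero = (1.0 - p_a) * (1.0 - p_b)  # P(A+B = 0): both A and B low
p_or_one = 1.0 - p_or_zero             # P(A+B = 1)

p_y0 = p_or_one * p_c                  # Y = NOT((A+B)C) is 0 when (A+B)=1 and C=1
p_y1 = 1.0 - p_y0
alpha = p_y0 * p_y1                    # probability of a 0 -> 1 output transition

C_OUT = 30e-15                         # output capacitance (F)
V_DD = 2.5                             # supply voltage (V)
F_CLK = 250e6                          # clock frequency (Hz)
p_dyn = alpha * C_OUT * V_DD ** 2 * F_CLK

print(f"alpha = {alpha:.4f}, P_dyn = {p_dyn * 1e6:.2f} uW")
```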

4. [M, None, 4.2] CMOS Logic

- a. Do the following two circuits (Figure 6.4) implement the same logic function? If yes, what is that logic function? If no, give Boolean expressions for both circuits.

**Solution**

Yes, they implement the same logic function :

$$F = \overline{ABCD + E} = (\bar{A} + \bar{B} + \bar{C} + \bar{D}) \cdot \bar{E}$$

- b. Will these two circuits' output resistances always be equal to each other?

**Solution**

No

- c. Will these two circuits' rise and fall times always be equal to each other? Why or why not?

**Solution**

No. Circuit B appears optimized for the case where the transistor with input E is on the critical path, since it is closer to the output node than in circuit A. Therefore, if input E arrives later, circuit B will be faster than circuit A, since the internal node will already be charged and only the output capacitance needs to be switched. Even if we assume that all inputs arrive at the same time, however, the two circuits' rise and fall times will not be equal. Consider an input combination where E, A, B, C, and D are all low. Circuit A has only one body-affected device, while circuit B has four. Since the associated rise in  $V_t$ , and the resulting increase in resistance, affects only one resistor in circuit A but four parallel resistors in circuit B, we expect a difference in the timing waveforms.



Figure 6.4 Two static CMOS gates.

5. [E, None, 4.2] The transistors in the circuits of the preceding problem have been sized to give an output resistance of  $13 \text{ k}\Omega$  for the worst-case input pattern. This output resistance can vary, however, if other patterns are applied.

- a. What input patterns ( $A-E$ ) give the lowest output resistance when the output is low? What is the value of that resistance?

**Solution**

The lowest output resistance is obtained when all inputs (A, B, C, D and E) are equal to 1. In that case, the output resistance is the parallel combination of the resistance of an nMOS of width 1 with that of a series chain of four equal nMOS devices of width 4. Both branches have the same resistance, equal to the worst-case output resistance,  $13 \text{ k}\Omega$ . The output resistance in this case is therefore half this value,  $6.5 \text{ k}\Omega$ .

- b. What input patterns ( $A-E$ ) give the lowest output resistance when the output is high? What is the value of that resistance?

**Solution**

The lowest output resistance is obtained when all inputs are equal to zero. All of the pMOS devices have the same width, so they all have the same resistance. The worst-case resistance occurs when E and only one of the inputs A, B, C, or D are equal to 0 while the other three are equal to 1. The output resistance in that case is the series combination of two pMOS resistances, and it is equal to  $13 \text{ k}\Omega$ . Each pMOS therefore has a resistance of  $6.5 \text{ k}\Omega$ . With all inputs at zero, the output resistance is the series combination of one of these resistances with four of them in parallel, so the minimum output resistance is  $6.5 \text{ k}\Omega + 6.5 \text{ k}\Omega / 4 = 8.125 \text{ k}\Omega$ .

6. [E, None, 4.2] What is the logic function of circuits A and B in Figure 6.5? Which one is a dual network and which one is not? Is the nondual network still a valid static logic gate? Explain. List any advantages of one configuration over the other.



**Figure 6.5** Two logic functions.

**Solution**

Both circuits A and B implement the XOR logic function. Circuit A is a dual network because the pull-up network is the dual of the pull-down network. Circuit B is not a dual network.

However, circuit B is still a valid static logic gate, because for any combination of the inputs there is a low-resistance path from either  $V_{DD}$  or ground to the output. Circuit B also has an advantage: its internal node capacitances are smaller than those of circuit A, which makes it faster.

7. [E, None, 4.2] Compute the following for the pseudo-NMOS inverter shown in Figure 6.6:  
a.  $V_{OL}$  and  $V_{OH}$

**Solution**

To find  $V_{OH}$ , set  $V_{in}$  to 0, because  $V_{OL}$  is likely to be below  $V_{T0}$  for the NMOS. If  $V_{in}=0$ , then  $M_1$  is off, so the PMOS pulls the output all the way to the rail. So,  $V_{OH}=V_{DD}=2.5\text{V}$ .

To find  $V_{OL}$ , set  $V_{in} = V_{OH} = 2.5\text{V}$ . The NMOS is all the way on, but so is the PMOS. To find  $V_{OL}$ , we can write a current balancing equation at the output node:  $I_{DP}+I_{DN}=0$ . First, we must determine the region of operation for each device. We can assume that  $V_{DS} = V_{OL}$  for the NMOS is less than  $V_{DSAT}$ , so the NMOS is in the linear region.  $V_{DS}$  for the PMOS will be more negative than  $V_{DSAT}$ , and  $V_{GTP} = -2.1$ , so the PMOS is velocity saturated. The equation is therefore:

$$k_p \cdot \frac{W}{L} \cdot V_{DSAT} \cdot (V_{GT} - 0.5V_{DSAT}) \cdot (1 + \lambda V_{DS}) + k_n \cdot \frac{W}{L} \cdot V_o \cdot (V_{GT} - 0.5V_o) \cdot (1 + \lambda V_o) = 0$$

Plugging in numbers (process parameters such as  $V_{DSAT}$  appear in tables in previous chapters) gives:

$$-30 \cdot 2 \cdot -1 \cdot (-1.6) \cdot (1 - 0.1(V_o - 2.5)) + 115(16) \cdot V_o \cdot (2.07 - 0.5V_o) \cdot (1 + 0.06V_o) = 0$$

Solving for  $V_o$  gives  $V_{OL} = 31.6\text{mV}$ .
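The current-balance equation has no convenient closed form, but it is easy to solve numerically. The sketch below uses current magnitudes in microamps with the parameter values plugged in above, and a plain bisection search; the helper names are illustrative.

```python
# Solve the pseudo-NMOS current balance for V_OL (Problem 7a) by bisection.
# Currents are magnitudes in microamps; parameter values are those used in the text.

def net_current_uA(v_out):
    """PMOS pull-up current minus NMOS pull-down current at output voltage v_out."""
    # PMOS: velocity saturated, |k'| = 30 uA/V^2, W/L = 2, |V_DSAT| = 1 V, |V_GT| = 2.1 V
    i_p = 30 * 2 * 1.0 * (2.1 - 0.5 * 1.0) * (1 + 0.1 * (2.5 - v_out))
    # NMOS: linear region, k' = 115 uA/V^2, W/L = 16, V_GT = 2.07 V
    i_n = 115 * 16 * v_out * (2.07 - 0.5 * v_out) * (1 + 0.06 * v_out)
    return i_p - i_n

def bisect(func, lo, hi, tol=1e-9):
    """Find a root of func in [lo, hi], assuming func(lo) and func(hi) differ in sign."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

v_ol = bisect(net_current_uA, 0.0, 0.63)   # search below the NMOS V_DSAT
print(f"V_OL = {v_ol * 1e3:.1f} mV")
```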

- b.  $NM_L$  and  $NM_H$

**Solution**

Rather than calculating the derivative of the current, we will estimate  $V_{IL}$  and  $V_{IH}$  from the simulated VTC. This approach estimates that the noise margin low is about 0.47V and the noise margin high is about 1.67V.

- c. The power dissipation: (1) for  $V_{in}$  low, and (2) for  $V_{in}$  high

**Solution**

For  $V_{in}$  low, the NMOS is off, so the power dissipation is 0W. For  $V_{in}$  high,  $P=VI=2.5*I_{DP}$ . We saw in part a) the equation for  $I_{DP}$ . Plugging in the value for  $V_{OL}$ , we get  $P=VI=2.5*120\mu\text{A}=300\mu\text{W}$ .

- d. For an output load of 1 pF, calculate  $t_{PLH}$ ,  $t_{PHL}$ , and  $t_p$ . Are the rising and falling delays equal? Why or why not?



Figure 6.6 Pseudo-NMOS inverter.

**Solution**

We cannot use the estimate of resistance from the I-V curve for the HL transition because the PMOS is still on. Therefore, we will use the average current method for estimating delay. The average current for the HL transition through the PMOS is  $0.5(I_{V_{out}=2.5} + I_{V_{out}=1.25})$ .  $I_{V_{out}=2.5} = 0$ .  $I_{V_{out}=1.25} = -30(2)(-1)(-2.1+0.5) * (1+0.1(1.25))$ , which has a magnitude of  $108\mu\text{A}$ . Thus,  $I_{avg}$  for the PMOS is  $54\mu\text{A}$ .

For the NMOS,  $I_{V_{out}=2.5} = 115(16)(0.63)(2.07-0.63/2)(1+0.06*2.5)=2.34\text{mA}$  and  $I_{V_{out}=1.25} = 115(16)(0.63)*(2.07-0.63/2)(1+0.06*1.25) = 2.19\text{mA}$ . So,  $I_{avg}$  for the NMOS is  $2.26\text{mA}$ . The average current discharging the capacitor is then  $2.26\text{mA}-54\mu\text{A} = 2.21\text{mA}$ . Then  $t_{PHL} = C \cdot \Delta V/I_{avg} = 566\text{ps}$ .

For  $t_{PLH}$ , the NMOS is off, so we can use equivalent resistance to find the transition time. From the table of resistances in the text, we can calculate  $R_{EQ} = 31\text{k}\Omega/(W/L)_p = 15.5\text{k}\Omega$ . Then  $t_{PLH} = 0.69*C*R_{EQ}$ . So  $t_{PLH} = 10.7\text{ns}$ .

$t_p = (t_{PLH} + t_{PHL})/2 = 5.6\text{ns}$ . The rising delay is much longer because the PMOS is very weak relative to the NMOS.
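The delay estimates can be reproduced with a few lines of arithmetic. This sketch uses the same device parameters as the hand calculation, takes the PMOS current as zero at $V_{out} = 2.5$ V as the solution does, and works with current magnitudes in microamps. Evaluated without intermediate rounding, it gives a $t_{PHL}$ of roughly 0.57 ns; small differences from hand-rounded intermediate values are expected.

```python
# Average-current and equivalent-resistance delay estimates for Problem 7(d).
C_LOAD = 1e-12          # load capacitance (F)
DELTA_V = 1.25          # swing from V_DD to V_DD/2 (V)

def pmos_sat_uA(v_out):
    """Velocity-saturated PMOS current magnitude (uA): |k'|=30, W/L=2, |V_GT|=2.1 V."""
    return 30 * 2 * 1.0 * (2.1 - 0.5) * (1 + 0.1 * (2.5 - v_out))

def nmos_sat_uA(v_out):
    """Velocity-saturated NMOS current (uA): k'=115, W/L=16, V_GT=2.07 V, V_DSAT=0.63 V."""
    return 115 * 16 * 0.63 * (2.07 - 0.63 / 2) * (1 + 0.06 * v_out)

# High-to-low: NMOS discharges the load while the PMOS fights it.
i_p_avg = 0.5 * (0.0 + pmos_sat_uA(1.25))          # PMOS has V_DS = 0 at V_out = 2.5 V
i_n_avg = 0.5 * (nmos_sat_uA(2.5) + nmos_sat_uA(1.25))
i_net = (i_n_avg - i_p_avg) * 1e-6                 # amps
t_phl = C_LOAD * DELTA_V / i_net

# Low-to-high: only the PMOS conducts, so use its equivalent resistance.
R_EQ = 31e3 / 2                                    # 31 kOhm divided by (W/L)_p = 2
t_plh = 0.69 * R_EQ * C_LOAD

t_p = 0.5 * (t_plh + t_phl)
print(f"t_PHL = {t_phl * 1e12:.0f} ps, t_PLH = {t_plh * 1e9:.1f} ns, t_p = {t_p * 1e9:.2f} ns")
```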

8. [M, SPICE, 4.2] Consider the circuit of Figure 6.7.

- a. What is the output voltage if only one input is high? If all four inputs are high?

**Solution**

$$I_D = k' \cdot \frac{W}{L} \cdot \left( V_{GT} \cdot V_{min} - \frac{V_{min}^2}{2} \right) \cdot (1 + \lambda \cdot V_{DS})$$

Consider a case when one input is high:  $A = V_{DD}$  and  $B = C = D = 0$  V. Assume that  $V_{out}$  is small enough that  $V_{min} = V_{DSAT}$  for the PMOS device, and  $V_{min} = V_{DS} = V_{out}$  for the NMOS devices. Solve for  $V_{out}$  by setting the drain currents in the PMOS and NMOS equal to each other,  $|I_{Dp}| = |I_{Dn}|$ , where the drain currents are functions of  $V_{out}$ ,  $V_{DD}$ , and the device parameters.

$$V_{out} = 102 \text{ mV, and } I_D = 35.7 \mu\text{A}$$

Now verify that the assumptions for  $V_{min}$  are correct. For the PMOS:  $V_{DS} = V_{out} - V_{DD} = -2.40$  V,  $V_{DSAT} = -1$  V,  $V_{GT} = -2.1$  V, therefore  $V_{min} = V_{DSAT}$ . For the NMOS:  $V_{DS} = 102$  mV,  $V_{DSAT} = 630$  mV,  $V_{GT} = 2.07$  V, therefore  $V_{min} = V_{DS}$ .

Consider the case when all inputs are high:  $A = B = C = D = V_{DD}$ . For these hand calculations, this is numerically equivalent to a circuit with a single NMOS device with  $W/L = 4*1.5$  and its gate tied to  $V_{DD}$ . The analysis used above for the case when one device is on can be reused, replacing  $W/L$  of the NMOS with 6 and using the same assumptions for  $V_{min}$ . This gives  $V_{out} = 25$  mV and  $I_D = 35.9 \mu\text{A}$ . The assumptions for  $V_{min}$  are correct.

- b. What is the average static power consumption if, at any time, each input turns on with an (independent) probability of 0.5? 0.1?

#### Solution

Notice in part a) that the drain current in the PMOS is  $35.7 \mu\text{A}$  with one NMOS on and  $35.9 \mu\text{A}$  with four NMOS devices on. The current in the PMOS can be approximated as  $35.8 \mu\text{A}$  when any number of NMOS devices are on and  $0 \mu\text{A}$  when all four are off. The probability that all four NMOS devices are off is  $(1-\rho)^4$  where  $\rho$  is the probability an input is high. Therefore,

$$P_{AVG} = P_{OFF} \cdot (1-\rho)^4 + P_{ON} \cdot \left[ 1 - (1-\rho)^4 \right]$$

where  $P_{OFF} = 0$  W, and  $P_{ON} = 89.5 \mu\text{W}$ .  $P_{AVG} = 83.9 \mu\text{W}$  when  $\rho = 0.5$  and  $P_{AVG} = 30.8 \mu\text{W}$  when  $\rho = 0.1$ .
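The average-power expression is simple to evaluate for both probabilities. The sketch assumes $V_{DD} = 2.5$ V and the approximate on-current of $35.8\,\mu\text{A}$ used above; `average_static_power` is an illustrative name.

```python
# Average static power of the four-input pseudo-NMOS gate (Problem 8b).
V_DD = 2.5                 # supply voltage (V)
I_ON = 35.8e-6             # approximate PMOS current when any NMOS is on (A)
P_ON = V_DD * I_ON         # static power when the output is pulled low (W)
P_OFF = 0.0                # no static path when all four inputs are low

def average_static_power(rho, n_inputs=4):
    """Average power when each input is independently high with probability rho."""
    p_all_off = (1.0 - rho) ** n_inputs
    return P_OFF * p_all_off + P_ON * (1.0 - p_all_off)

for rho in (0.5, 0.1):
    print(f"rho = {rho}: P_avg = {average_static_power(rho) * 1e6:.1f} uW")
```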

- c. Compare your analytically obtained results to a SPICE simulation.

#### Solution

From SPICE:  $V_{out} = 98.7$  mV, and  $I_D = 38.2 \mu\text{A}$  with one NMOS device on and  $V_{out} = 23.5$  mV, and  $I_D = 38.3 \mu\text{A}$  with all NMOS devices on.



Figure 6.7 Pseudo-NMOS gate.

9. [M, None, 4.2] Implement  $F = \overline{ABC} + \overline{ACD}$  (and  $\overline{F}$ ) in DCVSL. Assume  $A, B, C, D$ , and their complements are available as inputs. Use the minimum number of transistors.

**Solution**

10. [E, Layout, 4.2] A complex logic gate is shown in Figure 6.8.

- a. Write the Boolean equations for outputs  $F$  and  $G$ . What function does this circuit implement?

**Solution**

$$G = A \oplus B$$

$$F = A \oplus B$$

- b. What logic family does this circuit belong to?

**Solution**

It belongs to the DCVSL logic family.

- c. Assuming  $W/L = 0.5\mu/0.25\mu$  for all nmos transistors and  $W/L = 2\mu/0.25\mu$  for the pmos transistors, produce a layout of the gate using Magic. Your layout should conform to the following datapath style: (1) Inputs should enter the layout from the left in polysilicon; (2) The outputs should exit the layout at the right in polysilicon (since the outputs would probably be driving transistor gate inputs of the next cell to the right); (3) Power and ground lines should run vertically in metal 1.



**Figure 6.8** Two-input complex logic gate.

- d. Extract and netlist the layout. Load both outputs ( $F, G$ ) with a  $30fF$  capacitance and simulate the circuit. Does the gate function properly? If not, explain why and resize the transistors so that it does. Change the sizes (and areas and perimeters) in the HSPICE netlist.

**Solution**

The gate doesn't function properly, because the PMOS devices are too strong and the NMOS pull-down network cannot switch the output nodes.

If you decrease the PMOS sizes to  $W=0.5\mu m$ , then the logic gate will function properly.

11. Design and simulate a circuit that generates an optimal differential signal as shown in Figure 6.9. Make sure the rise and fall times are equal.



Figure 6.9 Differential Buffer.

**Solution**

The circuit is shown below.



If the inverters are sized for equal rise and fall times then you can achieve equal rise and fall times on the differential outputs, as long as the other FETs are sized symmetrically.

12. What is the function of the circuit in Figure 6.10?



Figure 6.10 Gate.

**Solution**

The circuit implements an S-R latch. Set is A and Reset is B. The invalid state is when both A and B are 0.

13. Implement the function  $S = ABC + A\bar{B}\bar{C} + \bar{A}\bar{C}B + \bar{B}\bar{A}C$ , which gives the sum of two inputs with a carry bit, using NMOS pass transistor logic. Design a DCVSL gate which implements the same function. Assume A, B, C, and their complements are available as inputs.

**Solution**

The two cases are shown in the figure below.



14. Describe the logic function computed by the circuit in Figure 6.11. Note that all transistors (except for the middle inverters) are NMOS. Size and simulate the circuit so that it achieves a 100 ps delay (50-50) using  $0.25\mu\text{m}$  devices, while driving a  $100 \text{ fF}$  load on both differential outputs. ( $V_{DD} = 2.5\text{V}$ ). Assume A, B and their complements are available as inputs.



Figure 6.11 Cascoded Logic Styles.

For the drain and source perimeters and areas you can use the following approximations:  $AS=AD=W*0.625u$  and  $PS=PD=W+1.25u$ .

**15.**

**Solution**

The circuit implements an XOR. The sizes of the transistors are M1:  $28u/0.25u$ , M2:  $28u/0.25u$ , M3:  $10u/0.25u$ , M4:  $10u/0.25u$ .  $M_{P_{inv}}: 4u/0.25$ ,  $M_{N_{inv}}: 0.375u/10u$

**16.** [M, None, 4.2] Figure 6.12 contains a pass-gate logic network.

- a. Determine the truth table for the circuit. What logic function does it implement?

**Solution**

The truth table is shown below

| AB | Out |
|----|-----|
| 00 | 1   |
| 01 | 0   |
| 10 | 0   |
| 11 | 1   |

The circuit implements an XNOR.

- b. Assuming 0 and 2.5 V inputs, size the PMOS transistor to achieve a  $V_{OL} = 0.3$  V.

**Solution**

The PMOS device will be velocity saturated and the NMOS passgate will be in the linear region.  $I_{DN}+I_{DP}=0$ , so

$$k'_p \cdot \frac{W}{L} \cdot V_{DSAT} \cdot (V_{GT} - 0.5V_{DSAT}) \cdot (1 + \lambda V_{DS}) + k'_n \cdot \frac{W}{L} \cdot V_o \cdot (V_{GT} - 0.5V_o) \cdot (1 + \lambda V_o) = 0$$

We know that  $V_o=0.3V$ , so we can plug in numbers and solve for the  $W/L$  of the PMOS, which is 7. Let the PMOS be  $1.75/0.25$ .

- c. If the PMOS were removed, would the circuit still function correctly? Does the PMOS transistor serve any useful purpose?



**Figure 6.12** Pass-gate network.

**Solution**

No. If the PMOS were removed, the output node could remain low when AB=00 because it would be floating. The PMOS device pulls the output node high when it would otherwise be in a high impedance state.

**17.** [M, None, 4.2] This problem considers the effects of process scaling on pass-gate logic.

- a. If a process has a  $t_{buf}$  of 0.4 ns,  $R_{eq}$  of 8 k $\Omega$ , and  $C$  of 12 fF, what is the optimal number of stages between buffers in a pass-gate chain?

**Solution**

$$m_{opt} = 1.7 \sqrt{t_{buf} / (R_{eq} \cdot C)} = 3.47 \approx 3 \text{ gates between buffers.}$$

- b. Suppose that, if the dimensions of this process are shrunk by a factor  $S$ ,  $R_{eq}$  scales as  $1/S^2$ ,  $C$  scales as  $1/S$ , and  $t_{buf}$  scales as  $1/S^2$ . What is the expression for the optimal number of buffers as a function of  $S$ ? What is this value if  $S = 2$ ?

**Solution**

$$m_{opt} = 1.7 \sqrt{\frac{t_{buf} / S^2}{(R_{eq} / S^2) \cdot (C / S)}} = 1.7 \sqrt{\frac{S \cdot t_{buf}}{R_{eq} \cdot C}} = 4.9 \approx 5 \text{ gates between buffers for } S = 2.$$
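Both results follow from one formula with a scaling parameter. The sketch below assumes the scaling rules stated in part (b); `m_opt` is an illustrative helper, not a library function.

```python
import math

def m_opt(t_buf, r_eq, c, s=1.0):
    """Optimal number of pass gates between buffers after shrinking by a factor s.

    Assumes t_buf and R_eq scale as 1/s^2 and C scales as 1/s, as in part (b).
    """
    t_scaled = t_buf / s ** 2
    r_scaled = r_eq / s ** 2
    c_scaled = c / s
    return 1.7 * math.sqrt(t_scaled / (r_scaled * c_scaled))

T_BUF = 0.4e-9    # buffer delay (s)
R_EQ = 8e3        # equivalent pass-gate resistance (ohm)
C_NODE = 12e-15   # node capacitance (F)

m_1 = m_opt(T_BUF, R_EQ, C_NODE)          # unscaled process
m_2 = m_opt(T_BUF, R_EQ, C_NODE, s=2.0)   # dimensions shrunk by S = 2

print(f"S = 1: m_opt = {m_1:.2f}; S = 2: m_opt = {m_2:.2f}")
```

The ratio of the two values is $\sqrt{S}$, which is the net scaling of $m_{opt}$ under these assumptions.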

18. [C, None, 4.2] Consider the circuit of Figure 6.13. Let  $C_x = 50 \text{ fF}$ ,  $M_r$  has  $W/L = 0.375/0.375$ ,  $M_n$  has  $W/L_{eff} = 0.375/0.25$ . Assume the output inverter doesn't switch until its input equals  $V_{DD}/2$ .

- a. How long will it take  $M_n$  to pull down node  $x$  from 2.5 V to 1.25 V if  $In$  is at 0 V and  $B$  is at 2.5V?

**Solution**

To determine the time required for these transitions, we will find the average currents in the FETs  $M_r$  and  $M_n$ . The equivalent resistance method will not suffice since it does not account for both devices being on.

For  $M_r$ ,  $I_{V_x=2.5} = 0$  since  $V_{DS} = 0$ . For the other case, the PMOS device is velocity saturated, so:

$I_{V_x=1.25} = (-30)(1)(-1)(-2.1+0.5)(1+0.1*1.25) = -54\mu\text{A}$ . The average current in the PMOS is  $-27\mu\text{A}$ .

$M_n$  is in the velocity saturation region for both endpoints of the transition. The two currents are therefore:

$$I_{V_x=2.5} = (115)(1.5)(0.63)(2.07-0.63/2)(1+0.06*2.5) = 219\mu\text{A}.$$

$$I_{V_x=1.25} = (115)(1.5)(0.63)(2.07-0.63/2)(1+0.06*1.25) = 205\mu\text{A}.$$

And the average current in the NMOS is  $212\mu\text{A}$ .

The total current discharging the capacitor is  $212\mu\text{A} - 27\mu\text{A} = 185\mu\text{A}$ .

The time for the transition is then

$$t = \frac{C \times \Delta V}{I_{avg}} = \frac{50fF \times 1.25V}{185\mu A} = 338ps.$$
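The pull-down time can be reproduced numerically. This sketch uses current magnitudes in microamps and the same device parameters as the hand calculation; the helper names are illustrative.

```python
# Pull-down of node x against the level restorer (Problem 18a), average-current method.
C_X = 50e-15       # capacitance at node x (F)
DELTA_V = 1.25     # transition from 2.5 V to 1.25 V

def i_mn_uA(v_x):
    """Velocity-saturated NMOS M_n: k'=115 uA/V^2, W/L=1.5, V_GT=2.07 V, V_DS = v_x."""
    return 115 * 1.5 * 0.63 * (2.07 - 0.63 / 2) * (1 + 0.06 * v_x)

def i_mr_uA(v_x):
    """PMOS M_r current magnitude: zero at v_x = V_DD, velocity saturated at v_x = 1.25 V."""
    if v_x >= 2.5:
        return 0.0
    return 30 * 1.0 * 1.0 * (2.1 - 0.5) * (1 + 0.1 * (2.5 - v_x))

i_n_avg = 0.5 * (i_mn_uA(2.5) + i_mn_uA(1.25))   # discharging current
i_r_avg = 0.5 * (i_mr_uA(2.5) + i_mr_uA(1.25))   # opposing restorer current
i_net = (i_n_avg - i_r_avg) * 1e-6               # amps

t_pull_down = C_X * DELTA_V / i_net
print(f"I_net = {i_net * 1e6:.0f} uA, t = {t_pull_down * 1e12:.0f} ps")
```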

- b. How long will it take  $M_n$  to pull up node  $x$  from 0 V to 1.25 V if  $V_{In}$  is 2.5 V and  $V_B$  is 2.5 V?

**Solution**

For the LH transition, the PMOS “keeper” is off. The NMOS  $M_n$  is the only FET that is on for this transition. We present both methods for finding the pull-up time.

**Equivalent Resistance:** We need to perform a different sweep for this measurement than the regular  $I_D$  vs  $V_{DS}$  sweep. In this case,  $V_{DS}$  is changing because the *source node* of the FET is rising. Since the source voltage is changing,  $V_{GS}$  also is reducing as node  $x$  rises. This effectively “turns down” the current the NMOS can sustain. Performing the appropriate sweep and measuring  $R_{EQ}$  gives  $R_{EQ} = (11.3k\Omega + 34.7k\Omega) / 2 = 23k\Omega$ . Thus,  $t = 0.69*C*R_{EQ} = 0.69*50fF*23k\Omega = 794ps$ .

**Average Current:** When  $x = 0$ , the pass transistor has a  $V_{GS} = 2.5$  and a  $V_{DS} = 2.5$ , so it is velocity saturated.

$$I_{x=0} = (115)(1.5)(0.63)(2.07-0.63/2)(1+0.06*2.5) = 219\mu\text{A}.$$

When  $x = 1.25$ , the pass transistor has  $V_{DS} = 1.25$  and  $V_{GS} = 1.25$ . It is still velocity saturated, but notice that  $V_{GS}$  has decreased. Thus,

$$I_{x=1.25} = (115)(1.5)(0.63)(1.25-0.43-0.63/2)(1+0.06*1.25) = 59\mu A.$$

The average current is then  $I_{avg} = 139\mu A$ .

$$t = \frac{C \times \Delta V}{I_{avg}} = \frac{50fF \times 1.25V}{139\mu A} = 450ps.$$

Clearly, the two solutions are not very close together. The actual **simulated transition time is about 644ps**. The  $I_{avg}$  approximation underestimates the solution because the true average current in this case is not close to the average of the endpoints. In a typical inverter (PMOS pullup and NMOS pulldown),  $V_{GS}$  doesn't change over the transition, so the current is reasonably linear with  $V_{DS}$ . For that case, the average current is close to the average of the endpoints. In this problem, the pinch-off of  $V_{GS}-V_T$  in the pass transistor means the average is closer to the smaller value. Numerical calculation of the average current from an HSPICE sim gives  $I_{avg} = 93\mu A$  which would give a transition time of  $t = 672ps$ , which is much closer to the actual value.
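The three estimates discussed above can be put side by side. The sketch takes the swept $R_{EQ}$ endpoints and the HSPICE-derived average current of $93\,\mu\text{A}$ from the text as given inputs; only the endpoint currents are recomputed.

```python
# Pull-up of node x through the NMOS pass transistor (Problem 18b).
C_X = 50e-15       # capacitance at node x (F)
DELTA_V = 1.25     # transition from 0 V to 1.25 V

def i_pass_uA(v_x, v_gate=2.5, v_in=2.5, v_t=0.43):
    """Velocity-saturated pass transistor: source at v_x, so V_GS and V_DS both shrink."""
    v_gt = v_gate - v_x - v_t
    v_ds = v_in - v_x
    return 115 * 1.5 * 0.63 * (v_gt - 0.63 / 2) * (1 + 0.06 * v_ds)

# Equivalent-resistance estimate, using the swept endpoint resistances from the text.
r_eq = 0.5 * (11.3e3 + 34.7e3)
t_req = 0.69 * r_eq * C_X

# Endpoint-average current estimate.
i_avg = 0.5 * (i_pass_uA(0.0) + i_pass_uA(1.25)) * 1e-6
t_avg = C_X * DELTA_V / i_avg

# Estimate from the simulated average current quoted in the text.
t_sim_avg = C_X * DELTA_V / 93e-6

print(f"R_EQ method: {t_req * 1e12:.0f} ps")
print(f"endpoint average: {t_avg * 1e12:.0f} ps")
print(f"simulated average current: {t_sim_avg * 1e12:.0f} ps")
```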

- c. What is the minimum value of  $V_B$  necessary to pull down  $V_x$  to 1.25 V when  $V_{In} = 0$  V?

#### Solution

In order for  $M_n$  to pull node  $x$  low, the current in  $M_n$  must equal or exceed the current with which  $M_r$  charges the capacitor at every point in the transition. The maximum current in  $M_r$  occurs when  $x = 1.25$  V, and it is (from part a)  $I_{Mr} = -54\mu A$ . We can write a current equation for  $M_n$  at this point in the transition and solve for  $V_B$ :

Note that  $M_n$  is velocity saturated at this point:  $54 = 115(1.5)(0.63)(VB-0.43-0.63/2)(1+0.06*1.25)$ .

Solving gives  $V_B = 1.207V$ .
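Since the saturation-current expression is linear in $V_B$, the equation can be inverted directly. A minimal sketch, assuming the same parameters as the equation above:

```python
# Minimum gate voltage V_B for M_n to overcome the restorer at V_x = 1.25 V (Problem 18c).
I_MR_UA = 54.0     # magnitude of the restorer current at V_x = 1.25 V (uA)
K_N = 115.0        # NMOS process transconductance (uA/V^2)
W_OVER_L = 1.5     # M_n aspect ratio
V_DSAT = 0.63      # NMOS saturation voltage (V)
V_T = 0.43         # NMOS threshold voltage (V)
LAMBDA_N = 0.06    # channel-length modulation (1/V)
V_DS = 1.25        # drain-source voltage of M_n with In = 0 V and V_x = 1.25 V

# I = k' (W/L) V_DSAT (V_B - V_T - V_DSAT/2) (1 + lambda V_DS), solved for V_B.
prefactor = K_N * W_OVER_L * V_DSAT * (1 + LAMBDA_N * V_DS)
v_b_min = I_MR_UA / prefactor + V_T + V_DSAT / 2

print(f"V_B(min) = {v_b_min:.3f} V")
```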



**Figure 6.13** Level restorer.

## 19. Pass Transistor Logic



Figure 6.14 Level restoring circuit.

Consider the circuit of Figure 6.14. Assume the inverter switches ideally at  $V_{DD}/2$ , neglect body effect, channel length modulation and all parasitic capacitance throughout this problem.

- a. What is the logic function performed by this circuit?

**Solution**

The circuit is a NAND gate.

- b. Explain why this circuit has non-zero static dissipation.

**Solution**

When  $A=B=V_{DD}$ , the voltage at node  $x$  is  $V_x=V_{DD}-V_{thN}$ . This causes static power dissipation at the inverter the pass transistor network is driving.

- c. Using just one transistor, design a fix so that there will not be any static power dissipation. Explain how you chose the size of the transistor.

**Solution**

The modified circuit is shown in the next figure.



The size of $M_r$ should be chosen so that when one of the inputs A or B equals 0, either $M_{n1}$ or $M_{n2}$ is able to pull node X to $V_{DD}/2$ or less.

- d. Implement the same circuit using transmission gates.

**Solution**

The circuit is shown below.



- e. Replace the pass-transistor network in Figure 6.14 with a pass transistor network that computes the following function:  $x = ABC$  at the node  $x$ . Assume you have the true and complementary versions of the three inputs  $A, B$  and  $C$ .

**Solution**

One possible implementation is shown.



20. [M, None, 4.3] Sketch the waveforms at  $x$ ,  $y$ , and  $z$  for the given inputs (Figure 6.15). You may approximate the time scale, but be sure to compute the voltage levels. Assume that  $V_T = 0.5$  V when body effect is a factor.

21. [E, None, 4.3] Consider the circuit of Figure 6.16.

- a. Give the logic function of  $x$  and  $y$  in terms of  $A, B$ , and  $C$ . Sketch the waveforms at  $x$  and  $y$  for the given inputs. Do  $x$  and  $y$  evaluate to the values you expected from their logic functions? Explain.

**Solution**

$$x = \overline{AB} \text{ and } y = ABC$$

The circuit does not correctly implement the desired logic function. This stems from the fact that  $x$  is pre-charged high, and thus node  $y$  is discharged as soon as the evaluation phase starts. Although  $x$  is eventually discharged by the first stage,  $y$  cannot be charged high again since it is a dynamic node with no low-impedance path to  $V_{dd}$  (during evaluate). Common solutions to this problem are to either place an inverter between the two stages (thus allowing only 0-to-1 transitions on the inputs to each stage during evaluate) as in Domino logic or employing np-CMOS. The latter is presented in (b).

- b. Redesign the gates using np-CMOS to eliminate any race conditions. Sketch the waveforms at  $x$  and  $y$  for your new circuit.

**Solution**



Figure 6.15 Dynamic CMOS.

The modified circuit using np-CMOS is shown below together with the waveforms at x and y. The desired logic function is now correctly implemented.

22. [M, None, 4.3] Suppose we wish to implement the two logic functions given by  $F = A + B + C$  and  $G = A + B + C + D$ . Assume both true and complementary signals are available.

- a. Implement these functions in dynamic CMOS as cascaded  $\phi$  stages so as to minimize the total transistor count.

#### Solution

Dynamic gates with NMOS pull-down networks cannot be directly cascaded. This solution uses a domino logic approach.



- b. Design an np-CMOS implementation of the same logic functions.

#### Solution



Figure 6.16 Cascaded dynamic gates.



The circuit is shown below.



23. Consider a conventional 4-stage Domino logic circuit as shown in Figure 6.17 in which all precharge and evaluate devices are clocked using a common clock $\phi$. For this entire problem, assume that the pulldown network is simply a single NMOS device, so that each Domino stage consists of a dynamic inverter followed by a static inverter. Assume that the precharge time, evaluate time, and propagation delay of the static inverter are all $T/2$. Assume that the transitions are ideal (zero rise/fall times).



**Figure 6.17** Conventional DOMINO Dynamic Logic.

- Complete the timing diagram for signals  $Out_1$ ,  $Out_2$ ,  $Out_3$  and  $Out_4$ , when the  $IN$  signal goes high before the rising edge of the clock  $\phi$ . Assume that the clock period is  $10 T$  time units.

#### Solution

The timing diagram is shown below.



- Suppose that there are no evaluate switches at the 3 latter stages. Assume that the clock  $\phi$  is initially in the precharge state ( $\phi=0$  with all nodes settled to the correct precharge states), and the block enters the evaluate period ( $\phi=1$ ). Is there a problem during the evaluate period, or is there a benefit? Explain.

#### Solution

There is no problem during the evaluate stage. The precharged nodes remain charged until a signal propagates through the logic, activating the pull-down network and discharging the node. In fact, this topology improves the circuit's robustness in terms of charge sharing affecting the output for any generic pull-down network, and reduces the body effect in the pull-down network.

- Assume that the clock  $\phi$  is initially in the evaluate state ( $\phi=1$ ), and the block enters the precharge state ( $\phi = 0$ ). Is there a problem, or is there any benefit, if the last three evaluate switches are removed? Explain.

#### Solution

There is a problem during the precharge stage. If all precharged nodes are discharged during the evaluate stage, when the precharge FETs simultaneously turn on, the pull-down networks will initially remain on, creating a short circuit. This continues in each gate until the previous gate charges, disabling its pull-down network.

24. [C, Spice, 4.3] Figure 6.18 shows a dynamic CMOS circuit in Domino logic. In determining source and drain areas and perimeters, you may use the following approximations:  $AD = AS = W \times 0.625\mu\text{m}$  and  $PD = PS = W + 1.25\mu\text{m}$ . Assume 0.1 ns rise/fall times for all inputs, including the clock. Furthermore, you may assume that all the inputs and their complements are available, and that all inputs change during the precharge phase of the clock cycle.
- What Boolean functions are implemented at outputs  $F$  and  $G$ ? If  $A$  and  $B$  are interpreted as two-bit binary words,  $A = A_1A_0$  and  $B = B_1B_0$ , then what interpretation can be applied to output  $G$ ?

**Solution**

$$F = A_0B_0 + \bar{A}_0\bar{B}_0, \quad G = F(A_1B_1 + \bar{A}_1\bar{B}_1)$$

If $A$ and $B$ are interpreted as two-bit binary words, output $G$ is high if $A = B$: the circuit is a comparator.

- Which gate (1 or 2) has the highest potential for harmful charge sharing and why? What sequence of inputs (spanning two clock cycles) results in the worst-case charge-sharing scenario? Using SPICE, determine the extent to which charge sharing affects the circuit for this worst case.



Figure 6.18 DOMINO logic circuit.

**Solution**

Gate 2 has the higher potential for harmful charge sharing because the capacitance that contributes to charge sharing is larger than in gate 1.

The sequence of inputs resulting in the worst-case charge sharing is $A_0 = B_0$ and $A_1 = B_1$ for the first cycle. Then $A_0 = B_0$ and $A_1 \neq B_1$ for the second cycle such that the $A_1/\bar{A}_1$ transistor that is on during the second cycle is the same as in the first cycle. For example, $A_0 = B_0 = A_1 = B_1 = V_{DD}$ in cycle 1 and $A_0 = B_0 = A_1 = V_{DD}, B_1 = 0 \text{ V}$ in cycle 2. This will cause the charge at the output of gate 2 to be shared with the total parasitic capacitance at the drains of the $A_1$, $\bar{A}_1$, and $B_1$ transistors.



25. [M, Spice, 4.3] In this problem you will consider methods for eliminating charge sharing in the circuit of Figure 6.18. You will then determine the performance of the resulting circuit.

- a. In problem 24 you determined which gate (1 or 2) suffers the most from charge sharing. Add a single 2/0.25 PMOS precharge transistor (with its gate driven by the clock  $\phi$  and its source connected to  $V_{DD}$ ) to one of the nodes in that gate to maximally reduce the charge-sharing effect. What effect (if any) will this addition have on the gate delay? Use SPICE to demonstrate that the additional transistor has eliminated charge sharing for the previously determined worst-case sequence of inputs.

#### Solution

The additional precharge transistor should charge the node that is shared by the $A_1$ and $\bar{A}_1$ transistor drains and the $F$ transistor source. Assuming the gate delay is dominated by the precharge stage, this will reduce the gate delay by briefly aiding the precharging of gate 2. SPICE output with additional precharge transistor.



- b. For the new circuit (including additional precharge transistor), find the sequence of inputs (spanning two clock cycles) that results in the worst-case delay through the circuit. Remember that precharging is another factor that limits the maximum clocking frequency of the circuit, so your input sequence should address the worst-case precharging delay.

**Solution**

The worst-case delay results from $A = B$ for two consecutive cycles. This results in the maximum charging and discharging of the internal nodes.

- c. Using SPICE on the new circuit and applying the sequence of inputs found in part (b), find the maximum clock frequency for correct operation of the circuit. Remember that the precharge cycle must be long enough to allow all precharged nodes to reach ~90% of their final values before evaluation begins. Also, recall that the inputs ( $A$ ,  $B$  and their complements) should not begin changing until the clock signal has reached 0 V (precharge phase), and they should reach their final values before the circuit enters the evaluation phase.

**Solution**

The maximum clock frequency is ~4.4 GHz.



26. [C, None, 4.2–3] For this problem, refer to the layout of Figure 6.19.

- a. Draw the schematic corresponding to the layout. Include transistor sizes.

**Solution**



- b. What logic function does the circuit implement? To which logic family does the circuit belong?

**Solution**

The circuit implements $\text{Out} = \overline{A+BC}$. It belongs to the pseudo-NMOS family.

- c. Does the circuit have any advantages over fully complementary CMOS?

**Solution**

The circuit uses less area than a fully complementary CMOS implementation.

- d. Calculate the worst-case $V_{OL}$ and $V_{OH}$.

**Solution**

$V_{OH} = V_{DD} = 2.5V$ . To find  $V_{OL}$ , assume that we can combine  $M_B$  and  $M_C$  into one NMOS with  $W/L = 0.75/0.25$ . Then the worst case  $V_{OL}$  occurs when  $A=0$  and the combined BC NMOS is on. Assume that  $V_{OL}$  is less than  $V_{DSATn}$ . Then the NMOS device is in the linear region. The PMOS device will be velocity saturated. Equating the currents at the output gives:

$$k'_p \cdot \frac{W}{L} \cdot V_{DSAT} \cdot (V_{GT} - 0.5V_{DSAT}) \cdot (1 + \lambda V_{DS}) + k'_n \cdot \frac{W}{L} \cdot V_o \cdot (V_{GT} - 0.5V_o) \cdot (1 + \lambda V_o) = 0$$

The only unknown in this 3rd order polynomial is $V_o$. Solving for $V_o$ gives $V_{OL} = 51.2\text{mV}$.

- e. Write the expressions for the area and perimeter of the drain and source for all of the FETs in terms of $\lambda$. Assume that the capacitance of shared diffusions divides evenly between the sharing devices. Copy the layout into Magic, extract and simulate to find the worst-case $t_{pHL}$ time. For what input transition(s) does this occur? Name all of the parasitic capacitances that you would need to know to calculate this delay by hand (you do not need to perform the calculation).



**Figure 6.19** Layout of complex gate.

**Solution**

Call the PMOS device P, and name the other devices by their input signal.

$$AD_P = AS_P = 19\lambda^2, PD_P = PS_P = 15\lambda.$$

$$AS_A = 40\lambda^2, PS_A = 18\lambda.$$

$$AD_A = (3 \times 8 + 3 \times 12) \lambda^2 / 2 = 30 \lambda^2. PD_A = 16 \lambda / 2 = 8 \lambda.$$

$$AD_B = AD_A, PD_B = PD_A.$$

$$AS_B = 36 \lambda^2 / 2 = 18 \lambda^2. PS_B = 6 \lambda / 2 = 3 \lambda.$$

$$AD_C = AS_B, PD_C = PS_B.$$

$$AS_C = 60 \lambda^2. PS_C = 22 \lambda.$$

We can narrow the number of transitions to look at for determining the worst case  $t_{pHL}$ . The worst case capacitance occurs when the internal node between  $M_B$  and  $M_C$  is charged up to  $V_{DD}$ . Then the worst case delay will occur when either  $M_A$  or the  $M_B, M_C$  pair discharges this capacitance. If the series devices are doing the discharging, we need to consider the case where  $M_B$  is initially on and where  $M_B$  is initially off.

The simulation shows that the worst-case transition occurs over three cycles: $ABC = 010$ to $000$ to $011$ produces the worst-case $t_{pHL}$. This is worse than when $M_A$ discharges the node ( $ABC = 010$ to $110$ ) or when $M_B$ is initially on ( $ABC = 010$ to $011$ ).

We could calculate  $t_{pHL}$  using either the equivalent resistance method or the average current method. In either case,  $C_L$  would include the following parasitic capacitances:

$$C_{GD,PMOS} + C_{DB,PMOS} + C_{GDA} (\text{no Miller effect b/c input not changing}) + C_{DBA} + C_{GDB} + C_{DBB} + C_{GSB} + C_{GDC} + C_{DBC}.$$

27. [E, None, 4.4] Derive the truth table, state transition graph, and output transition probabilities for a three-input XOR gate with independent, identically distributed, uniform white-noise inputs.

#### Solution

The truth table of a three-input XOR gate is:

| A | B | C | Y |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 1 |
| 0 | 1 | 1 | 0 |
| 1 | 0 | 0 | 1 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 0 |
| 1 | 1 | 1 | 1 |

**Table 1: Truth table**

As the inputs are independent, identically distributed, uniform white noise, each of the possible combinations of three input values has a probability equal to $1/8$. From the table, the probability of having the output equal to 0 is $p_0 = 0.5$. In the same way, the probability of having the output equal to 1 is $p_1 = 0.5$.
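The truth table and the resulting output probabilities can be enumerated directly. The sketch below assumes a static gate, for which the $0 \rightarrow 1$ transition probability is $P_0 P_1$ (the same relation used for the static multiplexer in Problem 28); the helper name `xor3` is illustrative.

```python
from itertools import product


def xor3(a, b, c):
    """Three-input XOR: the output is 1 when an odd number of inputs are 1."""
    return a ^ b ^ c


# Each of the 8 input combinations is equally likely (probability 1/8).
rows = [(a, b, c, xor3(a, b, c)) for a, b, c in product((0, 1), repeat=3)]

p1 = sum(y for *_, y in rows) / len(rows)  # probability that the output is 1
p0 = 1 - p1                                # probability that the output is 0
p_0_to_1 = p0 * p1                         # static-gate 0 -> 1 transition probability

for a, b, c, y in rows:
    print(a, b, c, y)
print(f"p0 = {p0}, p1 = {p1}, P(0->1) = {p_0_to_1}")
```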

28. [C, None, 4.4] Figure 6.20 shows a two-input multiplexer. For this problem, assume independent, identically-distributed uniform white noise inputs.

- a. Does this schematic contain reconvergent fan-out? Explain your answer.

#### Solution

This schematic has reconvergent fan-out because both inputs of the OR gate depend on the value of S.

- b. Find the exact signal ( $P_1$ ) and transition ( $P_{0 \rightarrow 1}$ ) formulas for nodes X, Y, and Z for: (1) a static, fully complementary CMOS implementation, and (2) a dynamic CMOS implementation.



Figure 6.20 Two-input multiplexer

#### Solution

Assuming a fully complementary CMOS implementation:

X is the output of an AND gate with independent, identically-distributed uniform white noise inputs. Since the output is 1 only when both inputs are equal to 1, $P_1 = 0.25$. On the other hand, $P_{0 \rightarrow 1} = P_0 P_1 = 0.25(1 - 0.25) = 0.1875$.

Y is also the output of an AND gate with independent, identically distributed uniform white noise inputs. The analysis is the same as with X.

For Z, the truth table of the schematic shows that $P_1 = 0.5$. Then $P_{0 \rightarrow 1} = P_0 P_1 = 0.5(1 - 0.5) = 0.25$.

Assuming a dynamic CMOS implementation:

In the same way as before, for X,  $P_1 = 0.25$ . In order to obtain the transition probability, an n-tree dynamic gate will be assumed. In this case:  $P_{0 \rightarrow 1} = P_0 = 0.75$ .

The analysis for Y is equal to the analysis for X.

For Z, using the truth table of the schematic we obtain, again, $P_1 = 0.5$. For the transition probability, it will be assumed that an np-CMOS structure is used. Then, Z is the output of a p-tree dynamic gate. Then: $P_{0 \rightarrow 1} = P_1 = 0.5$.
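The signal and transition probabilities for both implementations can be cross-checked by enumerating the eight input combinations. Since the figure is not reproduced here, the sketch assumes $X = A \cdot S$, $Y = B \cdot \bar{S}$, and $Z = X + Y$; that assignment of inputs to the AND gates is an assumption, although the probabilities do not depend on which AND gate receives $S$ and which receives $\bar{S}$.

```python
from itertools import product


def mux_nodes(a, b, s):
    """Node values of the multiplexer (assumed wiring: X = A*S, Y = B*not(S), Z = X + Y)."""
    x = a & s
    y = b & (1 - s)
    return {"X": x, "Y": y, "Z": x | y}


def signal_probabilities():
    """Return P1 for each node, assuming all 8 input combinations are equally likely."""
    combos = list(product((0, 1), repeat=3))
    totals = {"X": 0, "Y": 0, "Z": 0}
    for a, b, s in combos:
        for node, value in mux_nodes(a, b, s).items():
            totals[node] += value
    return {node: count / len(combos) for node, count in totals.items()}


p1 = signal_probabilities()

# Static CMOS: a 0 -> 1 transition needs a 0 followed by a 1.
static_p01 = {node: (1 - p) * p for node, p in p1.items()}

# Dynamic np-CMOS: X and Y are n-tree gates (precharged high, P(0->1) = P0),
# and Z is a p-tree gate (predischarged low, P(0->1) = P1).
dynamic_p01 = {"X": 1 - p1["X"], "Y": 1 - p1["Y"], "Z": p1["Z"]}

print("P1:     ", p1)
print("static: ", static_p01)
print("dynamic:", dynamic_p01)
```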

29. [M, None, 4.4] Compute the switching power consumed by the multiplexer of Figure 6.20, assuming that all significant capacitances have been lumped into the three capacitors shown in the figure, where $C = 0.3 \text{ pF}$. Assume that $V_{DD} = 2.5 \text{ V}$ and independent, identically-distributed uniform white noise inputs, with events occurring at a frequency of 100 MHz. Perform this calculation for the following:

- a. A static, fully-complementary CMOS implementation

#### Solution

Switching power is:

$$P_{SW} = \alpha \cdot f \cdot C \cdot V_{DD}^2 = (\alpha_{X0 \rightarrow 1} + \alpha_{Y0 \rightarrow 1} + \alpha_{Z0 \rightarrow 1}) \cdot f \cdot C \cdot V_{DD}^2$$

We calculated in Problem 28 the probabilities of a 0->1 transition for each node: $P_{0 \rightarrow 1}$ for X and Y is 0.1875 and $P_{0 \rightarrow 1}$ for Z is 0.25. Thus, $P_{SW} = (2 * 0.1875 + 0.25) * 100\text{MHz} * 0.3\text{pF} * 2.5^2 = 117.2\mu\text{W}$.

- b. A dynamic CMOS implementation

#### Solution

In Problem 28, for a dynamic np-CMOS gate, we calculated the probabilities: $P_{0 \rightarrow 1}$ for X and Y is 0.75 and $P_{0 \rightarrow 1}$ for Z is 0.5. Thus, $P_{SW} = (2*0.75+0.5)*100MHz*0.3pF*2.5^2 = 375\mu W$.
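Both switching-power figures follow from the same formula, so they can be evaluated side by side. This is a small sketch using the activity factors and lumped capacitance stated in the problem; the function name `switching_power` is illustrative.

```python
# Switching power P = (sum of 0->1 activity factors) * f * C * VDD^2,
# with one lumped capacitor C on each of the nodes X, Y, and Z.
FREQ = 100e6     # event frequency, Hz
CAP = 0.3e-12    # capacitance per node, F
VDD = 2.5        # supply voltage, V


def switching_power(activities):
    """Total switching power in watts for nodes with the given 0->1 probabilities."""
    return sum(activities) * FREQ * CAP * VDD ** 2


p_static = switching_power([0.1875, 0.1875, 0.25])  # static CMOS: X, Y, Z
p_dynamic = switching_power([0.75, 0.75, 0.5])      # dynamic np-CMOS: X, Y, Z

print(f"static:  {p_static * 1e6:.1f} uW")
print(f"dynamic: {p_dynamic * 1e6:.1f} uW")
```

The dynamic implementation dissipates about 3.2 times more switching power here because its precharged nodes toggle far more often than the static ones.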

30. For the circuit shown in Figure 6.21, ignore DIBL and assume S = 100 mV/decade.
- What is the logic function implemented by this circuit? Assume that all devices (M1-M6) are  $0.5\mu m/0.25\mu m$ .

**Solution**

$$\overline{A(B+C)}$$

- Let the drain current for each device (NMOS and PMOS) be  $1\mu A$  for NMOS at  $V_{GS} = V_T$  and PMOS at  $V_{SG} = V_T$ . What input vectors cause the worst case leakage power for each output value? Explain (state all the vectors, but do not evaluate the leakage).

**Solution**

When the output is high, the worst-case leakage occurs when two transistors leak in parallel: ABC = 100. When the output is low, the worst-case leakage also occurs when two transistors leak in parallel: ABC = 110 or ABC = 101.

- Suppose the circuit is active for a fraction of time  $d$  and idle for  $(1-d)$ . When the circuit is active, the inputs arrive at 100 MHz and are uniformly distributed ( $Pr_{(A=1)} = 0.5$ ,  $Pr_{(B=1)} = 0.5$ ,  $Pr_{(C=1)} = 0.5$ ) and independent. When the circuit is in the idle mode, the inputs are fixed to one you chose in part (b). What is the duty cycle  $d$  for which the active power is equal to the leakage power?



Figure 6.21 CMOS logic gate.

**Solution**

$$d * P_{active} = (1-d) P_{leakage}. P_{active} = \alpha_{0 \rightarrow 1} * f * C_L * V_{DD}^2 = (3/8 * 5/8) * (100 * 10^6) * (50 * 10^{-15}) * (2.5^2) = 7.3 \mu W.$$

$$P_{leakage} (ABC = 100) = V_{DD} * 2 * I_{leakM1} = 5 * I_o 10^{-\frac{V_T}{S}} = 5 * 1 \mu A * 10^{-\frac{0.43}{0.1}} = 251 pW.$$

Plugging the power numbers into the activity equation and solving for d gives $d = P_{leakage} / (P_{active} + P_{leakage}) \approx 3.4 * 10^{-5}$.
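The balance between active and leakage power is easy to get wrong by a unit prefix, so a direct numeric evaluation is a useful check. The sketch below evaluates the two power expressions with the constants that appear in them ($C_L = 50$ fF, $V_T = 0.43$ V, $S = 100$ mV/decade, $I_o = 1\mu A$) and then solves $d \cdot P_{active} = (1-d) \cdot P_{leakage}$ for $d$; variable names are illustrative.

```python
# Duty cycle d at which d * P_active equals (1 - d) * P_leakage.
VDD = 2.5        # supply voltage, V
FREQ = 100e6     # input event rate, Hz
C_LOAD = 50e-15  # load capacitance, F
I_0 = 1e-6       # drain current at V_GS = V_T, A
V_T = 0.43       # threshold voltage, V
SLOPE = 0.1      # subthreshold slope S, V/decade

# Output is NOT(A(B+C)): it is low for 3 of 8 input vectors, high for 5 of 8.
p_out_high = 5 / 8
alpha = (1 - p_out_high) * p_out_high           # 0 -> 1 transition probability

p_active = alpha * FREQ * C_LOAD * VDD ** 2     # switching power while active
i_leak = I_0 * 10 ** (-V_T / SLOPE)             # per-device leakage at V_GS = 0
p_leakage = VDD * 2 * i_leak                    # two devices leaking in parallel

d = p_leakage / (p_active + p_leakage)

print(f"P_active  = {p_active:.3e} W")
print(f"P_leakage = {p_leakage:.3e} W")
print(f"d         = {d:.2e}")
```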

### DESIGN PROJECT

Design, lay out, and simulate a CMOS four-input XOR gate in the standard 0.25 micron CMOS process. You can choose any logic circuit style, and you are free to choose how many stages of logic to use: you could use one large logic gate or a combination of smaller logic gates. The supply voltage is set at 2.5 V! Your circuit must drive an external 20 fF load in addition to whatever internal parasitics are present in your circuit.

The primary design objective is to minimize the propagation delay of the worst-case transition for your circuit. The secondary objective is to minimize the area of the layout. At the very worst, your design must have a propagation delay of no more than 0.5 ns and occupy an area of no more than 500 square microns, but the faster and smaller your circuit, the better. Be aware that, when using dynamic logic, the precharge time should be made part of the delay.

The design will be graded on the magnitude of  $A \times t_p^2$ , the product of the area of your design and the square of the delay for the worst-case transition.

# Chapter 10 SOLUTIONS

1. [C, None, 9.2] For the circuit in Figure 0.1, assume a unit delay through the Register and Logic blocks (i.e., $t_R = t_L = 1$ ). Assume that the registers, which are positive edge-triggered, have a set-up time $t_S$ of 1. The delay through the multiplexer $t_M$ equals $2 t_R$.
- Determine the minimum clock period. Disregard clock skew.

**Solution**

The circuit and paths of interest have been reproduced for convenience in Figure 0.1.



**Figure 0.1** Sequential circuit.

Out of the 4 paths shown in the figure, p1 is the critical one and determines the lower bound on the clock period. Using  $T \geq t_{\text{reg}} + t_{\text{logic}} + t_{\text{setup}} - \delta$ , we get  $T_{\min} = 1+7+1 = 9$ .

- Repeat part a, factoring in a nonzero clock skew:  $\delta = t'_S - t_S = 1$ .

**Solution**

With finite clock skew, the minimum clock periods for the different paths are as follows: $T_{\min}(p1) = 9 - 1 = 8$, $T_{\min}(p2) = 6$, $T_{\min}(p3) = 7$, $T_{\min}(p4) = 7 - 1 = 6$ (Note that the clock skew is 0 for paths p2 and p3). Therefore the minimum clock period is $T_{\min} = 8$.

- Repeat part a, factoring in a non-zero clock skew:  $\delta = t'_S - t_S = 4$ .

**Solution**

As the clock skew increases, the most significant path changes. Repeating the calculations in part (b) we get: $T_{\min}(p1) = 9 - 4 = 5$, $T_{\min}(p2) = 6$, $T_{\min}(p3) = 7$, $T_{\min}(p4) = 7 - 4 = 3$ (Note that the clock skew is 0 for paths p2 and p3). Therefore the minimum clock period is $T_{\min} = 7$.
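The three cases (skew of 0, 1, and 4) differ only in which path ends up limiting the clock. A minimal sketch, using the per-path values quoted in the solutions above and a hypothetical table `PATHS` that records whether a path crosses the skewed clock domains:

```python
# Per-path lower bound on the clock period, T >= t_reg + t_logic + t_setup - skew.
# Each entry: (t_reg + t_logic + t_setup, whether the path sees the clock skew).
PATHS = {
    "p1": (9, True),
    "p2": (6, False),
    "p3": (7, False),
    "p4": (7, True),
}


def path_bounds(skew):
    """Return the minimum clock period demanded by each path for a given skew."""
    return {
        name: (base - skew if sees_skew else base)
        for name, (base, sees_skew) in PATHS.items()
    }


def min_clock_period(skew):
    """The clock period must satisfy every path, so take the largest bound."""
    return max(path_bounds(skew).values())


for skew in (0, 1, 4):
    print(f"skew = {skew}: bounds = {path_bounds(skew)}, T_min = {min_clock_period(skew)}")
```

Once the skew exceeds 2, the unskewed path p3 becomes the critical one and further positive skew no longer reduces the clock period.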

- d. Derive the maximum positive clock skew that can be tolerated before the circuit fails.

**Solution**

The maximum positive clock skew is determined by the inequality $\delta \leq t_{cd,reg} + t_{cd,logic}$. Assuming that the contamination delay is the same as the propagation delay, we get $\delta_{max} = 1 + 3 + 2 = 6$. Note that p4 determines the maximum tolerable skew (the fastest path will produce the earliest contamination). Paths p3 and p2 do not matter since there is no skew involved.

- e. Derive the maximum negative clock skew that can be tolerated before the circuit fails.

**Solution**

The maximum negative skew has no bound since the clock period has no upper bound.

2. This problem examines sources of skew and jitter.

- a. A balanced clock distribution scheme is shown in Figure 0.2. For each source of variation, identify if it contributes to skew or jitter. Circle your answer in Table 0.1.



**Figure 0.2** Sources of Skew and Jitter in Clock Distribution.

|                                                |      |        |
|------------------------------------------------|------|--------|
| 1) Uncertainty in the clock generation circuit | Skew | Jitter |
| 2) Process variation in devices                | Skew | Jitter |
| 3) Interconnect variation                      | Skew | Jitter |
| 4) Power Supply Noise                          | Skew | Jitter |
| 5) Data Dependent Load Capacitance             | Skew | Jitter |
| 6) Static Temperature Gradient                 | Skew | Jitter |

**Table 0.1** Sources of Skew and Jitter

- b. Consider a Gated Clock implementation where the clock to various logical modules can be individually turned off as shown in Figure 0.3 (i.e., $Enable_1, \dots, Enable_N$ can take on different values on a cycle by cycle basis). Which approach (A or B) results in lower jitter at the output of the input clock driver? (hint: consider gate capacitance) Explain.

**Figure 0.3** Jitter in clock gating

**Solution**

Approach A results in lower jitter. For Approach A, the capacitance seen by CLK is independent of data (the Enable signals) to first order.

3. Figure 0.4 shows a latch based pipeline with two combinational logic units.



**Figure 0.4** Latch Based Pipeline

Recall that the timing diagram of a combinational logic block and a latch can be drawn as follows, where the shaded region represents that the data is not ready yet.



**Figure 0.5** Timing diagrams of combinational logic and latch

Assume that the contamination delay  $t_{cd}$  of the combinational logic block is zero, and the  $t_{clk-q}$  of the latch is zero too.

- a. Assume the following timing for the input  $I$ . Draw the timing diagram for the signals  $a$ ,  $b$ ,  $c$ ,  $d$  and  $e$ . Include the clock in your drawing.



Figure 0.6 Input timing

**Solution**



- b. State the deadline for the computation of the signal  $b$  and  $d$ , i.e. when is the latest time they can be computed, relative to the clock edges. In your diagram for part (a), label with a “ $<-->$ ” the “slack time” that the signals  $b$  and  $d$  are ready before the latest time they must be ready.

**Solution**

$b$  should be ready before the rising edge of  $CLK$  for the negative latch to latch and hold its value.  $d$  should be ready before the falling edge of  $CLK$  for the second positive latch to latch and hold its value.

- c. Hence deduce how much the clock period can be reduced for this shortened pipeline. Draw the modified timing diagram for the signals  $a$ ,  $b$ ,  $c$ ,  $d$ , and  $e$ . Include the clock in your drawing.

**Solution**

The clock period can be reduced by 20 ns.

In general, it may be difficult to identify how much slack can be removed from the clock because it depends on the length of the pipeline too.

4. Consider the circuit shown in Figure 0.7.



**Figure 0.7** Sequential Circuit

- a. Use SPICE to measure $t_{max}$ and $t_{min}$. Use a minimum-size NAND gate and inverter. Assume no skew and a zero rise/fall time. For the registers, use the following:
  - A TSPC Register.
  - A C<sup>2</sup>MOS Register.

#### Solution

From Figure 0.8 and Figure 0.9 we can see that for the TSPC Register:

$t_{r,max}=175\text{ps}$ ,  $t_{r,min}=94\text{ps}$ ,  $t_{and,max}=90\text{ps}$ ,  $t_{and,min}=83\text{ps}$ . (Note that we don't need the inverter to implement the logic when we use TSPC Registers).

$$\text{So } T > t_{r,max} + t_{and,max} = 265\text{ps.}$$

From Figure 0.10 and Figure 0.11 we can see that for the C<sup>2</sup>MOS Register:

$$t_{r,max}=82\text{ps}, t_{r,min}=49\text{ps}, t_{inv,max}=40\text{ps}, t_{inv,min}=33\text{ps}.$$

$$\text{So } T > t_{r,max} + t_{inv,max} + t_{and,max} = 212\text{ps.}$$

**Figure 0.8** $t_{r,\max}$ and $t_{r,\min}$

**Figure 0.9** $t_{and,\max}$ and $t_{and,\min}$

**Figure 0.10** $t_{r,\max}$ and $t_{r,\min}$

**Figure 0.11** $t_{inv,max}$ and $t_{inv,min}$

- b. Introduce clock skew, both positive and negative. How much skew can the circuit tolerate and still function correctly?

#### Solution

We will examine the case with the TSPC Register.

The maximum positive skew that the circuit can tolerate is 100ps. Figure 0.12 shows the correct operation with no skew. Figure 0.13 shows the cases with skew of 100ps and 110ps. In the second case, with 110ps of skew, the output is corrupted.

**Figure 0.12** No skew

**Figure 0.13** 100ps skew and 110ps skew

When the clock is routed in the opposite direction of the data (negative skew) the circuit operates correctly, with a negative impact on the circuit performance.

- c. Introduce finite rise and fall time to the clocks. Show what can occur and describe why.

**Solution**

As the rise and fall times of the clock increase, both chains of the C<sup>2</sup>MOS register are on simultaneously. The first graph shows the correct operation of the register, while in the next two graphs rise and fall times of 2ns and 3ns respectively are introduced.



**Figure 0.14** 0ns rise and fall times



**Figure 0.15** 2ns and 3ns rise and fall times

5. Consider the following latch based pipeline circuit shown in Figure 0.16.

Assume that the input, *IN*, is valid (i.e., set up) 2ns before the falling edge of *CLK* and is held till the falling edge of *CLK* (there is no guarantee on the value of *IN* at other times). Determine the maximum *positive* and *negative* skew on *CLK'* for correct functionality.



Figure 0.16 Latch based

**Solution**

The positive and negative skew are given after the analysis below.

Positive Skew

$$\delta_{MAX}^+ = t_{D-Q} + t_{CD} = 2\text{ns}$$

Negative Skew

$$\delta_{MAX}^- = 2 + \frac{T}{2} + t_p + t_{su} = 3\text{ns}$$

6. For the L1-L2 latch based system from Figure 0.17, with two overlapping clocks, derive all the necessary constraints for proper operation of the logic. The latches have setup times $T_{SU1}$ and $T_{SU2}$, data-to-output delays $T_{D-Q1}$ and $T_{D-Q2}$, clock-to-output delays $T_{Clk-Q1}$ and $T_{Clk-Q2}$, and hold times $T_{H1}$ and $T_{H2}$, respectively. Relevant clock parameters are also illustrated in Figure 0.17. The constraints should relate the logic delays, clock period, overlap time $T_{OV}$, and pulse widths $PW1$ and $PW2$ to latch parameters and skews.



Figure 0.17 Timing constraints

#### Solution

Latest arrival of the D2 signal in the current clock cycle (“Setup 2”)

$$PW2 \geq T_{ov} + T_{su2} - T_{su1} + T_{D-Q1} + T_{skr2} - T_{skr1}$$

$$PW1 + PW2 \geq T_{ov} + T_{su2} + T_{C-Q1} + T_{skl1} - T_{skr2}$$

Latest arrival of the D1 signal in the next clock cycle (“Setup 1”)

$$P \geq T_{D-Q1} + T_{D-Q2} + T_{gates}$$

$$PW1 \geq -P + T_{C-Q1} + T_{D-Q1} + T_{su1} + T_{gates} + T_{skl2} + T_{skr2}$$

$$P \geq -T_{ov} + T_{C-Q2} + T_{su1} + T_{gates} + T_{skl1} + T_{skl2}$$

Earliest changes of D1 signal (“Hold 1”)

$$T_{d, logic} > T_{ov} + T_{H1} + T_{skr1} + T_{skl2} - T_{C-Q2}$$

$$T_{d, logic} > PW1 + T_{H1} + T_{skr1} + T_{skl1} - T_{D-Q1} - T_{C-Q2}$$

Earliest changes of D2 signal (“Hold 2”)

$$PW2 < T_{H1} - T_{H2} + T_{D-Q1} + T_{ov} + T_{skr1} - T_{skr2}$$

$$PW1 + PW2 \geq T_{ov} + P + T_{C-Q1} - T_{H2} - T_{skr1} - T_{skr2}$$

7. For the self-timed circuit shown in Figure 0.18, make the following assumptions. The propagation through the NAND gate can be 5 nsec, 10 nsec, or 20 nsec with equal probability. The logic in the succeeding stages is such that the second stage is always ready for data from the first.

- a. Calculate the average propagation delay with  $t_{hs} = 6$  nsec.

**Solution**

$$t_p = \frac{5 + 10 + 20}{3} + 6 = 17.67\text{ns}$$

$$f = 56.6\text{MHz}$$

- b. Calculate the average propagation delay with $t_{hs}=12$ nsec.

**Solution**

$$t_p = \frac{5 + 10 + 20}{3} + 12 = 23.67\text{ns}$$

$$f = 42.2\text{MHz}$$

- c. If the handshaking circuitry is replaced by a synchronous clock, what is the smallest possible clock frequency?



**Figure 0.18** Self-timed circuit.

**Solution**

In setting clock frequency, we account for the longest delay:

$$f = \frac{1}{20\text{ns}} = 50\text{MHz}$$

Note that the delay in the handshaking circuit can be a strong factor in choosing clocking strategies.
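The comparison between the self-timed and synchronous cases can be laid out numerically. This is a small sketch with the delays given in the problem; the function names are illustrative.

```python
# Self-timed: average NAND delay plus handshake overhead.
# Synchronous: the clock must accommodate the slowest NAND delay.
NAND_DELAYS_NS = (5, 10, 20)  # equally likely propagation delays, ns


def average_self_timed_delay(t_hs_ns):
    """Average delay in ns for a given handshake overhead t_hs (ns)."""
    return sum(NAND_DELAYS_NS) / len(NAND_DELAYS_NS) + t_hs_ns


def frequency_mhz(period_ns):
    """Convert a period in ns to a frequency in MHz."""
    return 1e3 / period_ns


for t_hs in (6, 12):
    t_p = average_self_timed_delay(t_hs)
    print(f"t_hs = {t_hs} ns: t_p = {t_p:.2f} ns, f = {frequency_mhz(t_p):.1f} MHz")

f_sync = frequency_mhz(max(NAND_DELAYS_NS))
print(f"synchronous: f = {f_sync:.0f} MHz")
```

With a 6 ns handshake the self-timed circuit beats the 50 MHz synchronous clock, while with a 12 ns handshake it does not; the break-even overhead is the gap between the worst-case and average NAND delays.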

8. Lisa and Marcus Allen have a luxurious symphony hall date. After pulling out of their driveway, they pull up to a four-way stop sign. They pulled up to the sign at the same time as a car on the cross-street. The other car, being on the right, had the right-of-way and proceeded first. On the way they also have to stop at traffic signals. There is so much traffic on the freeway, the metering lights are on. Metering lights regulate the flow of merging traffic by allowing only one lane of traffic to proceed at a time. With all the traffic, they arrive late for the symphony and miss the beginning. The usher does not allow them to enter until after the first movement.

On this trip, Lisa and Marcus proceeded through both synchronizers and arbiters. Please list all and explain your answer.

**Solution**

At the stop sign, the law of "right of way" is the **arbiter** of two cars arriving at the same time.

The stop light may also be considered an **arbiter** as it ensures that two cars don't try to enter the intersection simultaneously; however, it is more like a **synchronizer** as it allows traffic into an intersection only at specific times.

The metering lights are **synchronizers** as they allow cars to enter the freeway at distinct times.

The usher is a **synchronizer** making sure that people go in and out of the concert at the proper times.

9. Design a self-timed FIFO. It should be six stages deep and have a two phase handshaking with the outside world. The black-box view of the FIFO is given in Figure 0.19.



**Figure 0.19** Overall structure of FIFO.

### Solution

The block diagram of the FIFO is given in the next figure.



The registers are dual-edge triggered on the enable signal, and Done is just a delayed version of enable. The FIFO is full if the enable signals alternate between 0's and 1's. On the other hand, the FIFO is empty if all enable signals are equal (either 0 or 1).

10. System Design issues in self-timed logic

One of the benefits of using self-timed logic is that it delivers average-case performance rather than the worst-case performance that must be assumed when designing synchronous circuits. In some applications where the average and worst cases differ significantly you can have significant improvements in terms of performance. Here we consider the case of ripple carry addition. In a synchronous design the ripple carry adder is assumed to have a worst case performance which means a carry-propagation chain of length  $N$  for an  $N$ -bit adder. However, as we will prove during the course of this problem the average length of the carry-propagation chain assuming uniformly distributed input values is in fact  $O(\log N)$ !

- a. Given that  $p_n(v) = \Pr(\text{carry-chain of an } n\text{-bit addition is } \geq v \text{ bits})$ , what is the probability that the carry chain is of length  $k$  for an  $n$ -bit addition?

### Solution

The  $\Pr(\text{carry-chain} = k \text{ bits}) = \Pr(\text{carry-chain is } \geq k \text{ bits}) - \Pr(\text{carry-chain is } \geq k+1 \text{ bits})$ , which is:

$$P_k = p_n(k) - p_n(k + 1)$$

- b. Given your answer to part (a), what is the average length of the carry chain (i.e.,  $a_n$ )? Simplify your answer as much as possible.

Now  $p_n(v)$  can be decomposed into two mutually exclusive events, A and B, where A represents that a carry chain of length  $\geq v$  occurs in the first  $n-1$  bits, and B represents that a carry chain of length  $v$  ends on the  $n$th bit.

### Solution

The average length of the carry chain is simply the expected value of  $P_k$ , which is:

$$\begin{aligned}
 a_n &= E[P_k] = \sum_{i=0}^n i \cdot (p_n(i) - p_n(i+1)) \\
 &= p_n(1) - p_n(2) + 2p_n(2) - 2p_n(3) + \dots \\
 &= p_n(1) + p_n(2) + \dots + p_n(n) \\
 &= \sum_{i=1}^n p_n(i)
 \end{aligned}$$

c. Derive an expression for  $\Pr(A)$ .

**Solution**

$\Pr(A)$  is simply  $p_{n-1}(v)$ .

d. Derive an expression for  $\Pr(B)$ . (HINT: a carry bit  $i$  is propagated only if  $a_i \neq b_i$ , and a carry chain begins only if  $a_i = b_i = 1$ ).

**Solution**

For  $B$  to occur a carry must be generated in bit  $(n - v)$  and then propagated all the way to bit  $n$ . In addition we must ensure that no carry chain of length  $v$  occurs in the initial  $(n - v)$  bits. The probability of a carry being generated is  $\Pr(a_i = b_i = 1) = (1/2)^2 = 1/4$ , and the probability of this carry being propagated until bit  $n$  is  $\Pr(a_i \neq b_i)^{v-1} = (1/2)^{v-1}$ . The probability of a carry chain of length  $v$  not occurring in the first  $(n-v)$  bits is  $(1 - p_{n-v}(v))$ . Hence the probability of event  $B$  occurring is:

$$\Pr(B) = (1 - p_{n-v}(v)) \cdot \frac{1}{4} \cdot \frac{1}{2^{v-1}} = \frac{1 - p_{n-v}(v)}{2^{v+1}}$$

e. Combine your results from (c) and (d) to derive an expression for  $p_n(v) - p_{n-1}(v)$  and then bound this result from above to yield an expression in terms of only the length of the carry chain (i.e.,  $v$ ).

**Solution**

From the question we are given that:

$$p_n(v) = \Pr(A) + \Pr(B) = p_{n-1}(v) + \frac{1 - p_{n-v}(v)}{2^{v+1}}$$

So all we have to do is substitute in our values of  $\Pr(A)$  and  $\Pr(B)$  and then rearrange the equation to yield the required expression:

$$\begin{aligned}
 p_n(v) &= p_{n-1}(v) + \frac{1 - p_{n-v}(v)}{2^{v+1}} \\
 \Rightarrow p_n(v) - p_{n-1}(v) &= \frac{1 - p_{n-v}(v)}{2^{v+1}}
 \end{aligned}$$

Since  $p_{n-v}(v)$  is a probability it is non-negative and hence we can bound  $(1 - p_{n-v}(v))$  from above by 1, thus:

$$p_n(v) - p_{n-1}(v) \leq \frac{1}{2^{v+1}}$$

**f.** Using what you've shown thus far, derive an upper bound for the expression:

$$\sum_{i=v}^n (p_i(v) - p_{i-1}(v))$$

Use this result, coupled with the fact that  $p_n(v)$  is a probability (i.e., it's bounded from above by 1), to determine a two-part upper bound for  $p_n(v)$ .

**Solution**

To derive the first upper bound we expand the given summation and collect terms:

$$\sum_{i=v}^n (p_i(v) - p_{i-1}(v)) = p_v(v) - p_{v-1}(v) + p_{v+1}(v) - p_v(v) + \dots + p_n(v) - p_{n-1}(v) = p_n(v)$$

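where the intermediate terms cancel in pairs and  $p_{v-1}(v) = 0$  (a chain of length  $v$  cannot occur in  $v-1$  bits). The remaining term,  $p_n(v)$ , can be bounded from above by 1.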

The second upper bound is calculated using the expression that we derived in (e), and substituting it into the given summation. Note that the expression derived in (e) is independent of the summation variable and hence the result is simply  $(n - v + 1)$  times the bound given in (e):

$$\sum_{i=v}^n (p_i(v) - p_{i-1}(v)) \leq \frac{n-v+1}{2^{v+1}}$$

Combining the two results we get the final, dual-valued upper bound on  $p_n(v)$ :

$$p_n(v) \leq \min\left\{1, \frac{n-v+1}{2^{v+1}}\right\}$$

**g.** (The magic step!) Bound  $n$  by a clever choice of  $k$  such that  $2^k \leq n \leq 2^{k+1}$  and exploit the fact that  $\log_2 x$  is concave down on  $(0, \infty)$  to ultimately derive that  $a_n \leq \log_2 n$ , which concludes your proof!

**Solution**

Go back to our original derivation of the average carry chain length (i.e.,  $E[P_k]$ ), and split the summation into two parts: those terms from 1 to  $(k-1)$ , and those terms from  $k$  to  $n$ .

$$E[P_k] = \sum_{i=1}^{k-1} p_n(i) + \sum_{i=k}^n p_n(i)$$

Now utilize your two upper bounds from (f) to bound the above expression:

$$\begin{aligned}
 E[P_k] &\leq \sum_{i=1}^{k-1} 1 + \sum_{i=k}^n \frac{n-i+1}{2^{i+1}} \\
 &= (k-1) + \sum_{i=k}^n \frac{n-i+1}{2^{i+1}} \\
 &\leq (k-1) + \sum_{i=k}^n \frac{n}{2^{i+1}} \\
 &= (k-1) + \frac{n}{2} \sum_{i=k}^n 2^{-i} \\
 &= (k-1) + \frac{n}{2} \left( \frac{1}{2^{k-1}} - \frac{1}{2^n} \right) \\
 &\leq (k-1) + \frac{n}{2^k}
 \end{aligned}$$

which, for a fixed  $k$ , is a linear function of  $n$ . At the limits defined for  $n$  the bound evaluates to:

$$n = 2^k \rightarrow E[P_k] \leq (k-1) + \frac{2^k}{2^k} = k = \log_2 n$$

$$n = 2^{k+1} \rightarrow E[P_k] \leq (k-1) + \frac{2^{k+1}}{2^k} = k+1 = \log_2 n$$

Since  $\log_2 n$  is concave down on  $(0, \infty)$  and coincides with the linear function at both endpoints, it lies on or above the linear function everywhere in between (e.g., Figure 0.20), so  $\log_2 n$  is an upper bound of the linear function of  $n$  derived above. Hence  $E[P_k] \leq \log_2 n$  and we are finished.



Figure 0.20 Comparison of  $\log_2 n$  and a Linear Function of  $n$

**h.** Theoretically speaking, how much faster would a self-timed 64-bit ripple carry adder be than its synchronous counterpart? (You may assume that the overhead costs of using self-timed logic are negligible.)

**Solution**

Given that a self-timed ripple-carry adder requires a delay on the order of  $\log_2 n$ , while a synchronous version requires a delay on the order of  $n$ , the improvement is:

$$\text{Speedup} = \frac{\text{Speed}_{\text{self-timed}}}{\text{Speed}_{\text{synchronous}}} = \frac{1/(\log_2 64)}{1/64} = \frac{64}{\log_2 64} = \frac{32}{3}$$
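As a numerical sanity check on parts (b) through (h), the recurrence from part (e) can be evaluated directly and summed as in part (b). The sketch below assumes the boundary condition $p_m(v) = 0$ for $m < v$ (a chain of length $v$ cannot fit in fewer than $v$ bits); the function names are illustrative only.

```python
from math import log2


def chain_prob(n, v):
    """p_n(v): probability of a carry chain of length >= v in an n-bit addition.

    Uses the recurrence from part (e):
        p_m(v) = p_{m-1}(v) + (1 - p_{m-v}(v)) / 2**(v + 1)
    with the assumed boundary condition p_m(v) = 0 for m < v.
    """
    probs = [0.0] * (n + 1)
    for m in range(v, n + 1):
        probs[m] = probs[m - 1] + (1.0 - probs[m - v]) / 2 ** (v + 1)
    return probs[n]


def average_chain(n):
    """a_n = sum over v = 1..n of p_n(v), as derived in part (b)."""
    return sum(chain_prob(n, v) for v in range(1, n + 1))


for n in (8, 16, 32, 64):
    a_n = average_chain(n)
    print(f"n={n:3d}  a_n={a_n:.3f}  log2(n)={log2(n):.0f}  n/a_n={n / a_n:.1f}")
```

For the word lengths printed here the computed average stays below $\log_2 n$, in line with the bound from part (g). The last column gives the ratio of worst-case to average chain length under this model.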

**11.** Figure 0.23 shows a simple synchronizer. Assume that the asynchronous input switches at a rate of approximately 10 MHz and that  $t_r = 2$  nsec,  $f_\phi = 50$  MHz,  $V_{IH} - V_{IL} = 0.5$  V, and  $V_{DD} = 2.5$  V.

**a.** If all NMOS devices are minimum-size, find  $(W/L)_p$  required to achieve  $V_{MS} = 1.25$  V. Verify with SPICE.

**Solution**

Metastability occurs when both inputs to the cross coupled NANDs are high. One NAND of the cross-coupled pair is shown in Figure 0.21 a.



Figure 0.21 a) Nand Gate, b) Simplified Gate

To solve for the metastable point, we can simplify it to the gate shown in Figure 0.21 b, where the NMOS device size has been modified accordingly. Setting  $\text{Out} = \overline{\text{Out}} = 1.25\text{V}$  and assuming that both devices are velocity saturated, we have:

$$k_n \frac{W_n}{L} V_{DSATn} \left( V_{OUT} - V_{TN} - \frac{V_{DSATn}}{2} \right) + k_p \frac{W_p}{L} V_{DSATp} \left( V_{OUT} - V_{DD} - V_{TP} - \frac{V_{DSATp}}{2} \right) = 0$$

$$115 \times 10^{-6} \frac{0.375}{0.5} 0.63 \left( 1.25 - 0.43 - \frac{0.63}{2} \right) + (-30 \times 10^{-6}) \frac{W_p}{L} (-1) \left( 1.25 - 2.5 - (-0.4) - \frac{(-1)}{2} \right) = 0$$

which results in  $(W/L)_p = 2.6$ .

Hspice simulation gives  $V_M=1.23\text{V}$ .
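The hand calculation above can be reproduced with a short script. This is a minimal sketch that simply plugs the numbers from the equation into the velocity-saturated current balance and solves for the PMOS ratio; the variable names are chosen for this example.

```python
# Device parameters as they appear in the equation above
k_n, k_p = 115e-6, -30e-6        # process transconductances (A/V^2)
wl_n = 0.375 / 0.5               # effective NMOS W/L of the simplified gate
vdsat_n, vdsat_p = 0.63, -1.0    # velocity-saturation voltages (V)
vt_n, vt_p = 0.43, -0.4          # threshold voltages (V)
vdd, v_ms = 2.5, 1.25            # supply and target metastable point (V)

# NMOS current term at Vout = V_MS
i_n = k_n * wl_n * vdsat_n * (v_ms - vt_n - vdsat_n / 2)

# PMOS current term per unit (W/L)_p at Vout = V_MS
i_p_unit = k_p * vdsat_p * (v_ms - vdd - vt_p - vdsat_p / 2)

# Current balance: i_n + (W/L)_p * i_p_unit = 0
wl_p = -i_n / i_p_unit
print(f"(W/L)p = {wl_p:.2f}")
```

The script gives a ratio of about 2.6, matching the value quoted above.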

**b.** Use SPICE to find  $\tau$  for the resulting circuit.

### Solution

The time constants for the metastable point to Vdd and ground are measured in Figure 0.22 and the two values are 237ps and 362ps.



**Figure 0.22** Measuring  $\tau$

**c.** What waiting time  $T$  is required to achieve an MTF of 10 years?

### Solution

We can use the following equation, with the largest time constant, to find the waiting time  $T$ .

$$\frac{1}{MTF} = \frac{(V_{IH} - V_{IL})e^{-T/\tau}}{V_{SWING}} \frac{t_r}{T_{SIGNAL} T_\phi}$$

$$(315360000\text{s})^{-1} = \frac{(0.5)e^{-T/(362\text{ps})}}{2.5} \frac{2\text{ns}}{\frac{1}{10\text{MHz}} \times \frac{1}{50\text{MHz}}}$$

$$(315360000\text{s})^{-1} = 200 \times 10^3 e^{-T/(362\text{ps})}$$

from which we get:

$$T = 11.5\text{ns}$$
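The same result can be obtained by solving the MTF equation for $T$ in closed form, $T = \tau \ln\left(MTF \cdot \frac{V_{IH} - V_{IL}}{V_{SWING}} \cdot \frac{t_r}{T_{SIGNAL} T_\phi}\right)$. The sketch below evaluates this with the values used above; the function and parameter names are illustrative.

```python
import math


def waiting_time(mtf_s, tau, v_window, v_swing, t_r, f_signal, f_clk):
    """Solve 1/MTF = (v_window/v_swing) * exp(-T/tau) * t_r * f_signal * f_clk for T."""
    # Failure rate with no waiting time (T = 0), in failures per second
    rate_at_zero = (v_window / v_swing) * t_r * f_signal * f_clk
    return tau * math.log(mtf_s * rate_at_zero)


ten_years = 10 * 365 * 24 * 3600          # 315,360,000 s
T = waiting_time(mtf_s=ten_years, tau=362e-12, v_window=0.5, v_swing=2.5,
                 t_r=2e-9, f_signal=10e6, f_clk=50e6)
print(f"T = {T * 1e9:.1f} ns")
```

Because $T$ depends only logarithmically on the MTF, even large changes in the target MTF move the required waiting time by only a few time constants.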

**d.** Is it possible to achieve an MTF of 1000 years (where  $T > T_\phi$ )? If so, how?

### Solution



Figure 0.23 Simple synchronizer

$$\text{We have: } (315360000\text{s})^{-1} = 50 \times 10^3 e^{-T/(362\text{ps})}, \quad (T_\phi = 5\text{ns})$$

Solving yields:  $T = 9.42\text{ns} > T_\phi$

12. Explain how the phase-frequency comparator shown in Figure 0.24 works.



Figure 0.24 Phase-frequency comparator

The operation of the circuit is best explained with the timing diagrams below:



If the VCO clock ( $\bar{V}$ ) leads the reference clock then the  $\overline{\text{DOWN}}$  pulse is wider than the  $\overline{\text{UP}}$  pulse. That will eventually shift  $\bar{V}$  to the left so that the two clocks are locked.

The locked operation is shown in the next diagram. When the two clocks are locked, the  $\overline{\text{UP}}$  and  $\overline{\text{DOWN}}$  pulses have equal widths.

13. The heart of any static latch is the cross-coupled structure shown in Figure 0.25 (part a).

- a. Assuming identical inverters with  $W_p/W_n = k_n'/k_p'$ , what is the metastable point of this circuit? Give an expression for the time trajectory of  $V_Q$ , assuming a small initial  $V_{d0}$  centered around the metastable point of the circuit,  $V_M$ .



Figure 0.25 Simple synchronizer

### Solution

To find the metastable point of the circuit we just need to find the gate voltage of one inverter that gives the same voltage at the inverter output.

Assuming that both devices are velocity saturated and neglecting channel length modulation, we can add the pmos and nmos currents. The equation looks like:

$$k_n \cdot \frac{W_n}{L} V_{DSATn} \left( V_M - V_{TN} - \frac{V_{DSATn}}{2} \right) + k_p \cdot \frac{W_p}{L} V_{DSATp} \left( V_M - V_{DD} - V_{TP} - \frac{V_{DSATp}}{2} \right) = 0$$

Solving for  $V_M$ ,

$$V_M = \frac{V_{TN} + \frac{V_{DSATn}}{2} + \frac{V_{DSATp}}{V_{DSATn}} \left( V_{DD} + V_{TP} + \frac{V_{DSATp}}{2} \right)}{1 + \frac{V_{DSATp}}{V_{DSATn}}}$$

The time trajectory for the output can be modeled by:

$$v(t) = V_{MS} + (V_{d0} - V_{MS}) e^{t/\tau}$$

**b.** The circuit in part b has been proposed to detect metastability. How does it work? How would you generate a signal M that is high when the latch is metastable?

**Solution**

If  $Q = \bar{Q}$ , then the NMOS devices are off, so the PMOS devices will pull A and B high. That means that M going high indicates that the latch is in the metastable state.



**c.** Consider the circuit of part c. This circuit was designed in an attempt to defeat metastability in a synchronizer. Explain how the circuit works. What is the function of the delay element?

**Solution**

If the latch becomes metastable, then M will go high and turn on the appropriate NMOS, pulling the latch out of metastability. The time delay  $\tau$  gives the metastability detector and the latch time to pull out of metastability.

**14.** An adjustable duty-cycle clock generator is shown in Figure 0.26. Assume the delay through the delay element matches the delay of the multiplexer.

**a.** Describe the operation of this circuit.

**Solution**

The circuit works by using the overlap of two clocks from a ring oscillator to dictate the duty cycle. Longer overlap yields a greater duty cycle.



Clock signal from ANDing 1 + 4 gives 28.6% duty cycle.

- b. What is the range of duty cycles that can be achieved with this circuit?

**Solution**

The range of duty cycles is: 7-50%.

- c. Using an inverter and an additional multiplexer, show how to make this circuit cover the full range of duty cycles.



**Figure 0.26** Clock duty-cycle generator.

**Solution**

Inverting the output signal converts a 25% duty cycle to a 75% duty cycle.



**15.** The circuit style shown in Figure 0.27.a has been proposed by Acosta et al. as a new self-timed logic style. This structure is known as a Switched Output Differential Structure<sup>1</sup>.
- Describe the operation of the SODS gate in terms of its behavior during the pre-charge phase, and how a valid completion signal can be generated from its outputs.

**Solution**

Pre-charging is active low, and the inputs must become valid prior to the rising edge of  $\Phi$ . During pre-charge the outputs are shorted together and at some point one of the pull-down networks will provide a path to ground. This path to ground will turn on one of the two pull-up PMOS transistors, connecting the two outputs to  $V_{dd}$ . Hence the outputs are high during the pre-charge phase. During the evaluate phase the outputs will become complementary, so you can use a NAND gate to signal completion when its output goes high.

- What are the advantages of using this logic style in comparison to the DCVSL logic style given in the notes?

**Solution**

The advantage of using SODS is that the delay of the gate is independent of the topology of the pull-down networks, and the gate will be faster due to reduced output capacitance of the switching nodes.

- What are the disadvantages of using this style in comparison to DCVSL?

**Solution**

The disadvantage is that it requires the inputs to become valid before the pre-charge phase has ended. The outputs also exhibit reduced noise margins due to the problem discussed in (d). In addition, there can be significant static power dissipation during the evaluate phase if the gate is not designed carefully.

- Figure 0.27.b shows a 2-input AND gate implemented using a SODS style. Simulate the given circuit using Hspice. Do you notice any problems? Explain the cause of any problems that you may observe and propose a fix. Re-simulate your corrected circuit and verify that you have in fact fixed the problem(s).

---

<sup>1</sup> A.J. Acosta, M. Valencia, M.J. Bellido, J.L. Huertas, "SODS: A New CMOS Differential-type Structure," *IEEE Journal of Solid State Circuits*, vol. 30, no. 7, July 1995, pp. 835-838



Figure 0.27 a - SODS Logic Style



Figure 0.27 b - 2-input And Gate in SODS Style

**Solution**

Using an input of  $A=B=0$ , followed by  $A=B=1$  yields the outputs shown in the next graph. Note that there is a reduced noise margin in the outputs as they cannot be pulled to ground.



The reduced noise margins are caused by the fact that during the evaluate phase either node A or B will be pulled up to  $V_{dd}$  through an NMOS pass gate (Figure 8). For example: suppose that during the pre-charge phase node A is pulled low by its PDN and B is left to float, while nodes C and D are pre-charged to  $V_{dd}$ .

When  $\Phi$  goes high the outputs are connected to A and B via  $M_3$  and  $M_4$ . Since A is connected to ground it will discharge C, pulling Out low as well. Node D will remain pulled up to  $V_{dd}$  via  $M_2$  and hence B will be pulled up to  $(V_{dd} - V_{tN})$  through  $M_4$ . This leaves the gate of  $M_1$  at  $(V_{dd} - V_{tN})$  as well, which will turn on  $M_1$  to some degree (as determined by the ratio of  $V_{tp}$  to  $V_{tn}$ ), causing some static current to flow through  $M_3$ , generating some voltage at Out given by  $I_{leakage}R_{on}(M_3)$ .



To reduce the problems the designer can either minimize the width of  $M_{1,2}$  (to reduce  $I_{leakage}$ ), or increase the width of  $M_{3,4}$  (to reduce  $R_{on}$ ). Figure 9 shows the effects of halving the size of  $M_{1,2}$ , while doubling the size of  $M_{3,4}$ .

#### 16. Voltage Controlled Ring Oscillator.

In this problem, we will explore a voltage-controlled oscillator based on John G. Maneatis' paper "Low Jitter Process-Independent DLL and PLL Based on Self-Biased Techniques," which appeared in the Journal of Solid-State Circuits in Nov. 1996. We will focus on a critical component of the PLL design: the voltage-controlled ring oscillator. Figure 0.28 shows the block diagram of a voltage-controlled ring oscillator:



**Figure 0.28** Voltage Controlled Ring Oscillator

The control voltage,  $Vctl$ , is sent to a bias generator that generates two voltages used to properly bias each delay cell equally, so that equal delay (assuming no process variations) appears across each delay cell. The delay cells are simple, "low-gain" fully differential input and output operational amplifiers that are connected in such a way that oscillations will occur at any one of the outputs with a frequency of  $1/(4*delay)$ . Each delay is modeled as an RC time constant;  $C$  comes from parasitic capacitances at the output nodes of the delay element, and R comes from the variable resistor that is the load for the delay cell. Below is a circuit schematic of a typical delay cell.



**Figure 0.29** One delay Cell

As mentioned before, the value of R is set by a variable resistor. How can one make a variable resistor? The object in the delay cell that is surrounded by a dotted line is called a “symmetric load,” and provides the answer to a voltage-controlled variable resistor. R should be linear so that the differential structure cancels power supply noise. We will begin our analysis with the symmetric load.

- a. In Hspice, input the circuit below and plot Vres on the X axis and Ires on the Y axis, for the following values of Vctlp: 0.5, 0.75, 1.0, 1.25, 1.5, 1.75, and 2.0 volts, by varying Vtest from Vctlp to Vdd, all on the same graph. For each curve, plot Vres from 0 volts to Vdd-Vctlp. When specifying the Hspice file, be sure to estimate area and perimeter of drains/sources.



**Figure 0.30** Symmetric Load Test Circuit

After you have plotted the data and printed it out, use a straight edge to connect the end points for each curve. What do you notice about intersection points between the line you drew over each curve, and the curves themselves? Describe any symmetries you see.

#### Solution

The lines intersect the curves at the “point of inflection” of each curve. Each curve is symmetric about this point: if we take the drawn line as the x axis, and a line perpendicular to it through the “point of inflection” as the y axis, the curve is x-y symmetric in these axes. Ideally, for symmetric loads, the “point of inflection” would occur at 1/2 the voltage sweep range and 1/2 the current output range; but due to non-linear effects, it is shifted towards the power rail.



- b. For each  $V_{ctlp}$  curve that you obtained in a), extract the points of symmetry ( $V_{res}$ ,  $I_{res}$ ), and find the slope of the line around these points of symmetry. These are the effective resistances of the resistors. Also, for each  $V_{ctl}$  curve, state the maximum amplitude the output swing can have without running into asymmetries. Put all of this data in a worksheet format.

#### Solution

Although the problem did not ask for  $1/gm$  of the symmetric load when  $V_{res}=V_{ctl}$ , it is worth tabulating, because it lets us compare which ‘effective resistance’ to use as an estimate: the slope at the “point of inflection”, or the effective resistance the symmetric load offers when it is in its lowest impedance state (when  $V_{res}=V_{ctl}$ , where  $1/gm$  is smallest).

$$gm=W/L*k'*(Vdd-Vctl-Vt)$$

where  $W$ =sum of both transistor widths in a symmetric load

$$k'=30e-6$$

$$W=48$$

$$L=2$$

$$Vt=.4$$

| $V_{ctl}$ (V) | $V_{res}$ (V) | $I_{res}$ | Slope (Ω) | $1/gm$ of the symmetric load when $V_{res}=V_{ctl}$ (Ω) |
|------|------|---------|-------|---------|
| 0.5  | 1.0  | 500 µA  | 2.26K | 868     |
| 0.75 | 0.7  | 300 µA  | 3K    | 1.028K  |
| 1    | 0.6  | 200 µA  | 4.76K | 1.262K  |
| 1.25 | 0.5  | 120 µA  | 9K    | 1.633K  |
| 1.5  | 0.4  | 80 µA   | 22.3K | 2.314K  |
| 1.75 | 0.25 | 25 µA   | 69K   | 3.968K  |
| 2    | 0.22 | 0.6 µA  | 2.2M  | 13.888K |
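The $1/gm$ values listed above follow directly from the $gm$ expression and the parameters given. The sketch below recomputes them; it assumes $V_{dd} = 2.5$ V, which is not stated in this part of the solution but is consistent with the tabulated numbers, and the names are illustrative.

```python
K_PRIME = 30e-6        # k' (A/V^2), from the solution
W_OVER_L = 48 / 2      # combined width of both load transistors over L
VT = 0.4               # threshold voltage magnitude (V)
VDD = 2.5              # supply voltage (V); assumed, matches the tabulated values


def r_gm(vctl):
    """1/gm of the symmetric load at Vres = Vctl, using gm = (W/L) * k' * (Vdd - Vctl - Vt)."""
    gm = W_OVER_L * K_PRIME * (VDD - vctl - VT)
    return 1.0 / gm


for vctl in (0.5, 0.75, 1.0, 1.25, 1.5, 1.75, 2.0):
    print(f"Vctl = {vctl:4.2f} V   1/gm = {r_gm(vctl):8.0f} ohm")
```

The resistance rises steeply as $V_{ctl}$ approaches $V_{dd} - V_t$, because the gate overdrive of the load devices goes to zero there.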

For each symmetric load  $V_{ctl}$  setting, the theoretical upper swing limit is  $V_{dd}$ , and the lower swing limit is  $V_{ctl}$ . Thus, the total swing is  $V_{dd}-V_{ctl}$ . We will be using the bias generator to “set” the  $V_{ctln}$  for a given  $V_{ctl}$  ( $V_{ctlp}$  is essentially  $V_{ctl}$ ) such that half the current runs through each of the delay cell legs. This theoretically biases the delay cell common mode outputs to  $V_{dd}-V_{ctl}/2$ , which is supposed to be the point of inflection for a given  $V_{ctl}$  voltage; however, since the symmetric loads are not perfectly symmetric, we will analyze and see how well the assumption holds. A way to correct this, so that the symmetry point occurs where we expect it to, would be relative sizing between the symmetric load transistors; but keep in mind that this actually invalidates the symmetry altogether for certain ranges of bias currents (try it, you will see).

- c. Using the estimations you made for area and perimeter of drain and source that you put in your Hspice file, calculate the effective capacitance. (Just multiply area and perimeter by CJ and CJSW from the spice deck). Since we are placing these delay elements in a cascaded fashion, remember to INCLUDE THE GATE CAPACITANCE of the following stage. Each delay element is identical to one another. Now, calculate the delay in each cell, according to each setting of Vctlp that you found in a): delay=0.69\*R\*C. Then, write a general equation, in terms of R and C, for the frequency value that will appear at each delay output. Why is it necessary to cross the feedback lines for the ring oscillator in the first figure? Finally, draw a timing/transient analysis of each output node of the delay lines. How many phases of the base frequency are there?

### Solution

Capacitance Estimations: ( $\lambda$  is .125e-6)

Area of drain/source pmos in symmetric:  $24 * \lambda * .625e-6 = 1.875e-12$

Perimeter of drain/source pmos in symmetric:  $24 * \lambda + 1.5e-6 = 4.5e-6$

Area of drain/source nmos input:  $36 * \lambda * .625e-6 = 2.8125e-12$

Perimeter of drain/source nmos input:  $36 * \lambda + 1.5e-6 = 6e-6$

$C_{gdon} = 3.1e-10$

$C_{gdop} = 2.7e-10$

$C_{jn} = 2e-3$

$C_{jp} = 1.9e-3$

$C_{jswn} = 2.75e-10$

$C_{jswp} = 2.232e-10$

$C_{ox} = 6e-3$

diode connected pmos contributes:

$$C_{gp} = C_{gdop} * W_p + C_{ox} * W_p * L_p = 2.7e-10 * 24 * \lambda + 6e-3 * 24 * 2 * \lambda^2 = 5.31e-15$$

$$C_{db} = C_{jp} * A_{Dp} + C_{jswp} * P_{Dp} = 1.9e-3 * 1.875e-12 + 2.232e-10 * 4.5e-6 = 4.566e-15$$

current source pmos contributes:

$$C_{gd} = C_{gdop} * W_p = 2.7e-10 * 24 * \lambda = 8.1e-16$$

$$C_{db} = C_{jp} * A_{Dp} + C_{jswp} * P_{Dp} = 1.9e-3 * 1.875e-12 + 2.232e-10 * 4.5e-6 = 4.566e-15$$

Input gate transistor contributes:

$$C_{gd} = C_{gdon} * W_n = 3.1e-10 * 36 * \lambda = 1.395e-15$$

$$C_{db} = C_{jn} * A_{Dn} + C_{jswn} * P_{Dn} = 2e-3 * 2.8125e-12 + 2.75e-10 * 6e-6 = 7.275e-15$$

load capacitance presented by gate capacitance of following stage:

$$C_g = (2 * C_{gdon} + C_{gs}) * W_n + C_{ox} * W_n * L_n = (3 * 3.1e-10) * 36 * \lambda + 6e-3 * \lambda^2 * 36 * 2 = 10.9e-15$$

Total load capacitance:  $34.817e-15$  farads

In spice, the actual capacitance is  $27.379e-15$  farads. Pretty good estimation!
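The capacitance bookkeeping above is easy to get wrong by hand, so a short script that repeats the same sums may be useful. It is a sketch that follows the expressions in this solution term by term; the variable names are chosen here for illustration. Summing the unrounded terms gives roughly 34.9 fF, close to the 34.817 fF quoted above (the small difference is presumably rounding in the hand calculation).

```python
LAM = 0.125e-6                      # lambda (m)
CGDO_N, CGDO_P = 3.1e-10, 2.7e-10   # overlap capacitance (F/m)
CJ_N, CJ_P = 2e-3, 1.9e-3           # junction capacitance (F/m^2)
CJSW_N, CJSW_P = 2.75e-10, 2.232e-10  # sidewall capacitance (F/m)
COX = 6e-3                          # gate oxide capacitance (F/m^2)

W_P, W_N, L = 24 * LAM, 36 * LAM, 2 * LAM
AD_P, PD_P = W_P * 0.625e-6, W_P + 1.5e-6   # PMOS drain area and perimeter
AD_N, PD_N = W_N * 0.625e-6, W_N + 1.5e-6   # NMOS drain area and perimeter

terms = {
    "diode-connected PMOS gate":  CGDO_P * W_P + COX * W_P * L,
    "diode-connected PMOS drain": CJ_P * AD_P + CJSW_P * PD_P,
    "current-source PMOS Cgd":    CGDO_P * W_P,
    "current-source PMOS drain":  CJ_P * AD_P + CJSW_P * PD_P,
    "input NMOS Cgd":             CGDO_N * W_N,
    "input NMOS drain":           CJ_N * AD_N + CJSW_N * PD_N,
    "next-stage gate":            3 * CGDO_N * W_N + COX * W_N * L,
}

for name, value in terms.items():
    print(f"{name:28s} {value * 1e15:7.3f} fF")

c_total = sum(terms.values())
print(f"{'total':28s} {c_total * 1e15:7.3f} fF")
```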

Here, we are also adding the analysis of  $R_{gm}$  (1/gm of symmetric load when  $Vctl = V_{res}$  of symmetric load) on delay:

| $V_{ctl}$ (V) | $0.69 * R_{slope} * C$ (s) | $0.69 * R_{gm} * C$ (s) |
|------|----------|-----------|
| 0.5  | 5.457e-11 | 1.79e-11  |
| 0.75 | 7.24e-11  | 2.127e-11 |
| 1    | 1.1e-10   | 2.61e-11  |
| 1.25 | 2.17e-10  | 3.38e-11  |
| 1.5  | 5.38e-10  | 4.78e-11  |
| 1.75 | 1.666e-9  | 8.2e-11   |
| 2    | 5.3e-8    | 2.87e-10  |

delay=0.69\*R\*C

The frequency will be  $1/(2*4*delay)$ , because each “high” and each “low” level at a ring oscillator output is valid for 4 delay times. Thus, one full period takes two passes through the four delay blocks. It’s necessary to cross the lines so we can get an odd number of inversions, while exceeding the “hold time” of the “first” delay block when we feed back the inverted signal.

Basically you will have 4 phases of a clock, and for each phase, you will also have the inverted phase. 8 signals total. Some of them are overlapping one another.
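The frequency relation above can be applied to the per-cell delays tabulated earlier in this part. The sketch below uses the delay values computed with the slope resistance; the function name and the dictionary layout are assumptions made for this example.

```python
N_STAGES = 4   # delay cells in the ring


def ring_frequency(cell_delay):
    """Oscillation frequency: one period is two passes through all delay cells."""
    return 1.0 / (2 * N_STAGES * cell_delay)


# Per-cell delays (s) from the 0.69*Rslope*C column, keyed by Vctl (V)
delay_slope = {
    0.5: 5.457e-11, 0.75: 7.24e-11, 1.0: 1.1e-10, 1.25: 2.17e-10,
    1.5: 5.38e-10, 1.75: 1.666e-9, 2.0: 5.3e-8,
}

for vctl, delay in delay_slope.items():
    print(f"Vctl = {vctl:4.2f} V   f = {ring_frequency(delay) / 1e6:9.1f} MHz")
```

For example, the 54.57 ps delay at $V_{ctl} = 0.5$ V corresponds to about 2.29 GHz, and the 53 ns delay at $V_{ctl} = 2$ V to about 2.4 MHz.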

**d.** Now, we will look at the bias generator. The circuit for the bias generator is as follows:



**Figure 0.31** Bias Generator

Implement this circuit in Hspice, and use the ideal voltage-controlled voltage source for your amplifier. Use a value of 20 for A. This circuit automatically sets the Vctln and Vctlp voltages supplied to the delay buffers, which set the DC operating points of the delay cells such that the symmetric load swings symmetrically around its point of symmetry for a given Vctl voltage. Also, it is important to note that Vctlp is the same as Vctl; the circuit goes through this arrangement to obtain Vctln (which sets the bias current to the correct value, which in turn sets the DC operating point of the buffer). Do a transient run in Hspice to verify that Vctlp is indeed very close to Vctl over a range of inputs for Vctl. Show a Spice transient simulation that runs for 1 µs and switches Vctl with a pwl waveform across a range of inputs between 0.5 V and 2.0 V. For extra points, explain how this circuit works.

#### Solution

See the following figure.



We will refer to the two legs of current in the bias generator that each contain one symmetric load as “delay cell replicas.” Placing one of these replicas in a feedback loop lets us set  $V_{ctlp}$  equal to  $V_{ctl}$  (thereby setting the symmetric load to the lowest swing point for the given  $V_{ctl}$  voltage). This process also generates the correct  $V_{ctln}$ , which gives rise to a certain current ‘ $I$ ’ that produces the desired voltage for  $V_{res}$  such that  $V_{res}=V_{ctl}$ . Note the sizes of the transistors; the lowest NMOS device has the same width as an actual delay cell’s NMOS current sink device. Thus, the NMOS current sources of both the delay cell replica and the actual delay cell sink the same current. However, since there are two incoming current legs to the NMOS current sink of the delay cell, the symmetric loads in the delay cell each see  $I/2$ , thus automatically biasing the dc operating point of the delay cell to the symmetric load’s point of symmetry (theoretically).

- e. Now, hook up the bias generator you just built with 4 delay cells, as shown in the first figure. For each control voltage  $V_{ctlp}$  from part c), verify your hand calculations with spice simulations. Show a spreadsheet of obtained frequencies vs. hand-calculation predictions, and in a separate column, calculate % error. Give a brief analysis of what you see. Print out all of the phases (4) of the clock, for a  $V_{ctl}$  value of your choice.

### Solution

The spreadsheet is given here.

| $V_{ctl}$ (V) | Measured | Calculated w/ Rslope | Calculated w/ Rgm | % error from Rslope | % err from Rgm |
|------|---------|---------|--------|-----|--------|
| 2    | 3.95MHz | 2.3MHz  | 435MHz | 41% | 11000% |
| 1.75 | 384MHz  | 75MHz   | 1.5GHz | 80% | 390%   |
| 1.5  | 1.6GHz  | 232MHz  | 2.6GHz | 85% | 162.5% |
| 1.25 | 2GHz    | 576MHz  | 3.6GHz | 72% | 180%   |
| 1    | 2.5GHz  | 1.13GHz | 4.7GHz | 54% | 188%   |
| .75  | 3.7GHz  | 1.72GHz | 5.8GHz | 53% | 156%   |
| .5   | 4.2GHz  | 2.29GHz | 6.9GHz | 45% | 164%   |

Note that, as tabulated, the last column equals the Rgm-based estimate expressed as a percentage of the measured frequency, while the Rslope column is the shortfall relative to the measured frequency.

See the following figures.





This raises the question: why are these off by so much?

The delay equation of  $0.69*R*C$  is what we used. However, in Maneatis' paper, he calculates delay using just  $R*C$ , where  $R$  is the  $R_{gm}$  that we have included in this problem set solution. If we go back and calculate our estimates using Maneatis' expression for delay, we come up with less error in the % err from  $R_{gm}$  column. However, this still does not explain the remaining, glaringly large error. If we look again at the figure in part (d), we can see that the lower swing limit of the buffers never reaches the lowest point,  $V_{ctlp}$ . This is because we are not putting in enough delay elements to make the overall frequency slow enough for the delay cells to input and output the full swing range. Thus, our estimate of  $R$  using the slope and  $1/gm$  is inaccurate. The overall delay only allows "limited swinging." Another effect that may be appearing is the much degraded gds output resistance of short channel devices. If gds begins to appear in the range of gm, then we will see a reduction in measured frequency vs. calculated frequency. In any case, the VCO does not need to be characterized in an absolute voltage to frequency relation; it is only required that the transfer from voltage to frequency is linear, or at least that the slope of the voltage to frequency curve has the same polarity at all times. When placed in a feedback loop, the non-linearities of the voltage to frequency curve of the VCO will be compensated for.

