

# ECE 558 Lab 2 - Step 7: Effort-Based Sizing to Drive Load Capacitance

**Student:** Jayakishan

**Student ID Last 3 Digits (summed):**  $8 + 8 + 3 = 19$

**Load Size:**  $X = 90 + 19 = \mathbf{109}$

---

## RESULT 7.1 — Calculated Delays Based on Path Effort (in units of $\tau$ )

### Path Effort Analysis

The complete signal path from Q to S3 consists of **3 driving stages** followed by the load inverter:

1. Q → SOUT (existing min inverter)
2. SOUT → S1 (Driver 1)
3. S1 → S2 (Driver 2)
4. S2 → S3 (load inverter, 109 $\times$ ; this is the load being driven, not a sized stage)

**Path Effort:**  $F = 109$

**Optimal Stage Effort:**  $f = F^{(1/3)} = 109^{(1/3)} = \mathbf{4.78}$

### Delay Calculations

**Direct Connection (no driver):**

- Delay =  $g \times h + p = 1 \times 109 + 1 = \mathbf{110 \tau}$

**With 2-Stage Driver (3 total stages):**

- Total delay =  $N \times (1 + f) = 3 \times (1 + 4.78) = \mathbf{17.34 \tau}$
- Delay per stage = **5.78  $\tau$**

| Stage        | Description     | Effort         | Delay ( $\tau$ ) |
|--------------|-----------------|----------------|------------------|
| Q → SOUT     | Min inverter    | $f = 4.78$     | 5.78             |
| SOUT → S1    | Driver 1        | $f = 4.78$     | 5.78             |
| S1 → S2      | Driver 2        | $f = 4.78$     | 5.78             |
| <b>Total</b> | <b>3 stages</b> | <b>F = 109</b> | <b>17.34</b>     |

**Improvement:**  $110 \tau / 17.34 \tau \approx \mathbf{6.3\times}$  faster than the direct connection
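As a cross-check, the hand calculation above can be reproduced in a few lines of Python. This is a minimal sketch that assumes the same model as the text (inverters only, so $g = 1$ and $p = 1$ for every stage); the function names are illustrative and not part of the lab materials.

```python
def optimal_stage_effort(path_effort: float, n_stages: int) -> float:
    """Stage effort f = F^(1/N) that equalizes effort across N stages."""
    return path_effort ** (1.0 / n_stages)


def path_delay_tau(path_effort: float, n_stages: int, p_inv: float = 1.0) -> float:
    """Minimum path delay N * (p_inv + f) in units of tau, assuming g = 1 per stage."""
    return n_stages * (p_inv + optimal_stage_effort(path_effort, n_stages))


X = 109  # load size from this report

direct = path_delay_tau(X, 1)  # minimum inverter driving the load directly
driven = path_delay_tau(X, 3)  # minimum inverter plus the 2-stage driver

print(f"f = {optimal_stage_effort(X, 3):.2f}")
print(f"direct = {direct:.2f} tau, with driver = {driven:.2f} tau")
print(f"speedup = {direct / driven:.1f}x")
```

Because the script carries full precision for $f$, its total can differ from the hand value in the last digit, since the hand calculation rounds $f$ to 4.78 before multiplying.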

---

## RESULT 7.2 — Transistor Sizes in Driver for Optimal Delay

### Sizing Calculations

To achieve equal stage effort  $f = 4.78$ , the inverters are sized in a geometric progression, working backward from the load:

- **Load Inverter:**  $X = 109$
- **Driver 2:**  $X/f = 109/4.78 = 22.8$
- **Driver 1:**  $X/f^2 = 109/22.8 = 4.78$
- **Q → SOUT:** 1.0 (existing, minimum size)

### Size Calc. (RESULT 7.2)

| Stage              | W_NMOS (nm)               | W_PMOS (nm)               |
|--------------------|---------------------------|---------------------------|
| stage 1 (Driver 1) | $120 \times 4.78 = 574$   | $240 \times 4.78 = 1,148$ |
| stage 2 (Driver 2) | $120 \times 22.8 = 2,736$ | $240 \times 22.8 = 5,472$ |
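The sizing table can be regenerated programmatically. The sketch below assumes the minimum widths used in this report (120 nm NMOS, 240 nm PMOS) and a unit-sized first stage; `driver_sizes` is a hypothetical helper name, not something from the lab handout.

```python
W_N_MIN_NM = 120  # minimum NMOS width used in this report
W_P_MIN_NM = 240  # minimum PMOS width (beta = 2)


def driver_sizes(load_size: float, n_stages: int) -> list[float]:
    """Relative sizes of stages 2..N when stage 1 is unit-sized and effort is equal."""
    f = load_size ** (1.0 / n_stages)
    return [f ** k for k in range(1, n_stages)]


for i, size in enumerate(driver_sizes(109, 3), start=1):
    print(f"Driver {i}: {size:.2f}x  "
          f"W_N = {size * W_N_MIN_NM:.0f} nm  W_P = {size * W_P_MIN_NM:.0f} nm")
```

The widths printed here use the unrounded stage effort, so they can differ by a few nanometres from the table, which multiplies by the rounded values 4.78 and 22.8.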

### Design Notes:

- All transistors use  $L = 45$  nm
- P/N ratio = 2 ( $\beta = 2$ )
- Load inverter:  $W_N = 13,080$  nm,  $W_P = 26,160$  nm

## RESULT 7.3 — Measured Delays from SPICE (in ps and $\tau$ units)

### SPICE Measurements

The driver circuit was simulated using the layout-extracted netlist (`step4_jayakishan.sp`), which includes parasitics.

### Measured Delays:

1. Real path ( $Q \rightarrow S_3$ ): 178.14 ps, 178.52 ps, 178.29 ps → **Avg = 178.32 ps**
2. Ideal driver ( $SOUT \rightarrow S_3$ ): 207.32 ps, 206.94 ps → **Avg = 207.13 ps**
3. Stand-alone chain: **210.78 ps**

### Conversion to $\tau$ Units

Using  $\tau = 4.192$  ps from Result 7.4:
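The conversion is a single division. The short Python sketch below applies it to the three $Q \rightarrow S_3$ measurements listed above and compares the result with the calculated 17.34 $\tau$; the variable and function names are illustrative only.

```python
TAU_PS = 4.192          # tau from Result 7.4, in ps
CALCULATED_TAU = 17.34  # path-effort prediction from Result 7.1


def ps_to_tau(delay_ps: float, tau_ps: float = TAU_PS) -> float:
    """Express a delay measured in picoseconds in units of tau."""
    return delay_ps / tau_ps


real_path_runs_ps = [178.14, 178.52, 178.29]  # Q -> S3 measurements
avg_ps = sum(real_path_runs_ps) / len(real_path_runs_ps)
measured_tau = ps_to_tau(avg_ps)
ratio = measured_tau / CALCULATED_TAU

print(f"average = {avg_ps:.2f} ps = {measured_tau:.2f} tau")
print(f"measured / calculated = {ratio:.2f}")
```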

## Delays (RESULT 7.1, RESULT 7.3)

|                    | Path                      | Calculated ( $\tau$ ) | SPICE (ps) | SPICE ( $\tau$ ) |
|--------------------|---------------------------|-----------------------|------------|------------------|
| no driver          | $Q \rightarrow S_3$       | 110                   | —          | —                |
| optimal            | $Q \rightarrow S_3$       | 17.34                 | 178.32     | 42.53            |
| sized stage delays | $Q \rightarrow S_{OUT}$   | 5.78                  | —          | —                |
|                    | $S_{OUT} \rightarrow S_1$ | 5.78                  | —          | —                |
|                    | $S_1 \rightarrow S_2$     | 5.78                  | —          | —                |
|                    | $S_2 \rightarrow S_3$     | 5.78                  | —          | —                |

## Discussion

The measured delay (42.53  $\tau$ ) is 2.45× the calculated delay (17.34  $\tau$ ).

Primary causes of discrepancy:

1. **Layout Parasitics (Main Factor):** Wire, diffusion, and Miller capacitances in the extracted netlist add an estimated 50-150% on top of the ideal gate capacitances assumed in path effort theory.
2. **Interconnect Resistance:** M1/M2 metal routing resistance creates RC delays that path effort analysis does not model.
3. **Input Slew Effects:** With 30 ps rise/fall times, each gate switches on a finite-slope input rather than an ideal step, increasing delay by an estimated 20-40%.
4. **Non-linear Device Behavior:** Velocity saturation, DIBL, and non-linear C-V characteristics at 45 nm cause deviations from ideal FET models.

Despite the 2.45× factor, the driver still achieves a significant improvement:

- Measured 42.53  $\tau$  vs calculated 110  $\tau$  for the direct connection: **38.7% of the direct-connection delay**
- Path effort methodology correctly predicts **relative performance trends**

**Conclusion:** Path effort provides first-order sizing estimates; SPICE with extracted parasitics gives realistic delays. Both are essential for accurate design.

---

## RESULT 7.4 — Ring Oscillator Measurement of $\tau$

### Methodology

A **5-stage ring oscillator** with minimum-sized inverters was simulated to extract  $\tau$ :

- All inverters:  $W_N = 120$  nm,  $W_P = 240$  nm,  $L = 45$  nm
- $V_{DD} = 1.1$  V,  $T = 25^\circ\text{C}$
- Initial conditions: alternating 0 / $V_{DD}$  to ensure startup

## HSPICE Measurement Commands

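```spice
* Transient analysis: 1 ps step, 30 ns stop time
.tran 1p 30n
* Alternating initial conditions so the ring starts oscillating
.ic V(n1)=0 V(n2)='VDD' V(n3)=0 V(n4)='VDD' V(n5)=0

* Two consecutive periods at node n1, measured between VDD/2 rising crossings
.meas tran per1 TRIG v(n1) VAL='VDD/2' RISE=10 TARG v(n1) VAL='VDD/2' RISE=11
.meas tran per2 TRIG v(n1) VAL='VDD/2' RISE=11 TARG v(n1) VAL='VDD/2' RISE=12
* Average period, then tau = period / (2 * 5 stages)
.meas tran tperiod PARAM='(per1 + per2)/2'
.meas tran tau PARAM='tperiod/10'
```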

## Measured Results

| Measurement              | Value (ps)   |
|--------------------------|--------------|
| Period 1 (per1)          | 41.89        |
| Period 2 (per2)          | 41.90        |
| Average Period (tperiod) | <b>41.92</b> |
| $\tau$                   | <b>4.192</b> |

## Calculation

For a 5-stage ring oscillator:

- Each stage contributes delay twice per period (rising + falling)
- **Period =  $2 \times N \times \tau = 10\tau$**

Therefore:  $\tau = \text{tperiod} / 10 = 41.92 / 10 = 4.192$  ps
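The same relation can be written as a small helper for post-processing the `.meas` output. This is only a sketch: the function name is hypothetical, and it follows this report's convention that the per-stage delay of the ring is taken as $\tau$.

```python
def tau_from_ring_period(period_ps: float, n_stages: int) -> float:
    """Per-stage delay from a ring oscillator period, using T = 2 * N * t_stage."""
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of stages (>= 3)")
    return period_ps / (2 * n_stages)


tau_ps = tau_from_ring_period(41.92, 5)  # tperiod reported by HSPICE, 5 stages
print(f"tau = {tau_ps:.3f} ps")
```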

**Physical meaning:**  $\tau = 4.192$  ps is the average propagation delay of a minimum-sized inverter driving an identical inverter (FO1) in 45nm GPDK technology at  $V_{DD} = 1.1$  V,  $25^\circ\text{C}$ .

**Validation:** This value is consistent with published 45nm technology data (typical range: 4-5 ps).

## RESULT 7.5 — Analysis of 2-Stage vs 4-Stage Driver Crossover

### Problem Statement

Find the load size X where a 2-stage driver (3 total stages) and 4-stage driver (5 total stages) have equal optimal delay.

## Delay Equations

**2-stage driver (N=3):**  $D_3 = 3(1 + X^{(1/3)}) \tau$

**4-stage driver (N=5):**  $D_5 = 5(1 + X^{(1/5)}) \tau$

## Solving for Crossover

Set  $D_3 = D_5$ :

$$3(1 + X^{(1/3)}) = 5(1 + X^{(1/5)})$$

$$3 + 3X^{(1/3)} = 5 + 5X^{(1/5)}$$

$$3X^{(1/3)} - 5X^{(1/5)} = 2$$

Let  $u = X^{(1/15)}$ , then  $X^{(1/3)} = u^5$  and  $X^{(1/5)} = u^3$ :

$$3u^5 - 5u^3 - 2 = 0$$

**Numerical solution:**  $u \approx 1.3848$

**Therefore:**  $X_{\text{critical}} = (1.3848)^{15} \approx 132$

## Verification at $X = 132$

**3-stage:**  $D_3 = 3(1 + 5.092) = 18.28 \tau$

**5-stage:**  $D_5 = 5(1 + 2.655) = 18.28 \tau$

✓ Equal to within rounding

## Application to $X = 109$ (Our Design)

**2-stage driver (our design):**

- $f = \sqrt[3]{109} = 4.78$
- $D = 3(1 + 4.78) = 17.34 \tau \checkmark$

**4-stage driver (alternative):**

- $f = \sqrt[5]{109} = 2.556$
- $D = 5(1 + 2.556) = 17.78 \tau$

**Result:** 2-stage is about 2.5% faster (0.44  $\tau$  improvement)

## Conclusion

### Crossover load: $X \approx 132$

- $X < 132$ : Use 2-stage driver
- $X > 132$ : Use 4-stage driver

**For  $X = 109$ :** Our 2-stage driver is the better choice because:

1. Marginally faster ( $17.34 \tau$  vs  $17.78 \tau$ )
2. About 60% less total driver width (roughly 27.6 vs 68.4 minimum-inverter units)
3. Lower power and simpler layout

| Comparison (total stages $N$ )                 | Crossover Load              |
|------------------------------------------------|-----------------------------|
| 1 vs 2                                         | $X \approx 5.8$             |
| 2 vs 3                                         | $X \approx 22.3$            |
| 3 vs 4                                         | $X \approx 82.2$            |
| <b>3 vs 5 (2-stage vs 4-stage driver)</b>      | $\mathbf{X \approx 132}$    |

**Design guideline:** The optimal number of total stages is approximately  $\log_4 X = \ln(109)/\ln 4 \approx 3.4$ , which is closer to  $N = 3$  than to  $N = 5$ , consistent with our 2-stage driver choice.
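The crossover loads in this section can be cross-checked numerically rather than by hand. The sketch below uses plain bisection on the difference of the two optimal-delay expressions, assuming the same inverter-only model ( $g = 1$ , $p = 1$ ); the function names and the search bracket are choices made for this illustration.

```python
def optimal_delay_tau(load: float, n_stages: int, p_inv: float = 1.0) -> float:
    """Optimal delay N * (p_inv + X^(1/N)) in units of tau for an inverter chain."""
    return n_stages * (p_inv + load ** (1.0 / n_stages))


def crossover_load(n_a: int, n_b: int, lo: float = 1.5, hi: float = 1e4,
                   tol: float = 1e-6) -> float:
    """Load X at which chains with n_a and n_b total stages have equal delay."""
    def diff(x: float) -> float:
        return optimal_delay_tau(x, n_a) - optimal_delay_tau(x, n_b)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("no sign change in the search bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if diff(lo) * diff(mid) <= 0:
            hi = mid  # root lies in the lower half
        else:
            lo = mid  # root lies in the upper half
    return 0.5 * (lo + hi)


for n_a, n_b in [(1, 2), (2, 3), (3, 4), (3, 5)]:
    print(f"N = {n_a} vs N = {n_b}: X = {crossover_load(n_a, n_b):.1f}")

print(f"X = 109: D3 = {optimal_delay_tau(109, 3):.2f} tau, "
      f"D5 = {optimal_delay_tau(109, 5):.2f} tau")
```

Bisection is used here because the delay difference increases monotonically for $X > 1$, so a single sign change in the bracket pins down the crossover without needing the polynomial substitution.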

---

## Summary

| Result                | Key Value                                                         |
|-----------------------|-------------------------------------------------------------------|
| 7.1 Theoretical delay | $17.34 \tau$ (vs $110 \tau$ direct)                               |
| 7.2 Driver sizes      | Stage 1: 574/1148 nm, Stage 2: 2736/5472 nm                       |
| 7.3 Measured delay    | $178.32 \text{ ps} = 42.53 \tau$ (2.45× theory due to parasitics) |
| 7.4 Extracted $\tau$  | 4.192 ps                                                          |
| 7.5 Crossover load    | $X \approx 132$ (2-stage optimal for $X=109$ )                    |