

# EE3230 VLSI Design (2023 Spring) HW #4

110061217 王彥智

1. Please design a master-slave flip-flop with the following schematics.

- $V_{DD} = 1.8V$ , and the input clock CLK runs at 50MHz.
- There are ONLY 2 inputs to this module, D and CLK.
- In your simulations, EACH INPUT sees two unit inverters in series (for proper slope shaping) with the following specified size:  $(W/L)_N=0.5\mu/0.18\mu$  and  $(W/L)_P=1.5\mu/0.18\mu$ .
- You'll need to generate the signal CLKB so that the circuit can work properly.
- The output (Q) drives a capacitor load of  $50fF$ .
- The rise and fall times of input signals are 0.2ns.
- You are allowed to insert inverters wherever you like to improve performance. In this case, please provide an updated schematic and explain your design considerations. However, remember to keep the polarity of Q correct. (In other words, Q should follow the polarity of D.)
- You can decide all the transistor sizes by yourself except the two unit inverters that inputs see.
- TA will provide a testbench file for your convenience later.



- A. (15%) Please characterize the flip-flop's setup time, hold time, and propagation delays, for both rising and falling input transitions. Also, with an input signal D that transitions once every clock cycle, please measure the power consumption.

- The measurement accuracy should be better than 1ps. That is to say, when sweeping the relative delay between D and clk, change it with a step smaller than 1ps, so that you can clearly see how  $t_{C2Q}$  increases as you decrease  $t_{D2C}$ .
- In the report, please provide the timing waveforms of D, clk, and Q for all the characteristics you measure. The following shows one example found on the Internet of setup time for rising input. Please put all delay cases into ONE figure.

- Please also plot  $t_{D2Q}$  vs.  $t_{D2C}$  for all the characteristics that you measure. Label the curve and show how you measure the setup time like the following example that I found on Internet.





timing waveforms of D, clk, and Q



$t_{D2Q}$  vs.  $t_{D2C}$

```
tb_hw4_1_presim.mt0
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****eee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
  1.764e-05    25.0000      1
```

### Power consumption
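The power figure can also be read from the `.mt0` listing programmatically. The following is a minimal Python sketch, assuming a measure file with exactly one header row and one value row (no line wrapping); `parse_mt0` and `MT0_TEXT` are hypothetical names used only for illustration and are not part of HSPICE.

```python
# Sketch: extract the measured supply power from an HSPICE .mt0 listing.
# Assumption: one header row and one value row, as in a single-measure run.

MT0_TEXT = """\
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****eee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
  1.764e-05    25.0000      1
"""


def parse_mt0(text):
    """Return {measure_name: value} for a single-row .mt0 listing."""
    rows = [
        line.split()
        for line in text.splitlines()
        # Skip blank lines and the '$DATA1' / '.TITLE' header lines.
        if line.strip() and not line.lstrip().startswith(("$", "."))
    ]
    names, values = rows[0], rows[1]
    return {name: float(value) for name, value in zip(names, values)}


measures = parse_mt0(MT0_TEXT)
power_uw = measures["pvdd"] * 1e6  # watts -> microwatts
print(f"average power = {power_uw:.2f} uW")
```

The value `pvdd` is in watts, so multiplying by $10^6$ gives the microwatt figure quoted in the tables.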

How to measure setup time and hold time?



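Hold time: the testbench sweeps $t_{hold}$ from $-90$ ps to $30$ ps, but the x-axis of the plot runs from $0$ ps to $120$ ps, so the real hold time is the axis reading minus 90 ps:

$$\text{rising} : 88.4 \text{ ps} - 90 \text{ ps} = -1.6 \text{ ps}$$

$$\text{falling} : 49.4 \text{ ps} - 90 \text{ ps} = -40.6 \text{ ps}$$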



$$\text{min } t_{CK2Q} \text{ rising} : 5386.81 \text{ ps} - 5100 \text{ ps} = 286.81 \text{ ps}$$

$$\text{min } t_{CK2Q} \text{ falling} : 5409.04 \text{ ps} - 5100 \text{ ps} = 309.04 \text{ ps}$$

\* The same method is used to measure $T_{su}$, $T_h$, minimum $t_{D2Q}$, and minimum $t_{CK2Q}$ in the part 1 post-layout simulation and in the unit-size flip-flop simulation.
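The two corrections used in this measurement are plain offsets. The sketch below, in Python, assumes that the plot axis starts at 0 while the swept hold parameter starts at $-90$ ps, and that the reference clock edge is at 5100 ps; the function names are illustrative only.

```python
# Sketch: convert plot readings into hold time and clock-to-Q delay.
SWEEP_START_PS = -90.0  # first swept value of the hold parameter (assumed)
CLK_EDGE_PS = 5100.0    # time of the reference clock edge (assumed)


def axis_to_hold_time(axis_reading_ps, sweep_start_ps=SWEEP_START_PS):
    """Shift an x-axis reading (starting at 0) back to the swept value."""
    return axis_reading_ps + sweep_start_ps


def clk_to_q(q_cross_ps, clk_edge_ps=CLK_EDGE_PS):
    """Clock-to-Q delay from the absolute time at which Q crosses."""
    return q_cross_ps - clk_edge_ps


print("hold, rising  [ps]:", round(axis_to_hold_time(88.4), 1))
print("hold, falling [ps]:", round(axis_to_hold_time(49.4), 1))
print("tCK2Q, rising  [ps]:", round(clk_to_q(5386.81), 2))
print("tCK2Q, falling [ps]:", round(clk_to_q(5409.04), 2))
```

A negative hold time means D may change before the clock edge without corrupting the captured value.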

- B. (15%) Explain what you have done to improve the performance (i.e., to speed up the operation and/or to reduce the power consumption). If you ever modify the schematics, provide the updated version in your report.

Speed up the operation by changing the size of last inverter:



Critical path: $D \rightarrow \text{inverter} \rightarrow \text{inverter} \rightarrow Q$, with $C_{in} = 3.7803 \text{ fF}$ and $C_{out} = 50 \text{ fF}$.

Measuring $C_{in}$ of the unit inverter from the `.op` results:

$$C_{in} = 2.8502 \text{ fF} + 0.9301 \text{ fF} = 3.7803 \text{ fF}$$

$$\therefore F = GBH = 1 \times 1 \times \frac{50 \text{ fF}}{3.7803 \text{ fF}} = 13.2264$$

With $N = 2$ stages, $\hat{f} = \sqrt{F} = 3.636 \rightarrow$ choose last inverter size: 4

$\therefore D \rightarrow \text{inverter } (1\times) \rightarrow \text{inverter } (4\times) \rightarrow Q$, with $C_{in} = 3.7803 \text{ fF}$ and $C_{out} = 50 \text{ fF}$.
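The sizing step can be checked numerically. The following Python sketch assumes a chain of plain inverters (logical effort $g = 1$, branching effort $b = 1$) and rounds the last stage to the nearest integer size; `size_inverter_chain` is a hypothetical helper, not something taken from the testbench.

```python
# Sketch: logical-effort sizing of an inverter chain with g = 1 and b = 1.
def size_inverter_chain(c_in_ff, c_load_ff, n_stages):
    """Return (path_effort, stage_effort, relative_sizes)."""
    path_effort = c_load_ff / c_in_ff            # F = G*B*H with G = B = 1
    stage_effort = path_effort ** (1.0 / n_stages)
    # Each stage is stage_effort times larger than the one before it.
    sizes = [stage_effort ** k for k in range(n_stages)]
    return path_effort, stage_effort, sizes


F, f_hat, sizes = size_inverter_chain(c_in_ff=3.7803, c_load_ff=50.0, n_stages=2)
last_size = round(sizes[-1])  # nearest integer multiple of the unit inverter
print(f"F = {F:.4f}, stage effort = {f_hat:.3f}, last inverter size = {last_size}")
```

For two stages the ideal size of the last inverter equals the stage effort itself, which is why a stage effort of about 3.6 leads to a size of 4.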

Compare with a unit size flip-flop:





```
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****eee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
  1.555e-05    25.0000      1
```

#### Power consumption

|                               | Unit size flip-flop simulation |          |
|-------------------------------|--------------------------------|----------|
|                               | Rising                         | Falling  |
| $T_{su}$                      | 89.3ps                         | 88.3ps   |
| $T_h$                         | -4.1ps                         | -46.8ps  |
| Minimum $t_{D2Q}$             | 434.5ps                        | 470.6ps  |
| Minimum $t_{CK2Q}$            | 329.1ps                        | 368.53ps |
| Power consumption ( $\mu W$ ) | 15.55 $\mu W$                  |          |

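The minimum $t_{D2Q}$ and minimum $t_{CK2Q}$ of the flip-flop I designed (see the pre-layout columns of the table in part D) are smaller than those of the unit-size flip-flop for both rising and falling inputs (for example, a rising minimum $t_{D2Q}$ of 390.6 ps versus 434.5 ps), which means the circuit is faster. The cost is a higher power consumption, 17.64 $\mu W$ versus 15.55 $\mu W$.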
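To put numbers on the comparison, the Python sketch below computes the relative reduction of each delay, using the unit-size values from the table above and the pre-layout values of the sized design from the part D table. The dictionary keys and the helper name are illustrative only.

```python
# Sketch: relative delay reduction of the sized flip-flop vs. the unit-size one.
unit = {"tD2Q_rise": 434.5, "tD2Q_fall": 470.6,
        "tCK2Q_rise": 329.1, "tCK2Q_fall": 368.53}    # ps, unit-size design
sized = {"tD2Q_rise": 390.6, "tD2Q_fall": 414.1,
         "tCK2Q_rise": 286.81, "tCK2Q_fall": 309.04}  # ps, sized design (pre-layout)


def percent_reduction(before, after):
    """Positive result means `after` is smaller (faster) than `before`."""
    return 100.0 * (before - after) / before


reductions = {key: percent_reduction(unit[key], sized[key]) for key in unit}
power_increase = 100.0 * (17.64 - 15.55) / 15.55  # uW, sized vs. unit-size

for key, value in reductions.items():
    print(f"{key}: {value:.1f} % faster")
print(f"power: {power_increase:.1f} % higher")
```

All four delays improve by roughly 10 to 16 %, at the cost of about 13 % more power.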

- C. (15%) Complete the layout. Snapshot the screens that show DRC and LVS clean. Show a snapshot of the layout (with rulers that show x and y dimensions) in your report. Report the area. Furthermore, explain your layout considerations.



Layout



DRC result



LVS result

Layout considerations: the last inverter, which has size 4, is drawn with multiple fingers.



- D. (5%) Run post-layout simulation (R-C-CC extraction) and measure the power consumption, setup time, hold time, and propagation delays, for both rising and falling input transitions again. Complete the following table and show it in your report.



timing waveforms of D, clk, and Q



$t_{D2Q}$  vs.  $t_{D2C}$

```
tb_hw4_1_postsim.mt0
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****eee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
  2.129e-05    25.0000      1
```

#### Power consumption

|                                     | Pre-layout simulation                      |          | Post-layout simulation |          |
|-------------------------------------|--------------------------------------------|----------|------------------------|----------|
|                                     | Rising                                     | Falling  | Rising                 | Falling  |
| $T_{su}$                            | 87.3ps                                     | 92.6ps   | 82.8ps                 | 95.9ps   |
| $T_h$                               | -1.6ps                                     | -40.6ps  | 14.5ps                 | -30ps    |
| Minimum $t_{D2Q}$                   | 390.6ps                                    | 414.1ps  | 461.4ps                | 479.6ps  |
| Minimum $t_{CK2Q}$                  | 286.81ps                                   | 309.04ps | 361ps                  | 381.21ps |
| Power consumption ( $\mu\text{W}$ ) | 17.64 $\mu\text{W}$                        |          | 21.29 $\mu\text{W}$    |          |
| Layout area ( $\mu\text{m}^2$ )     | $10.195 * 21.96 = 223.8822 \mu\text{m}^2$  |          |                        |          |

2. With the following master-slave flip-flop, repeat the characterization in the previous question (Q1A to Q1D). Explain what and why the differences are in detail.



- A. (15%) Please characterize the flip-flop's setup time, hold time, and propagation delays, for both rising and falling input transitions. Also, with an input signal D that transitions once every clock cycle, please measure the power consumption.
  - The measurement accuracy should be better than 1ps. That is to say, when sweeping the relative delay between D and clk, change it with a step smaller than 1ps, so that you can clearly see how $t_{C2Q}$ increases as you decrease $t_{D2C}$.
  - In the report, please provide the timing waveforms of D, clk, and Q for all the characteristics you measure. The following shows one example found on the Internet of setup time for rising input. Please put all delay cases into ONE figure.
  - Please also plot $t_{D2Q}$ vs. $t_{D2C}$ for all the characteristics that you measure. Label the curve and show how you measure the setup time like the following example that I found on Internet.



timing waveforms of D, clk, and Q



$t_{D2Q}$  vs.  $t_{D2C}$

```
tb_hw4_2_presim.mt0
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****ee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
  2.869e-05    25.0000      1
```

### Power consumption

How to measure setup time and hold time?



Hold time: the testbench sweeps $t_{hold}$ from $-170$ ps to $30$ ps (in steps of 0.1 ps), but the x-axis of the plot runs from $0$ ps to $200$ ps, so the real hold time is the axis reading minus 170 ps:

$$\text{rising: } 56.1\text{ ps} - 170\text{ ps} = -113.9\text{ ps}$$

$$\text{falling: } 113.3\text{ ps} - 170\text{ ps} = -56.6\text{ ps}$$



min $t_{CK2Q}$ rising: 5474.63 ps - 5100 ps = 374.63 ps  
min $t_{CK2Q}$ falling: 5569.48 ps - 5100 ps = 469.48 ps

\* The same method is used to measure $T_{su}$, $T_h$, minimum $t_{D2Q}$, and minimum $t_{CK2Q}$ in the part 2 post-layout simulation and in the unit-size flip-flop simulation.
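The figures that illustrate the setup-time measurement are not reproduced here, so the following Python sketch rests on an assumption: that the setup time is taken as the $t_{D2C}$ at which $t_{D2Q} = t_{D2C} + t_{CK2Q}$ is smallest, which is a common convention. The sweep data is synthetic and hypothetical (not simulation output from this report), and `setup_time_from_sweep` is an illustrative name.

```python
# Sketch: pick the setup time as the t_D2C that minimizes t_D2Q.
def setup_time_from_sweep(points):
    """points: iterable of (t_d2c_ps, t_ck2q_ps). Returns (setup_ps, min_d2q_ps)."""
    best_d2c, best_d2q = None, float("inf")
    for t_d2c, t_ck2q in points:
        t_d2q = t_d2c + t_ck2q  # D-to-Q delay for this sweep point
        if t_d2q < best_d2q:
            best_d2c, best_d2q = t_d2c, t_d2q
    return best_d2c, best_d2q


# Hypothetical sweep (NOT report data): t_CK2Q grows as D moves closer to the
# clock edge. Step is 0.5 ps, i.e. finer than the required 1 ps.
sweep = []
for i in range(271):
    t_d2c = 65.0 + 0.5 * i
    t_ck2q = 300.0 + 400.0 / (t_d2c - 60.0)
    sweep.append((t_d2c, t_ck2q))

setup_ps, min_d2q_ps = setup_time_from_sweep(sweep)
print(f"setup time = {setup_ps} ps, minimum tD2Q = {min_d2q_ps} ps")
```

With real data, the `(t_d2c, t_ck2q)` pairs would come from one `.alter` or `.sweep` point each in the simulation results.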

- B. (5%) Explain what you have done to improve the performance (i.e., to speed up the operation and/or to reduce the power consumption). If you ever modify the schematics, provide the updated version in your report.

Speed up the operation by changing the size of the last two NAND gates:



Measuring $C_{in}$ of the unit inverter from the `.op` results:

$$C_{in} = 3.3054 \text{ fF} + \frac{0.9229 + 1.0452}{2} \text{ fF} \approx 4.2895 \text{ fF}$$

Considering two paths:

$$1. \text{ Blue path (2 stages)} : \hat{f} = \sqrt[2]{F} = \sqrt[2]{\frac{50}{4.2895}} = 3.4141$$

$$2. \text{ Green path (3 stages)} : \hat{f} = \sqrt[3]{F} = \sqrt[3]{\frac{50}{4.2895}} = 2.2673$$

$\therefore$ On average, I think $h=3$ improves the performance the most. $\rightarrow$ choose $h=3$
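The averaging step is not written out, so as a worked check, assuming "on average" means the arithmetic mean of the two stage efforts computed for the two paths:

$$\bar{f} = \frac{3.4141 + 2.2673}{2} \approx 2.84 \;\Rightarrow\; h = 3 \text{ after rounding to the nearest integer}$$

A single value is needed because both paths share the last two NAND gates, so one sizing has to serve the two-stage and the three-stage path at once.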

Compare with unit size flip-flop:



timing waveforms of D, clk, and Q



$t_{D2Q}$  vs.  $t_{D2C}$

```
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****eee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
 2.354e-05    25.0000      1
```

#### Power consumption

|                               | Unit size flip-flop simulation |          |
|-------------------------------|--------------------------------|----------|
|                               | Rising                         | Falling  |
| $T_{su}$                      | 80.4ps                         | 127.9ps  |
| $T_h$                         | -114.13ps                      | -57.1ps  |
| Minimum $t_{D2Q}$             | 505.5ps                        | 707.6ps  |
| Minimum $t_{CK2Q}$            | 412.23ps                       | 566.43ps |
| Power consumption ( $\mu W$ ) | 23.54 $\mu W$                  |          |

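The minimum $t_{D2Q}$ and minimum $t_{CK2Q}$ of the flip-flop I designed (see the pre-layout columns of the table in part D) are smaller than those of the unit-size flip-flop for both rising and falling inputs (for example, a rising minimum $t_{D2Q}$ of 470.1 ps versus 505.5 ps), which means the circuit is faster. The cost is a higher power consumption, 28.69 $\mu W$ versus 23.54 $\mu W$.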

- C. (10%) Complete the layout. Snapshot the screens that show DRC and LVS clean. Show a snapshot of the layout (with rulers that show x and y dimensions) in your report. Report the area. Furthermore, explain your layout considerations.



Layout



DRC result

LVS result (Calibre RVE, layout cell flip\_flop\_b vs. source cell MS\_DFF): summary clean, with the warning "Ambiguity points were found and resolved arbitrarily." The initial object counts match between layout and source: 6 ports, 31 nets, and 62 instances (31 MN and 31 MP).

Layout considerations: Minimize the size of last two NAND gates.



- D. (10%) Run post-layout simulation (R-C-CC extraction) and measure the power consumption, setup time, hold time, and propagation delays, for both rising and falling input transitions again. Complete the following table and show it in your report.



timing waveforms of D, clk, and Q



$t_{D2Q}$  vs.  $t_{D2C}$

```
tb_hw4_2_postsim.mt0
$DATA1 SOURCE='PrimeSim HSPICE' VERSION='R-2020.12-SP2 linux64' PARAM_COUNT=0
.TITLE '*****ee3230 vlsi hw4 testbench*****'
pvdd      temper      alter#
  3.506e-05
```

#### Power consumption

|                                     | Pre-layout simulation                    |          | Post-layout simulation |         |
|-------------------------------------|------------------------------------------|----------|------------------------|---------|
|                                     | Rising                                   | Falling  | Rising                 | Falling |
| $T_{su}$                            | 81.8ps                                   | 129.4ps  | 107.4ps                | 164.7ps |
| $T_h$                               | -113.9ps                                 | -56.60ps | -141.1ps               | -85.5ps |
| Minimum $t_{D2Q}$                   | 470.1ps                                  | 611.1ps  | 598.2ps                | 765.6ps |
| Minimum $t_{CK2Q}$                  | 374.63ps                                 | 469.48ps | 477.8ps                | 586.5ps |
| Power consumption ( $\mu\text{W}$ ) | 28.69 $\mu\text{W}$                      |          | 35.06 $\mu\text{W}$    |         |
| Layout area ( $\mu\text{m}^2$ )     | $10.91 * 34.86 = 380.3226 \mu\text{m}^2$ |          |                        |         |

#### E. (10%) Difference between two types of flip-flops

- Layout area

The type 2 flip-flop uses more NMOS and PMOS transistors in total than the type 1 flip-flop, and both types use the same unit transistor size, so the total area of type 2 (380.32 $\mu\text{m}^2$) is larger than that of type 1 (223.88 $\mu\text{m}^2$).

- Power

In the type 1 flip-flop, each inverter has only one PMOS whose source is connected to VDD. In the type 2 flip-flop, each NAND gate has two PMOS transistors whose sources are connected to VDD, in addition to the one PMOS in each inverter. The number of NAND gates and inverters used in type 2 is also larger than the number of inverters used in type 1. The power consumption of type 2 is therefore larger than that of type 1 (28.69 $\mu\text{W}$ vs. 17.64 $\mu\text{W}$ pre-layout, and 35.06 $\mu\text{W}$ vs. 21.29 $\mu\text{W}$ post-layout).

- Timing

- Minimum  $t_{CK2Q}$

In type 1, the CLK signal only needs to pass through one inverter, whereas in type 2 it needs to pass through two or three NAND gates. The minimum $t_{CK2Q}$ of type 2 is therefore larger than that of type 1 (374.63 ps vs. 286.81 ps rising and 469.48 ps vs. 309.04 ps falling, pre-layout).



Type1

Type2

- Minimum  $t_{D2Q}$

As with the minimum $t_{CK2Q}$, the D signal passes through more stages in type 2, so the minimum $t_{D2Q}$ of type 2 is larger than that of type 1 (470.1 ps vs. 390.6 ps rising and 611.1 ps vs. 414.1 ps falling, pre-layout).
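The differences listed in this part can be summarized as type 2 to type 1 ratios. The Python sketch below uses only values from the two result tables (pre-layout delays, pre- and post-layout power, and layout area); the dictionary keys are illustrative names.

```python
# Sketch: type 2 / type 1 ratios from the tabulated results.
type1 = {
    "area_um2": 10.195 * 21.96,
    "power_pre_uW": 17.64, "power_post_uW": 21.29,
    "tCK2Q_rise_ps": 286.81, "tCK2Q_fall_ps": 309.04,
    "tD2Q_rise_ps": 390.6, "tD2Q_fall_ps": 414.1,
}
type2 = {
    "area_um2": 10.91 * 34.86,
    "power_pre_uW": 28.69, "power_post_uW": 35.06,
    "tCK2Q_rise_ps": 374.63, "tCK2Q_fall_ps": 469.48,
    "tD2Q_rise_ps": 470.1, "tD2Q_fall_ps": 611.1,
}

# A ratio above 1 means type 2 is larger, slower, or more power hungry.
ratios = {key: type2[key] / type1[key] for key in type1}

for key, value in ratios.items():
    print(f"{key}: type2 / type1 = {value:.2f}")
```

Every ratio is above 1: type 2 takes about 1.7 times the area and about 1.6 times the power of type 1, and its delays are roughly 1.2 to 1.5 times longer.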