

# 時序電路設計及應用

Homework 1

姓名：王品然

學號：113063572

Generate a linear function  $y=1000-30*x$ , where  $x$  is the value of an 4-bit digital code  $x[3:0]$ . As a result,  $y$  is a value in the range of [550, 1000]. Try to develop a circuit that can find a proper value of  $x$  so that the value of  $y$  is closest to a given target value,  $target$ . You are advised to use 10 bits for variable  $y$ , i.e.,  $y[9:0]$ .

- (a) Develop a software program in any software language (e.g., C/C++ or Python) that can solve the above problem, using the **successive approximation scheme** discussed in class. Verify your program by trying target values of 630 and 780. Report the proper values of  $x[7:0]$  for each case.

```
[71]: #y_t stands for target y
y_t = 630

def y(x):
    return 1000 - 30 * x

initial_x = 0b1000
x = initial_x
i = 0
x_pin = 0

while i < 3:
    if y(x) > y_t:
        x_pin = x
        x = x | x >> 1
    else:
        x = (x_pin + x) // 2
    i = i + 1

# 輸出結果
print(f"approximation y: {y(x)}")
print(f"approximation x: {x}")

approximation y: 610
approximation x: 13
```

```
[73]: #y_t stands for target y
y_t = 780

def y(x):
    return 1000 - 30 * x

initial_x = 0b1000
x = initial_x
i = 0
x_pin = 0

while i < 3:
    if y(x) > y_t:
        x_pin = x
        x = x | x >> 1
    else:
        x = (x_pin + x) // 2
    i = i + 1

# 輸出結果
print(f"approximation y: {y(x)}")
print(f"approximation x: {x}")

approximation y: 790
approximation x: 7
```

- (b) Convert your software subroutine into a synthesizable RTL code (in Verilog or VHDL). Verify your RTL code with a testbench. Make sure that the results are consistent with those produced by your software program.

### RTL code:

```

1 module sa (
2     output [9:0] y,
3     output reg [3:0] x,
4     output done,
5     input clk,
6     input rst_n,
7     input [9:0] y_t,
8     input start
9 );
10 /////////////////
11 //DECLARATION
12 /////////////////
13 //state
14 localparam ST_IDLE = 2'b00;
15 localparam ST_SA = 2'b01;
16 localparam ST_DONE = 2'b10;
17
18 reg [1:0] state;
19 reg [1:0] state_nx;
20
21 //control signals
22 wire more_less;
23
24 //count index
25 reg [1:0] cnt;           //發生三次SA運算
26 wire [1:0] cnt_next;
27 wire cnt_en;
28 wire cnt_clear;
29 wire cnt_next_full;
30
31 /////////////////
32 //FSM
33 /////////////////
34 //setting next state
35 /////////////////
36 always@(posedge clk or negedge rst_n)begin
37     if(~rst_n) state <= ST_IDLE;
38     else state <= state_nx;
39 end
40
41 //decision of next state
42 always@(*)begin
43     state_nx = state;
44     case(state)
45         ST_IDLE : if(start) state_nx = ST_SA;
46         ST_SA   : if(cnt == 2'b01) state_nx = ST_DONE;
47         ST_DONE : state_nx = ST_IDLE;
48         default: state_nx = state;
49     endcase
50 end
51
52 //output controlled by FSM
53 assign done = (state == ST_DONE);
54
55 //counter
56 assign cnt_en      = (state == ST_SA);
57 assign cnt_clear   = (state == ST_DONE);
58 assign cnt_next    = cnt_en? cnt-2'b1: cnt;
59
60 always@(posedge clk or negedge rst_n)begin
61     if(~rst_n) cnt <= 2'b11;
62     else if(cnt_clear) cnt <= 2'b11;
63     else        cnt <= cnt_next;
64 end
65
66 /////////////////
67 //Successive Approximation
68 /////////////////
69 //x value decision
70 always@(posedge clk or negedge rst_n) begin
71     if(~rst_n) x <= 4'b1000;
72     else begin
73         case(state)
74             ST_IDLE : x <= 4'b1000;
75             ST_SA   :
76                 begin
77                     x[cnt_next] <= 1;
78                     if (~more_less) x[cnt] <= 0;
79                     else x[cnt] <= x[cnt];
80                 end
81             ST_DONE : x <= x;
82             default : x <= x;
83         endcase
84     end
85 end
86
87 //more_less
88 assign more_less = (y > y_t)? 1 : 0;
89
90 //y assignment
91 assign y = 10'd1000 - (5'd30 * x);
92
93
94 endmodule

```

Note: 儘管運算的方式跟 python code 略有不同，但結果會是一樣的。在 python 時我是使用相加除以 2 的方式逼近，而在 RTL code 中我是使用，在逼近時改變其中兩個 bit 以實現同樣的功能，會與 python 不同的原因是因為這樣的寫法可以更加簡潔，並預期有更小的面積與更小的 propagation delay。

## Testbench:

```

1 `timescale 1ns / 100ps
2 module testbench;
3
4     localparam period = 2;
5     localparam delay = 1;
6
7     wire [9:0] y;
8     wire [3:0] x;
9     wire done;
10    reg clk;
11    reg rst_n;
12    reg [9:0] y_t;
13    reg start;
14
15    sa sa (
16        .y(y),
17        .x(x),
18        .done(done),
19        .clk(clk),
20        .rst_n(rst_n),
21        .y_t(y_t),
22        .start(start)
23    );
24
25    initial begin
26        $fsdbDumpfile("../4.Simulation_Result/sa_rtl.fsdb");
27        $fsdbDumpvars;
28    end
29
30    always #(period / 2) clk = ~clk;
31
32    initial begin
33        clk = 1;
34        rst_n = 1;
35        start = 0;
36        y_t = 0;
37        #(period + delay) rst_n = 0;
38        #(period * 2) rst_n = 1;
39        #(period) y_t = 630;
40        #(period) start = 1;
41        #(period) start = 0;
42        @(posedge done);
43
44        @(posedge clk);
45        #(period) y_t = 780;
46        #(period) start = 1;
47        #(period) start = 0;
48        @(negedge done);
49
50        @(posedge clk);
51        #(delay * 3) rst_n = 0;
52        #(period * 8) $finish;
53    end
54
55    // Automatically finish
56    initial begin
57        #200;
58        $finish;
59    end
60
61 endmodule

```

## Waveform (RTL simulation):



當 target (也就是我定義的  $y_t$ )為 630 時，done 拉起時， $x$  有正確輸出為 13，與 python 結果相同， $y$  值也是與之相符的 610。此段結果放大圖如下：



當 target ( $y_t$ )為 780 時，done 拉起時，x 有正確輸出為 7，與 python 結果相同，y 值也是與之相符的 790。此段結果放大圖如下：



- (c) Use a synthesis script to convert your RTL code into a gate-level netlist. Perform gate-level simulation to verify the gate-level netlist again, to ensure that its behavior is identical to your previous RTL code.

使用範例中提供的.tcl 檔，只改其中檔案的路徑以及 clock cycle，輸出 gate-level netlist 如下：

```

8 module sa ( y, x, done, clk, rst_n, y_t, start );
9   output [9:0] y;
10  output [3:0] x;
11  input [9:0] y_t;
12  input clk, rst_n, start;
13  output done;
14  wire [33:34] I46, n19, n22, n30, n32, n330, n340, n35, n36, n37,
15  n42, n43, n44, n45, n460, n470, n49, n50, n51, n52, n53, n54, n55,
16  n56, n57, n58, n59, n60, n61, n62, n63, n64, n65, n66, n67, n68, n69,
17  n70, n71, n72, n73, n74, n75, n76, n77, n78, n79, n80, n81, n82, n832,
18  n84, n85, n86, n87, n88, n89, n90, n91, n92, n93, n94, n95, n96, n97,
19  n98, n99, n100, n101, n102, n103, n104, n105, n106, n107, n108, n109,
20  n111;
21  wire [1:0] state;
22
23  DFFRQ02 state_reg_0 ( .D(n36), .CK(clk), .RN(rst_n), .Q(state[0]) );
24  DFFRQ02 x_reg_2 ( .D(n330), .CK(c1k), .RN(rst_n), .Q(x[2]) );
25  DFFRQ02 x_reg_1 ( .D(n340), .CK(c1k), .RN(rst_n), .Q(x[1]) );
26  DFFS0KL cnt_reg_0 ( .D(n46), .CK(c1k), .SN(rst_n), .Q(n33) );
27  DFFS0KL cnt_reg_1 ( .D(n47), .CK(c1k), .SN(rst_n), .Q(n34) );
28  DFFS0KL x_reg_3 ( .D(n32), .CK(c1k), .SN(rst_n), .Q(x[1]) );
29  DFFRQ02 x_reg_0 ( .D(n32), .CK(c1k), .RN(rst_n), .Q(y[1]) );
30  DFFRQ01 state_reg_1 ( .D(n37), .CK(c1k), .RN(rst_n), .Q(state[1]), .QN(n10)
31  );
32  INV02_U38 ( .A(1'b1), .Y(y[0]) );
33  INV02_U40 ( .A(1'b0), .Y(y[9]) );
34  OR3XL_U42 ( .A(n72), .B(n43), .C(n71), .Y(n42) );
35 | CLKINAND02X_U43 ( .A(n42), .B(n24), .Y(y[5]) );
36  XNOR2X1_U44 ( .A(x[3]), .B(n68), .Y(n71) );
37  INV02_U45 ( .A(n56), .Y(n68) );
38  AOI21BX1_U46 ( .A0(n68), .AI(y[8]), .BO(n107), .Y(n43) );
39  INVX4_U47 ( .A(y[1]), .Y(n107) );
40  OA1BB1X1_U48 ( .A0(n98), .AIH(state[0]), .BO(n110), .Y(n99) );
41  OA132XL_U49 ( .A0(n82), .AI(n81), .A2(n87), .BO(y[7]), .BI(n80), .Y(n92) );
42  OA1BB1X1_U50 ( .A0(n68), .AIH(y[8]), .BO(n107), .Y(n73) );
43  OA1BB1X1_U51 ( .A0(n58), .AIH(y107), .BO(n59), .Y(n69) );
44  XNOR4XL_U52 ( .A(y[1]), .B(x[2]), .C(x[1]), .D(x[3]), .SO(H34), .S1(H33), .Y(
45  n95) );
46  AOI32XL_U53 ( .A0(y_t[8]), .A1(x[3]), .A2(y_t[9]), .BO(y_t[9]), .B1(n93),
47  .V(n94) );
48  OA22XL_U54 ( .A0(n92), .A1(n91), .BO(y_t[8]), .B1(x[3]), .Y(n93) );
49  OA22XL_U55 ( .A0(y_t[2]), .A1(n62), .BO(y_t[1]), .B1(n107), .CO(n65), .Y(
50  n63) );
51  CLKINW2_U56 ( .A(n55), .Y(n57) );
52  CLKINW2_U57 ( .A(n78), .Y(n77) );
53  NAH02BX1_U58 ( .A(x[1]), .B(n43), .Y(n75) );
54  AOI21BX1_U59 ( .A0(x[1]), .A1(n73), .BO(n75), .Y(n45) );
55  XNOR2XL_U60 ( .A(x[1]), .B(y[1]), .Y(n58) );
56  NAH02BL_U61 ( .A(n33), .B(n105), .Y(n106) );
57  OA122XL_U62 ( .B(n22), .B1(x[2]), .AO(n106), .AIH(n104), .Y(n330) );
58  OA122XL_U63 ( .AO(n33), .A1(n104), .BO(n103), .B1(n105), .CO(n102), .C1(
59  n101) );
60  INVXL_U64 ( .A(x[1]), .Y(n101) );
61  XNOR2XL_U65 ( .A(y[1]), .B(n60), .Y(n62) );
62  OR3XL_U66 ( .A(n60), .B(y[1]), .C(n59), .Y(n44) );
63  CLKINAND02X_U67 ( .A(n44), .B(n69), .Y(y[3]) );
64  INVXL_U68 ( .A(n58), .Y(n60) );
65  NAH02BL_U69 ( .A0(y_t[3]), .B(y[3]), .Y(n65) );
66  OA1BB1X1_U70 ( .A0(n76), .AIH(n69), .BO(n43), .Y(n74) );
67  INVX2_U71 ( .A(n69), .Y(n72) );
68  INVX2_U72 ( .A(y[5]), .Y(n84) );
69  XNOR2X1_U73 ( .A(n70), .B(n72), .Y(y[4]) );
70  OR2X2_U74 ( .A(n45), .B(n74), .Y(n79) );
71  OA1BB1X1_U75 ( .A0(y[6]), .AIH(n89), .BO(n83), .Y(n87) );
72  OR2X2_U76 ( .A(n95), .B(n105), .Y(n97) );
73  INVX2_U77 ( .A(n71), .Y(n70) );
74  OA12BB1X1_U78 ( .A0(n45), .AIH(n74), .BO(n78), .Y(y[6]) );
75  INVX2_U79 ( .A(y[3]), .Y(n61) );
76  INVX2_U80 ( .A(n62), .Y(y[2]) );
77  INVX2_U81 ( .A(n54), .Y(done) );
78  INVX2_U82 ( .A(y_t[7]), .Y(n88) );
79  INVX2_U83 ( .A(n85), .Y(n81) );
80  OA12BB1X1_U84 ( .A0(y[4]), .AIH(n67), .BO(n66), .Y(n82) );
81  NAHD2BX2_U85 ( .A(n95), .B(n94), .Y(n105) );
82  NAHD2BX2_U86 ( .A0(y_t[1]), .B(y[7]), .Y(n83) );
83  XNOR2X1_U87 ( .A(x[1]), .B(n76), .Y(n79) );
84  INVX2_U88 ( .A(n75), .Y(n76) );
85  NAHD2BX2_U89 ( .A0(x[2]), .B(n57), .Y(n56) );
86  OA1BB1X1_U90 ( .A0(n108), .AIH(x[3]), .BO(n99), .Y(n32) );
87  NAHD3BX2_U91 ( .A(n96), .B(n33), .C(n110), .Y(n108) );
88  PX2X1_U92 ( .A(n34), .B(n97), .SO(H33), .Y(n98) );
89  NAHD2BX2_U93 ( .A0(x[1]), .B(n107), .Y(n55) );
90  OA132XL_U94 ( .A0(n90), .AI(y[6]), .A2(n89), .BO(n88), .B1(n87), .Y(n91) );
91  OA132XL_U95 ( .A0(y_t[4]), .A1(n86), .A2(n85), .BO(y_t[5]), .B1(n84), .Y(n88) );
92  INVX2_U96 ( .A(n83), .Y(n90) );
93  INVX2_U97 ( .A(y[4]), .Y(n96) );
94  XNOR2X1_U98 ( .A(x[2]), .B(n57), .Y(n59) );
95  OA121XL_U99 ( .A0(H34), .A1(n111), .BO(n110), .Y(n22) );
96  NAHD2BX2_U100 ( .A0(y_t[5]), .B(y[5]), .Y(n85) );
97  OA132XL_U101 ( .A0(n108), .A1(H34), .A2(n19), .BO(n470), .B1(n107), .Y(n35)
98  );
99  INVX2_U102 ( .A(n106), .Y(n108) );
100 OA12BB1X1_U103 ( .A0H(n65), .AIH(n64), .BO(n63), .Y(n66) );
101 AO2B2X2_U104 ( .B0(y_t[3]), .B1(n61), .A0(y_t[2]), .AIH(y[2]), .Y(n64) );
102 OA21XL_U105 ( .A0(n33), .A1(n111), .BO(n470), .Y(n102) );
103 INVX2_U106 ( .A(x[3]), .Y(y[8]) );
104 INVX2_U107 ( .A(n34), .Y(n96) );
105 NAHD2BX2_U108 ( .A0(state[1]), .B(state[0]), .Y(n19) );
106 NAHD2BX2_U109 ( .A0(n19), .B(H34), .Y(n104) );
107 MCX1_U110 ( .A(n52), .B(state[0]), .SO(n53), .Y(n36) );
108 NORX2_U111 ( .A(state[1]), .B(n50), .Y(n52) );
109 INVX2_U112 ( .A(n103), .Y(n58) );
110 INVX2_U113 ( .A(n51), .Y(n53) );
111 OA12B1X2_U114 ( .A0H(n11), .A0(state[1]), .BO(n54), .CO(n19), .Y(n51) );
112 NAHD2BX2_U115 ( .A0(state[1]), .B(state[1]), .Y(n54) );
113 NAHD2BX2_U116 ( .A0(n49), .B(n54), .Y(H47) );
114 XNOR2XL_U117 ( .A(n30), .B(n96), .Y(n99) );
115 NORX2_U118 ( .A(H33), .B(n19), .Y(n38) );
116 NAHD3BX2_U119 ( .A(n111), .B(H33), .C(n96), .Y(n103) );
117 INVX2_U120 ( .A(state[0]), .Y(n111) );
118 OA12BB1X1_U121 ( .A0H(n53), .AIH(state[1]), .BO(n103), .Y(n37) );
119 NAHD2BX2_U122 ( .A(n460), .B(n54), .Y(H46) );
120 XOR2X1_U123 ( .A(H33), .B(n19), .Y(n460) );
121 OA12BB1X2_U124 ( .A0H(H34), .A1(state[0]), .BO(n110), .Y(n470) );
122 INVX2_U125 ( .A(y_t[6]), .Y(n89) );
123 INVX2_U126 ( .A(y_t[4]), .Y(n67) );
124 BUFX2_U127 ( .A(y[1]), .Y(x[0]) );
125 A02B2X4_U128 ( .B0(n79), .B1(n78), .A0(n77), .A1H(n79), .Y(y[7]) );
126 endmodule

```

Testbench 架構與 synthesis 前的相同，只有多了 sdf 的 annotate，模擬的波型圖如下。Waveform (gate level simulation):



當 target ( $y_t$ )為 630 時，done 拉起時， $x$  有正確輸出為 13，與 python 結果相同， $y$  值也在短暫 delay 後，在同個 clock cycle 轉成與之相符的 610。此段結果放大圖如下：



當 target ( $y_t$ )為 780 時，done 拉起時， $x$  有正確輸出為 7，與 python 結果相同， $y$  值也在短暫 delay 後，在同個 clock cycle 轉成與之相符的 790。此段結果放大圖如下：



- (d) Report the final gate count, the maximum operating speed (in MHz), and the estimated power consumption in (mW) of your design using *Design Compiler*.

## 1. 計算 gate count

從 synthesis 完成後的報告可以看到 total cell area 約為  $599.76\mu\text{m}^2$ ，如下表：

|                        |                                    |
|------------------------|------------------------------------|
| Combinational area:    | 452.995210                         |
| Buf/Inv area:          | 72.676802                          |
| Noncombinational area: | 146.764803                         |
| Macro/Black Box area:  | 0.000000                           |
| Net Interconnect area: | undefined (No wire load specified) |
| Total cell area:       | 599.760013                         |

而從 tsmc090g.pdf 中查詢之 NAND2X1 area =  $2.52 \times 1.12 = 2.8224(\mu\text{m}^2)$

| Drive Strength | Height (um) | Width (um) |
|----------------|-------------|------------|
| NAND2XL        | 2.52        | 1.12       |
| NAND2X1        | 2.52        | 1.12       |

可以得到 **gate count** =  $599.76 \div 2.8224 \approx 212.5$ 。

## 2. 尋找 max operating speed

藉由切換 testbench 中的 clock period 長度，並觀察模擬波型，去尋找什麼時候電路功能會出現錯誤。當 clock period 是 1.3ns 時，可以看到 waveform 結果依舊正確：



但當 clock period 是 1.2ns 時，可以看到 waveform 結果其中之一已經錯誤：



此時再檢查 clock period = 1.3ns 時會不會有 timing violation，如下表：

| Point                       | Incr  | Path   |
|-----------------------------|-------|--------|
| clock clk (rise edge)       | 0.00  | 0.00   |
| clock network delay (ideal) | 0.00  | 0.00   |
| x_reg_0/_CK (DFFRHQX2)      | 0.00  | 0.00 r |
| x_reg_0/_Q (DFFRHQX2)       | 0.14  | 0.14 f |
| U119/Y (NAND2BX4)           | 0.10  | 0.24 f |
| U99/Y (INVXL)               | 0.04  | 0.28 r |
| U92/Y (NAND2BX2)            | 0.05  | 0.33 f |
| U43/Y (XOR2X1)              | 0.08  | 0.41 r |
| U97/Y (OAI2BB1X4)           | 0.10  | 0.51 r |
| U80/Y (OR2X4)               | 0.08  | 0.58 r |
| U79/Y (OAI2BB1X2)           | 0.06  | 0.64 f |
| U56/Y (OAI2BB1XL)           | 0.16  | 0.80 f |
| U53/Y (OR3X2)               | 0.13  | 0.94 f |
| U54/Y (OR2X4)               | 0.08  | 1.01 f |
| U84/Y (OAI2B11X4)           | 0.04  | 1.05 r |
| U51/Y (NAND2BX2)            | 0.05  | 1.10 f |
| U48/Y (INVX1)               | 0.03  | 1.13 r |
| U148/Y (OAI32X1)            | 0.06  | 1.20 f |
| x_reg_0/_D (DFFRHQX2)       | 0.00  | 1.20 f |
| data arrival time           |       | 1.20   |
| clock clk (rise edge)       | 1.30  | 1.30   |
| clock network delay (ideal) | 0.00  | 1.30   |
| x_reg_0/_CK (DFFRHQX2)      | 0.00  | 1.30 r |
| library setup time          | -0.10 | 1.20   |
| data required time          |       | 1.20   |
| data required time          |       | 1.20   |
| data arrival time           |       | -1.20  |
| slack (MET)                 |       | 0.00   |

發現仍沒有 violation，因此可以推測 max operating speed 約為 769MHz。

### 3. 查找 estimated power consumption

Design Compiler 紛出的 estimated power consumption 如下表所示：

| Power Group   | Internal Power | Switching Power | Leakage Power | Total Power ( mW ) | Attrs     |
|---------------|----------------|-----------------|---------------|--------------------|-----------|
| io_pad        | 0.0000         | 0.0000          | 0.0000        | 0.0000             | ( 0.00%)  |
| memory        | 0.0000         | 0.0000          | 0.0000        | 0.0000             | ( 0.00%)  |
| black_box     | 0.0000         | 0.0000          | 0.0000        | 0.0000             | ( 0.00%)  |
| clock_network | 0.0000         | 0.0000          | 0.0000        | 0.0000             | ( 0.00%)  |
| register      | 9.8965e-02     | 4.2075e-03      | 4.7681e+05    | 0.1036             | ( 75.29%) |
| sequential    | 0.0000         | 0.0000          | 0.0000        | 0.0000             | ( 0.00%)  |
| combinational | 1.9224e-02     | 1.2581e-02      | 2.2184e+06    | 3.4023e-02         | ( 24.71%) |
| Total         | 0.1182 mW      | 1.6788e-02 mW   | 2.6952e+06 pW | 0.1377 mW          |           |

綜合以上，gate count 約為 212.5，max operating speed 約為 769MHz，total power consumption 約為 0.1377mW。

- (e) Generate the layout by running some Automatic Placement and Routing Tool (e.g., SoC Encounter or the like). Report the size of the layout. Perform post-layout simulation or analysis and report again the maximum operating speed and power consumption. Compare how these results are different from those in pre-layout analysis.

Pre-CTS layout and setup time checking result after optimization:



Post-CTS layout and setup/hold time checking result after optimization:



Layout after design for manufacturing:



由圖中尺標計算 layout 面積約為  $1555.48\mu m^2$

Verify geometry result:

```
VERIFY GEOMETRY ..... Starting Verification
VERIFY GEOMETRY ..... Initializing
VERIFY GEOMETRY ..... Deleting Existing Violations
VERIFY GEOMETRY ..... Creating Sub-Areas
VERIFY GEOMETRY ..... bin size: 3840
VERIFY GEOMETRY ..... SubArea : 1 of 1
VERIFY GEOMETRY ..... Cells : 0 Viols.
VERIFY GEOMETRY ..... SameNet : 0 Viols.
VERIFY GEOMETRY ..... Wiring : 0 Viols.
VERIFY GEOMETRY ..... Antenna : 0 Viols.
VG: elapsed time: 1.00
Begin Summary ...
Cells : 0
SameNet : 0
Wiring : 0
Antenna : 0
Short : 0
Overlap : 0
End Summary
Verification Complete : 0 Viols. 0 Wrngs.
```

Verify connectivity result:

```
***** End: VERIFY CONNECTIVITY *****
Verification Complete : 0 Viols. 0 Wrngs.
(CPU Time: 0:00:00.0 MEM: 0.000M)
```

Verify antenna result:

```
innovus 10> verifyProcessAntenna -reportfile gcd.antenna.rpt -error 1000
***** START VERIFY ANTENNA *****
Report File: gcd.antenna.rpt
LEF Macro File: sa.antenna.lef
Verification Complete: 0 Violations
***** DONE VERIFY ANTENNA *****
(CPU Time: 0:00:00.0 MEM: 0.000M)
```

Power analysis:

| Total Power            |            |          |
|------------------------|------------|----------|
| Total Internal Power:  | 0.06635405 | 58.3552% |
| Total Switching Power: | 0.04600630 | 40.4603% |
| Total Leakage Power:   | 0.00134685 | 1.1845%  |
| Total Power:           | 0.11370719 |          |

Post-layout simulation waveform:

Clock period = 2.1ns



Clock period = 2ns



由 waveform 可以得知當 clock period 為 2.1ns 時，電路功能仍維持正確，且當 clock period 為 2ns 時，simulation 時出現 timing violation 的 message：

```
"/usr/cadtool/ee5216/CBDK.TSMC90GUTM.Arm_f1.0/CIC/Verilog/tsmc090.v", 11876: Timing violation in testbench.sa.cnt_reg_1_
$setuphold( posedge CK:8000, posedge SN:7074, limits: (1000,68) );
"/usr/cadtool/ee5216/CBDK.TSMC90GUTM.Arm_f1.0/CIC/Verilog/tsmc090.v", 11758: Timing violation in testbench.sa.x_reg_3_
$setuphold( posedge CK:8000, posedge SN:7069, limits: (1000,65) );
"/usr/cadtool/ee5216/CBDK.TSMC90GUTM.Arm_f1.0/CIC/Verilog/tsmc090.v", 11268: Timing violation in testbench.sa.cnt_reg_0_
$setuphold( posedge CK:8000, posedge SN:7071, limits: (1000,72) );
$finish called from file "testbench.v", line 59.
$finish at simulation time      500000
VCS Simulation Report
```

綜合以上推測 max operating speed 約為 476MHz、layout 面積約  $1555.48\mu m^2$ 、total power consumption 約為 0.1137mW。與 pre-layout simulation 的比較如下：

|                           | pre-layout | post-layout |
|---------------------------|------------|-------------|
| Max Operating Speed (MHz) | 769        | 476         |
| Area ( $\mu m^2$ )        | 599.76     | 1555.48     |
| Total Power (mW)          | 0.1377     | 0.1137      |

可以發現操作頻率在 post-sim 時大幅下降，理由可能是因為在實際 layout 過後，計入更多的寄生電容、電阻，使得 propagation delay 上升。面積在 layout 後也大幅超出預期，可能是因為在合成時主要考量 cell 大小而沒有正確計入 routing, clock tree, power ring 等等效應，使得實際 layout 面積比合成時來得大不少。最後在估計 power consumption 方面，雖然數字沒有差多少，但細看 internal, switching power 的占比會發現兩者間的差異甚大，猜測可能因為在不同 tool 間估算 power 的方式不同造成它們之間的差異。