

1. < Code Debugging and Simulation > (10pts)

A. (5pts) Identify syntax error, correct, and explain: 0.5pts for each. Identify inappropriate code (or semantics error), correct, and explain: 0.5pts for each.

```
module %shifter (out,clk,rst,in1,in2); ① 少了分号
    ② 數字不能當成變數
    input clk, rst;
    input [15:0] in1;
    input [2:0] in2;
    output [15:0] out; ③ out 放在 always 內，會造成 bug

    reg[15:0] shift; ④ shift 用 assign → 會造成 bug
    assign shift = in1 >> in2;
    /* variable shifter */ ⑤ 解釋 */

    Active high reset
    ⑥ 不用!
    always @ (posedge clk and posedge rst) begin ⑦ begin
        if (!rst) out <= 16'd0;
        else out <= shift;
    end ⑧ sequential block → non-blocking
endmodule ⑨ endmodule
```

B. (5pts) Finish the waveform below based on the circuit in part A. Note that you should use the decimal number representation to answer (such as 16'd0). Use "xx" to indicate values that cannot be determined from the information given.  
(In this period, signal rst is always 0)



2. < Logic Synthesis + Blocking & Non-Blocking > (10pts)

In the following table, the left column show some pieces of Verilog RTL code. Please draw the corresponding circuits in the right column. You can use AND, OR, NAND, NOR, XOR, XNOR, NOT, MUX, D Flip-Flop, Latch in the circuit diagram.

| (a) Verilog Code (2pts)                                                                                         | Circuit Diagram |
|-----------------------------------------------------------------------------------------------------------------|-----------------|
| <pre>always @(*) begin     X = A&amp;B;     Y = X^C; end</pre>                                                  |                 |
| <pre>always @ (posedge clk) begin     A &lt;= D;     B &lt;= A ^ D;     C &lt;= B;     D &lt;= C ^ D; end</pre> |                 |
| <pre>always@(A or B or C ) begin     if (C)         D = A &amp; B; end</pre>                                    |                 |
| <pre>always@( posedge clk) begin     if (C)         D &lt;= A &amp; B; end</pre>                                |                 |

### 3. < Finite State Machine and Simulation > (10pts)

Given below is a Finite-State-Machine (FSM).

```

module FSM  (clk, rst, in, out_r);
parameter S0 = 2'b00, S1 = 2'b01, S2 = 2'b10, S3 = 2'b11;

input  clk, rst, in;
output [1:0] out_r;

reg [1:0] out_r, out;
reg [1:0] current_state, next_state;
// Next State Logic
always@(*) begin
    case(current_state)
        S0: next_state = (in == 1'b0)? S0 : S2;
        S1: next_state = (in == 1'b0)? S0 : S2;
        S2: next_state = (in == 1'b0)? S1 : S3;
        S3: next_state = (in == 1'b0)? S1 : S3;
        default: next_state = 2'b00;
    endcase
end
// Current State Memory & Output Register
always@(posedge clk or posedge rst)
begin
    if(rst) begin
        current_state <= 0;
        out_r <= 0;
    end
    else begin
        current_state <= next_state;
        out_r <= out;
    end
end
// Output Logic
always@(*) begin
    out[1] = in ^ current_state[0];
    out[0] = in ^ current_state[0] ^ current_state[1];
end
endmodule

```

$\wedge: \text{XOR}$



$S_0: 00$

$S_1: 01$

$S_2: 10$

$S_3: 11$

$\text{out: } [1:0]$

$0 \wedge 0 \quad 0 \wedge 0 \wedge 0 \Rightarrow 00$

$1 \wedge 0 \quad 1 \wedge 0 \wedge 0 \Rightarrow 11$

4/20

$0 \wedge 1 \quad 0 \wedge 0 \wedge 1 \Rightarrow 11$

$1 \wedge 1 \quad 1 \wedge 0 \wedge 1 \Rightarrow 00$

$0 \wedge 0 \quad 0 \wedge 0 \wedge 1 \Rightarrow 01$

$$\begin{array}{ll}
 1 \wedge 0 & 1 \wedge 0 \Rightarrow 0 \\
 1 \wedge 1 & 1 \wedge 1 \Rightarrow 01 \\
 0 \wedge 1 & 0 \wedge 1 \wedge 1 \Rightarrow 0
 \end{array}$$

(a) (5pts) Please draw a state transition graph below for this FSM.

Example



- (b) (5pts) We have put this module in our testbench as a Design-Under-Test (DUT).  
 After the simulation, the command window has printed response from monitor.  
 Please finish the output results below based on this FSM and given information.

```
'timescale 1ns/1ns
```

```
module testbench;
reg clk, rst;
reg in;
wire [1:0] out;

FSM DUT(.clk(clk), .rst(rst), .in(in), .out_r(out));

// APPLY STIMULUS
$monitor("%t %b %b %b %b", $time, clk, rst, in, out);
endmodule
```



Monitor Output Response:

| Time | clk | reset | in | out |    |    |
|------|-----|-------|----|-----|----|----|
| 0    | 0   | 0     | 0  | xx  |    |    |
| 1    | 1   | 1     | 0  | 00  | S0 |    |
| 2    | 0   | 0     | 1  | 00  | S0 | S2 |
| 3    | 1   | 0     | 1  | 11  | S2 | S3 |
| 4    | 0   | 0     | 1  | 11  | S2 | S3 |
| 5    | 1   | 0     | 1  | 10  | S3 | S3 |
| 6    | 0   | 0     | 1  | 10  | S3 | S3 |
| 7    | 1   | 0     | 1  | 01  | S3 | S3 |
| 8    | 0   | 0     | 0  | 01  | S3 | S1 |
| 9    | 1   | 0     | 0  | 10  | S1 | S0 |
| 10   | 0   | 0     | 1  | 10  | S1 | S2 |
| 11   | 1   | 0     | 1  | 00  | S2 | S3 |
| 12   | 0   | 0     | 0  | 00  | S2 | S1 |
| 13   | 1   | 0     | 0  | 01  | S1 | S0 |
| 14   | 0   | 0     | 0  | 01  | S1 | S0 |
| 15   | 1   | 0     | 0  | 11  | S0 | S0 |
| 16   | 0   | 0     | 0  | 11  | S0 | S0 |

latency: How many cycle does it take for input Z<sub>j</sub> output P<sub>i</sub> if D<sub>j</sub> P<sub>i</sub>.

throughput: How many operations can be completed per second

#### 4. < Important Timing Parameters > (15pts)

Suppose that the timing characteristics of the flip-flops in the circuit are the same. Their timing diagrams and parameters can be described as follows:



The circuit below operates at the clock frequency of 250MHz. Suppose that the rise, fall, and turn-off delays for each combinational element are the same.



(a) (3pts) Write the timing inequality for setup time and hold time

(i) Setup time

$$T_{csetup} = 0.5 \text{ ns}$$

$$\begin{cases} 0.5 + 2 + 1 + T_{pcq} < 4 \\ 0.5 + 1 + 1 + T_{csetup} < 4 \\ 0.5 + 0.5 + T_{csetup} < 4 \end{cases}$$

$$\Rightarrow T_{csetup} < 0.5 \text{ ns}$$

no hold time

$$T_{chold} = 0.2$$

$$\begin{cases} 0.2 + 2 + 1 > T_{chold} \\ 0.2 + 1 + 1 > T_{chold} \\ 0.2 + 0.5 > T_{chold} \end{cases}$$

$$\Rightarrow 0.9 \text{ (ns)} > T_{chold}$$

$$t_{csetup} < 0.5 \text{ ns}$$

$$t_{chold} < t_{ccq} + t_{cd}$$

$$t_{chold} < 0.2 + 1 + 1 = 2.2$$

$$2.2 \\ 3.2 \\ 0.9$$

(b) (2pts) If  $T_{csetup}$  (Setup Time) = 1ns    $T_{chold}$  (Hold Time) = 1ns

Are there setup time and hold time violations in this circuit? Use the timing inequalities in (a) for setup/hold time at the clock frequency to check them.

$T_{csetup} = 1 \text{ ns}$    ( $T_{csetup} < 0.5 \text{ (ns)}$ )  $\Rightarrow$  setup time violation

$T_{chold} = 1 \text{ ns}$    ( $T_{chold} < 0.7 \text{ (ns)}$ )  $\Rightarrow$  hold time violation

(c) (5pts) Followed by part (b), if there are setup/hold time violations in this circuit, how to perform "retiming" to solve these issues? Suppose that all of combinational elements cannot be further separated. Please draw your circuit and write the timing inequalities of the modified circuits after retiming.

$$\text{(i) Setup} \\ 0.5 + T_{compute} + 1 \leq 4 \text{ (ns)} \\ \Rightarrow T_{compute} (\text{long}) \leq 2.5 \text{ (ns)}$$

$$\text{(ii) hold} \\ 0.2 + T_{compute} (\text{short}) \geq 1 \text{ (ns)} \\ \Rightarrow T_{compute} (\text{short}) \geq 0.8 \text{ (ns)}$$



$$0.5 + 2 + T_{setup} < 4 \Rightarrow T_{setup} < 1.5 \text{ (ns)} \\ 0.2 + 1 > T_{hold} \Rightarrow T_{hold} < 1.2 \text{ (ns)} \\ \Rightarrow \text{no timing violation}$$

(d) (5pts) If there is clock skew in this circuit, as shown in bellow. Please write the timing inequality for setup time and hold time at the clock frequency of 200MHz without any timing violation.



$$\frac{1}{200 \times 10^6} = \frac{5}{5 \times 200 \times 10^6} = 5 \text{ (ns)}$$

$$\frac{1}{200 \times 10^6} = 5 \text{ ns}$$

$$T_{clk} > t_{pcq} + t_{pd} + t_{setup}$$

$$(i) 0.5 + 3 + T_{setup} < 5 \Rightarrow T_{setup} < 2 \text{ (ns)}$$

$$5 - 0.5 > 0.5 + 3 + T_{setup}$$

$$(ii) 0.2 + 1 > T_{hold} + 0.5 \Rightarrow T_{hold} < 2.2 - 0.5 = 1.7 \text{ (ns)}$$

$$T_{hold} < t_{ccq} + t_{cd}$$

$$T_{hold} + 0.5 < 0.2 + 2$$

$$T_{hold} < 2.2 - 0.5 = 1.7 \text{ (ns)}$$

→ 不同操作環境下的 cell 及 timing 資訊  
Technology library：提供給 dc 做合成用。  
SDF：記錄合成完後電路的時間資訊。(delay)  
提供給 ncverilog 做模擬用

5. < Synthesis Issues > (30pts)

A. < Important files related to Design Compiler >

(30)

Please explain the meaning of the following terminologies and where to use them:

- (a) (2pts) Technology library (e.g: slow.db/fast.db)
- (b) (2pts) Standard Delay File (e.g: CHIP\_syn.sdf)
- (c) (2pts) tsmc13.v

(a) 記錄在不同操作環境下的 cell 以及 timing 資訊。(ss, ff, corner 等)  
(提供給 dc 做合成用)

(b) SDF 檔記錄了合成完後電路的時間資訊 (delay 等...)  
(提供給 ncverilog 做模擬用)

(c) tsmc13.v 為 library 檔，提供合成之後的 Verilog netlist 內部 cell 的資訊。  
(behavior model) ("")

B. < Synopsys Design Constraints File(SDC) >

Please explain the meaning of the following command and why we use them in Design Compiler:

- (a) (2pts) set\_dont\_touch\_network [get\_clocks clk]  
set\_ideal\_network [get\_ports clk]
- (b) (2pts) set\_clock\_uncertainty 0.1 [get\_clocks clk]

(a) 在 CLK 這個路徑上不做任何變化 (等到 APR 再做)，所以也不需考量  
這個路徑的 loading，故有 set\_ideal\_network

(b) 在評估 CLK tree 時可能 CLK 到首個 register 的時間會有些微差距，uncertainty  
為此可能時間，此指令提供計算 slack 時，用較差情況計算 skew

C. (3pts) < STA & Post-sim >

If we specify the clock to be 5.0ns during synthesis, the timing report shows that the constraints has been met. However, the gate-level simulation passed at 4.0ns with one set of test data. Is this possible? Why or why not?

有可能。critical path 的時間小於 5 ns，但其他路徑可能不到 5 ns。

此組測量可能未涵蓋 critical path，所以 4 ns 仍有機會 pass

#### D. < Area Report >

The following figure is the area report after synthesis.

```
*****
Report : area
Design : ALU
*****
...
Number of cells: 90
Number of references: 10
Combinational area: 1939.291626
Noncombinational area: 2049.062256
Net Interconnect area: undefined
Total cell area: 3988.353760
Total area: undefined
```

- (a) (2pts) It sometimes shows "undefined" in total area. Please explain it and describe how to fix it.
- (b) Followed by part(a),
- (2pts) In cell-based IC design flow, we will focus on total cell area instead of total area, please explain why.
  - (4pts) The total cell area will be underestimated in this situation. Please briefly explain why.
- (c) (3pts) With the same RTL code, if we reduce the clock cycle, what part in the report will increase? Please briefly explain why.

- (a) undefined 因為合成時未提供 wire load model 故無法得知 net area  
⇒ set wire load model 即可解決
- (b) (I) 因為合成時，每個 cell 的 實際位置 並未確定，所以實際晶片上的線條  
高度不確定，故不清楚 net 的 area
- (II) 因為未設定 wire load model，合成時會以為線的 delay = 0，因此可以  
計算的時間上升，tool 會用小一點的 cell 來合成
- (C) Combinational area 會上升，因為計算時間減少，需要用面積  
較大的 cell 來計算增加速度

## technology file

### E. < External IP issue >

If there's a memory module in DUT, and we generate several files from Memory Generator. Such as rom\_1024x4\_t13\_slow\_synth\_db rom\_1024x4\_t13.v.

- (a) (2pts) Please explain what's the purpose of these files and what will happen if we don't have these files.

與 standard cell 類似，db 檔為 technology file，而 .v 檔則為 Verilog 檔  
提供模擬時的資訊。若缺少 db 則不知道不用 operating-condition 下的資訊。  
若少了 .v 檔，gate-level simulation 時會無法模擬。

↑ nice!

- (b) (2pts) Please modify Design Compiler setting file .synopsys\_dc.setup as shown below. (JUST NEED TO POINT OUT WHERE TO MODIFY)

```
set search_path "Your_path/CBDK_TSMC013_Arm/CIC/SynopsysDC $search_path"  
set target_library "slow.db fast.db" rom_1024x4_t13_slow_synth_db  
set link_library "* $target_library dw_foundation.db"  
set symbol_library "tsmc13.sdb generic.sdb"  
set synthetic_library "dw_foundation.sldb"  
...
```

- (c) (2pts) Should we synthesis rom\_1024x4\_t13.v with DUT.v ? Please explain why.

不用。.v 檔內部為用來提供模擬時的資訊，要合成時無關。  
memory layout 檔已有 錄成

6. < Timing Analysis > (10pts)

(10)

Calculate the arrival time, required time, and slack at each gate output, and find a critical path from primary input to primary output. Assume the delays of NAND gates and NOR gates are 4ns and 3ns, respectively. The arrival time at primary inputs is 0ns and the required time at primary outputs is 14ns.



X1: Arrival 0; Required 2; Slack -1  
 X2: Arrival 0; Required 1; Slack 1  
 X3: Arrival 0; Required 1; Slack 1  
 Y1: Arrival 3; Required 2; Slack -1  
 Y2: Arrival 7; Required 6; Slack -1  
 Y3: Arrival 6; Required 6; Slack 0  
 Y4: Arrival 11; Required 10; Slack -1  
 Z1: Arrival 14; Required 14; Slack 0  
 Z2: Arrival 15; Required 14; Slack -1  
 Critical path:  $X_2, Y_1, Y_2, Y_4, Z_2$ , Slack = -1

$\pi_3$



7. < Design for Testability > (15pts)

- (a) (5pts) Given one OR gate for your reference below. Answer the following questions.

(17)



Complete the single stuck-at-fault (SSF) table of two-input OR gate below for the output value with SSF. By signal/logic\_value we mean single stuck-at-logic\_value fault on signal, e.g. A/0 means stuck-at-0 fault on signal A. Please also mark the output value at Y differing from fault free one with "\*" character (such as "1\*").

| Input |   | Fault-free Output | Output Value on Y with SSF |     |     |     |     |     |
|-------|---|-------------------|----------------------------|-----|-----|-----|-----|-----|
| A     | B | Y                 | A/0                        | A/1 | B/0 | B/1 | Y/0 | Y/1 |
| 0     | 0 | 0                 | 0                          | 1*  | 0   | 1*  | 0   | 1*  |
| 0     | 1 | 1                 | 1                          | 1   | 0*  | 1   | 0*  | 1   |
| 1     | 0 | 1                 | 0*                         | 1   | 1   | 1   | 0*  | 1   |
| 1     | 1 | 1                 | 1                          | 1   | 1   | 1   | 0*  | 1   |

| A | B | Y | A/0 | A/1 | B/0 | B/1 | Y/0 | Y/1 |
|---|---|---|-----|-----|-----|-----|-----|-----|
| 0 | 0 | 0 | 0   | 1*  | 0   | 1*  | 0   | 1*  |
| 0 | 1 | 1 | 1   | 1   | 0*  | 1   | 0*  | 1   |
| 1 | 0 | 1 | 0*  | 1   | 1   | 1   | 0*  | 1   |
| 1 | 1 | 1 | 1   | 1   | 1   | 1   | 0*  | 1   |

- (b) (10pts) Given the circuit below, please generate one test pattern to detect the faults given as below. You may use D-Algorithm to generate the pattern. Please write down your pattern in the form of {abc}, e.g. {01X}, where X is don't-care bit.



1. (5pts) Z s-a-1



2. (5pts) M s-a-1

No pattern can detect

每三條錯誤

傳出來

M S-A-1

所以沒有

pattern  
detect



**Bonus!! < STA & Post-sim inconsistent issue > (15pts)**

The following is the RTL code from testbench (tb) and Design Under Test (DUT) module. The RTL simulation with those files already passed. However, if we use the SDC (Synopsys Design Constraints) file shown below, the timing report shows that the constraints is met, but the post-sim will fail.

**Testbench (tb)**

```
'timescale 1ns/10ps
`define CYCLE 2.0
`define SDFFILE "./DUT_syn.sdf"
`define End_CYCLE 100

`define PAT "./pattern.dat"
`define EXP "./golden.dat"

module tb;

parameter N_EXP = 10;
parameter N_PAT = N_EXP;

integer      i, exp_num, err;
reg          over;

reg  [3:0]   in_mem [0:N_PAT-1];
reg  [3:0]   exp_mem [0:N_EXP-1];

reg          clk;
reg          rst;
reg          in_en;
reg  [1:0]   A, B;
wire         O_ready;
wire  [3:0]  O;
reg  [3:0]  O_exp;

DUT DUT(.clk(clk), .rst(rst), .in_en(in_en), .A(A), .B(B), .O_ready(O_ready), .O(O));

initial $sdf_annotation(`SDFFILE, DUT);

initial $readmemh(`PAT, in_mem);
initial $readmemh(`EXP, exp_mem);
```

```

initial begin
    clk          = 1'b0;
    rst          = 1'b0;
    in_en       = 1'b0;
    exp_num     = 0;
    err          = 0;
    over         = 0;
end

always #(`CYCLE/2) clk = ~clk;

// data input
initial begin
    @(negedge clk) rst = 1'b1;
    #(`CYCLE);      rst = 1'b0;

    @(negedge clk) i=0;
    while (i <= N_PAT) begin
        in_en = 1'b1;
        A      = in_mem[i][3:2];
        B      = in_mem[i][1:0];
        i = i +1;
        @(negedge clk);
    end
end

// result compare
always @ (negedge clk) begin
    O_exp = exp_mem[exp_num];
    if(O_ready) begin
        if(O != O_exp) begin
            $display("ERROR at %5d:O %4h != O_exp %4h ",exp_num,O,O_exp);
            err = err + 1;
        end
        exp_num = exp_num + 1;
    end
    if(exp_num == N_EXP) over = 1;
end

```

```

initial begin
#(`CYCLE * `End_CYCLE);
$display("-----\n");
$display("Error!!! Somethings' wrong with your code ...!\n");
$display("-----FAIL-----\n");
$display("-----\n");
$finish;
end

initial begin
@(posedge over)
if((over) && (exp_num!='d0)) begin
$display("-----\n");
if (err == 0)  begin
$display("All data have been generated successfully!\n");
$display("-----PASS-----\n");
end
else begin
$display("There are %d errors!\n", err);
$display("-----\n");
end
end
#(`CYCLE/2); $finish;
end

endmodule

```

## DUT

```
module DUT( clk, rst, in_en, A, B, O_ready, O);
    input      clk;
    input      rst;
    input      in_en;
    input [1:0] A;
    input [1:0] B;
    output     O_ready;
    output [3:0] O;

    reg      O_ready, O_ready_nxt;
    reg [3:0] O;
    reg [3:0] A_sqr, A_sqr_nxt;
    reg [3:0] B_sqr, B_sqr_nxt;
    always@(*) begin
        A_sqr_nxt = A*A;
        B_sqr_nxt = B*B;

        if(in_en)
            O_ready_nxt = 1'b1;
        else
            O_ready_nxt = 1'b0;

        O = A_sqr + B_sqr;
    end

    always@(posedge clk or posedge rst) begin
        if(rst) begin
            A_sqr <= 2'd0;
            B_sqr <= 2'd0;
            O_ready <= 1'b0;
        end
        else begin
            A_sqr <= A_sqr_nxt;
            B_sqr <= B_sqr_nxt;
            O_ready <= O_ready_nxt;
        end
    end
endmodule
```

### Synopsys Design Constraints (SDC)

```
set cycle 2.0
create_clock -name clk -period $cycle [get_ports clk]
set_fix_hold                                [get_clocks clk]
set_dont_touch_network                      [get_clocks clk]
set_ideal_network                           [get_ports clk]
set_clock_uncertainty                     0.1 [get_clocks clk]
set_clock_latency                          0.5 [get_clocks clk]

set_max_fanout 6 [all_inputs]

set_operating_conditions -min_library fast -min fast -max_library slow -max slow
set_drive      1      [all_inputs]
set_load       1      [all_outputs]
set_input_delay 0.1 -clock clk [remove_from_collection [all_inputs] [get_ports clk]]
set_output_delay 0.1 -clock clk [all_outputs]
set_wire_load_model -name tsmc13_wl10 -library slow
```

(a) (5pts) Please explain what message might show in terminal and why?

可能會有 setup time violation, 因為 Input delay in sdc = 0.1 但 tb 有 negedge clk  
+3. 累積時間可能會太高造成 setup time violation  
Error at ... /output delay

(b) (5pts) If we can modify the SDC file, please explain what's wrong in it and how to fix it.

Input delay 設成 \$cycle/2 要 tb 相符合  
+3. output delay " \$cycle/2 "

(c) (5pts) If we can only modify RTL code in DUT module, please describe what's wrong and how to fix it. (YOU DON'T NEED TO WRITE MODIFIED CODE)

+5 若不考慮 input/output delay, 則 RTL 寫的時候 input / output port 都  
指一個 register, tb 也一樣, 皆有一個完整的 clk cycle 可以計算, 不要  
Input or output delay 斷言



$A \rightarrow$  [comb.]  $\rightarrow Y$





$$\Rightarrow T_c > t_{\text{setup}} + t_{\text{pcq}} + t_{\text{pd}} \quad (\text{MET})$$

slack



$$\Rightarrow t_{\text{hold}} < t_{\text{ccq}} + t_{\text{cd}} \quad (\text{MET})$$



! |  $t_{cd}$  |

