

Campus | B5  
7mm×30行 40页

Digital System Design

Campus®

B5 点线本 | 7mm×30行 | 40页

KOKUYO

# Introduction

IC: Integrated circuit

ASIC: Application-specific integrated circuit

PLDs: Programmable logic devices

PLAs: Programmable logic arrays



## # Design flow

① Write a specification

- ② Partition the design into smaller parts and write a specification for each part (if necessary)
- ③ From the specification → draw state machine chart (each state, input conditions, change of state, output)
- ④ Minimize the number of states
- ⑤ Assign Boolean variables to represent each state
- ⑥ Derive the next state and output logic
- ⑦ Optimize the next state and output logic to minimize the number of gates needed.
- ⑧ Choose a suitable placement for the gates in terms of which gates share integrated circuits and in terms of where each integrated circuit is placed on the printed circuit board.
- ⑨ Design the routing between the integrated circuits.

EDA: Electronic design automation

CAD: Computer aided design

RTL: Register transfer level

Dominant technology:

CMOS ✓ Uses FETs

CMOS: Complementary metal oxide semiconductor

NMOS



The bulk is always connected to the GND (logic 0)

MOS inverters



PMOS



The bulk is always connected to the positive power supply (logic 1)

Transmission Gate

两个控制开关是串接的  
它需要双电源(供电或接地)

Three state buffer



EN=1

Normally

EN=0

Neither of the two inner  
transistor is conducting and  
the output floats



EN=1

Both transistors conduct

EN=0

Both transistors are open circuit



Multiplexer

PMOS passes a strong '1'  
NMOS passes a strong '0'



D-latch

## # Transfer characteristic of CMOS inverter

输入逻辑0的最大值，使输出跳变，一般大于输出逻辑0的电压  
输出逻辑1的最小值，一般大于输入逻辑1的电压



$$NML = V_{ILmax} - V_{OLmax}$$

$$NMH = V_{OHmin} - V_{IHmin}$$

The noise margin specifies how much noise, from electrical interface, can be added to a signal before a logic value is misinterpreted.

E.g.: The maximum voltage that a gate will generate to represent a logic '0' is 0.75V

Any voltage up to 1.05V is, however, recognized as a logic '0'

Therefore there is a 'spare' 0.3V, any noise added to a logic 0 within this band will be accepted.

## # Fan-out

The fan-out of a gate is the number of other gates that it can drive

[TTL] the input to a gate is resistive e.g.: 74ALS

$$\text{Logic 1 fan-out} = \frac{I_{OHmax}}{I_{IHmax}} = \frac{400\mu A}{20\mu A} = 20 \Rightarrow \text{choose the smaller value}$$

$$\text{Logic 0 fan-out} = \frac{I_{OLmax}}{I_{ILmax}} = \frac{8mA}{100\mu A} = 80$$

[CMOS] no dc input current while gate and substrate form a capacitor

CMOS gates draw almost no DC input current because there is no DC path between the gate of a transistor and the drain, source or substrate of the transistor.

~~Different effect:~~ the gate and substrate of a CMOS gate ~~form~~ form a capacitor, it takes a finite time to charge that capacitor, and hence the fan-out is determined by how fast the circuit is required to switch



$$V_o = V_{DD} e^{-\frac{t}{R_N C_G}}$$

$$t = 2.5\text{ns} \quad (\text{If } V_{DD} = 2.5\text{V}, R_N = 100\Omega, C_G = 100\text{pF})$$

If two inverters are driven, the capacitive load doubles, so the switching time doubles.

Although a CMOS gate can drive an almost unlimited number of other gates at a fixed logic level, the fan-out is limited by the speed required of the circuit.

- { Xilinx.com
- Actel.com

## Combinational logic design

### # Rules of Boolean algebra

① Commutativity  $A + B = B + A \quad A \cdot B = B \cdot A$

② Associativity  $A + (B + C) = (A + B) + C \quad A \cdot (B \cdot C) = (A \cdot B) \cdot C$

③ Distributivity  $A \cdot (B + C) = A \cdot B + A \cdot C$

### # De Morgan's law

$$(\overline{A \cdot B}) = \overline{A} + \overline{B} \quad (\overline{A+B}) = \overline{A} \cdot \overline{B}$$

### # Shannon's expansion theorem (将任意布尔函数表达为：集中任何一个变量乘以一个子函数，加上这个变量乘以另一个子函数)

$$F(A, B, C, D, \dots) = A \cdot F(1, B, C, D, \dots) + \overline{A} \cdot F(0, B, C, D, \dots) \quad F = AB + C = A \cdot (B + C) + \overline{A} \cdot C$$

$$= (A + F(0, B, C, D, \dots)) \cdot (\overline{A} + F(1, B, C, D, \dots)) \quad = AB + AC + \overline{AC} = AB + C$$

### # Logic gates

AND: 



OR: 

NOT: 

XOR: 



NAND: 



NOR: 

| A | B | C | Z                |
|---|---|---|------------------|
| 0 | 0 | 0 | 1 m <sub>0</sub> |
| 0 | 0 | 1 | 1 m <sub>1</sub> |
| 0 | 1 | 0 | 0 M <sub>2</sub> |
| 0 | 1 | 1 | 0 M <sub>3</sub> |
| 1 | 0 | 0 | 0 M <sub>4</sub> |
| 1 | 0 | 1 | 1 m <sub>5</sub> |
| 1 | 1 | 0 | 0 M <sub>6</sub> |
| 1 | 1 | 1 | 1 m <sub>7</sub> |

A minterm is a Boolean AND function containing one instance (function is true)  
A maxterm is a Boolean OR function with one instance (function is false)  
 $\Rightarrow Z = m_0 + m_1 + m_5 + m_7 = M_2 + M_3 + M_4 + M_6$   
 $= \overline{A}\overline{B}\overline{C} + \overline{A}\overline{B}C + \overline{A}B\overline{C} + ABC = \overline{A}\overline{B}\overline{C} + A\overline{B}\overline{C} + \overline{A}B\overline{C} + \overline{A}\overline{B}C$   
 $= (A + \overline{B} + C)(A + \overline{B} + \overline{C})(\overline{A} + B + C)(\overline{A} + \overline{B} + C)$

### # Logic minimization

#### [Karnaugh maps]

Rules : ① Circle the largest possible groups (1, 2, 4, ..., 2<sup>n</sup>)  
② Avoid circles inside circles

| CD | AB | DD | 00 | 01 | 11 | 10 |
|----|----|----|----|----|----|----|
| 00 | 00 | 00 | 00 | 01 | 11 | 10 |
| 01 | 00 | 01 | 01 | 11 | 11 | 11 |
| 11 | 00 | 01 | 11 | 11 | 11 | 10 |
| 10 | 00 | 01 | 10 | 00 | 11 | 01 |

Karnaugh maps is a graphical method, effectively with six or fewer variables

$$\Rightarrow Z = B \cdot D + \overline{B} \cdot \overline{D}$$

## # Timing

All gates have a finite delay between a change at an input and the change at an output. If gates are used, therefore, different paths may exist in the network, with different types.

$$\text{E.g.: } Z = A \cdot C + B \cdot \bar{C}$$



Let  $A=1$  and  $B=1$

$\Rightarrow Z=1$ , irrespective of the value of  $C$



## Types of hazard

|           | Occur in          |
|-----------|-------------------|
| Static 1  | AND-OR NAND-NAND  |
| Static 0  | OR-AND NOR-NOR    |
| Dynamic 0 | 3 or more unequal |
| Dynamic 1 | Signal paths      |

← This  $Z$  changes from 1 to 0 and back to 1, known as hazard

Static hazards can be avoided by designing with redundant logic

$$\begin{array}{c} AB \\ \diagdown \quad \diagup \\ C \end{array} \quad \begin{array}{c} 00 \quad 01 \quad 11 \quad 10 \\ | \quad | \quad | \quad | \\ 0 \quad 0 \quad 1 \quad 0 \\ | \quad | \quad | \quad | \\ 1 \quad 0 \quad 0 \quad 0 \end{array} \Rightarrow Z = A \cdot C + A \cdot B + B \cdot \bar{C}$$

$\Rightarrow Z=1$  hold the value, no glitch

## # Number codes

## [Integers]

Two's complement:  $-b: 2^n - b \rightarrow b$  is a binary number<sup>n</sup>

represented using  $n$  bits

$$\text{E.g.: } -6_{10} : 1000_2 - 0110_2 = 1010_2$$

The same result is obtained by inverting all bits and adding 1  
 $6_{10} \Rightarrow 0110_2 \Rightarrow 1001_2 \Rightarrow 1001_2 + 1_2 \Rightarrow 1010_2 \checkmark$

[Floating point numbers] ← allows a much wider range of accuracy

$$(-1)^s + 1.m \times 2^e \quad \text{Typically 32 bits: } 1 \Rightarrow \text{sign bit (s)} \quad 8 \Rightarrow \text{exponent (e)} \quad 23 \Rightarrow \text{mantissa (m)}$$

↑ 二进制

$$\text{E.g.: } 123.456 \xrightarrow{\text{①二进制}} 11110111.0111010010111001_2 \text{ (24位)} \xrightarrow{\text{②转换}} 1.1110110111010010111001_2 \text{ (23位, 小数点后6位)}$$

阶码要用移码表示 (符号位相反)  $b_{10} \xrightarrow{\text{移码}} 10000101_2$  (可以直接 ? +127 转二进制)

用补码表示阶码时, 当阶码无符号, 产生下溢的时候, 阶码变成了0, 浮点数的值变为1

而实际上这个数字无限接近于0 (下溢) 需要取出 "-0" 作为机器零

浮点数的阶码表示指数大小, 有正有负, 对每个阶码都加上一个正的常数(称偏移常数), 使能表示的所有阶码都为整数

就变成了偏移了的阶码也就是移码, 浮点数小数点的实际位置由移码减去偏移常数决定

### [Gray codes]

The transition from  $0111_2(7)$  to  $1000_2(8)$  causes three bits to change, it may be undesirable that several bits should change at once as bits <sup>change</sup> may not occur at exactly the same time.

000  
001  
011  
010  
110  
111  
101  
100

A gray code is one in which only one bit changes at a time.

### [Parity bits] 校验位

Use a parity bit to tell that whether there are an even number of ones in the data.

A single error can be corrected by using a two-dimensional parity scheme in which every ninth word is itself a set of parity bits. (row parity + column parity)

## Combinational logic using VHDL gate models

Combinational logic is stateless: changes in inputs are immediately reflected by changes in inputs. VHDL = Very-high-speed Hardware Description Language.

### # Entities and architectures

```
entity And2 is
  port (x,y : in BIT; z : out BIT); {in/out
end entity And2; {inout
{inout}  $\leftarrow$  Bidirectional

architecture ex1 of And2 is
begin
  z <= x and y;
end architecture ex1;
```

$\leftarrow$  Describe a block box, inputs and output with types  
 $\leftarrow$  Specify connections between entity and the outside world  
 $\leftarrow$  Bidirectional

### # Identifiers, spaces and comments

VHDL is not case-sensitive, VHDL description: ' -- '

Each VHDL should include a header:

- Design unit:
- File name:
- Description:
- Limitation:
- System: VHDL'93
- Author: Klein Kraang
- Revision: Version 1.0

```
architecture netlist2 of comb-function is
component And2 is
  port (x,y : in BIT; z : out BIT);
end component And2;
component Or2 is
  port (x,y : in BIT; z : out BIT);
end component Or2;
signal p,q,r : BIT;
begin
```

Component Declaration

```
g1 : Not1 port map(a,p);
g2 : And2 port map(p,b,q); ...
end architecture netlist2;
```

### # Netlists

For function:  $Z = \bar{A} \cdot B + A \cdot C$

```
entity comb-function is
  port (a,b,c : in BIT; z : out BIT);
end entity comb-function;
```

```
architecture expression of comb-function is
begin
  z <= (not a and b) or (a and c);
end architecture expression;
```

[Manual]

```
entity Or2 is
  port (x,y : in BIT; z : out BIT);
end entity Or2;
```

```
entity Not1 is
  port (x : in BIT; z : out BIT);
end entity Not1;
```

architecture ex1 of Or2 is
begin
 z <= x or y;
end architecture ex1;

architecture ex1 of Not1 is
begin
 z <= not x;
end architecture ex1;

The order of signal is important  
 We can assign signal in different order:

```
g2: entity WORK.And2(ex1) port map (z=>q, x=>p, y=>b);
```

Identify signals to connect gates → Create instance → architecture netlist of comb-function is  
 signal p,q,r : BIT;  
 begin  
 g1: entity WORK.Not1(ex1) port map (a,p);  
 g2: entity WORK.And2(ex1) port map (p,b,q);  
 g3: entity WORK.And2(ex1) port map (a,c,r);  
 g4: entity WORK.Or2(ex1) port map (q,r,z);  
 end architecture netlist;

$\rightarrow$  refers to the current working library  
 $\rightarrow$  refers, in each case, to the architecture of each gate model

Direct Instantiation Style

usually preferred for simple netlists.

## # Signal assignments

Inertial delay:  $Z \leftarrow X$  after 4 ns

默认的延时  
(不会将传输保持时间低于4ns的pulse)

Transport delay:  $Z \leftarrow \text{transport } X$  after 4 ns (用于模拟以连线延迟)

$Z \leftarrow X$  and  $Y$  after 5 ns ✓

$Z \leftarrow \text{transport not } ((X \text{ and } Y) \text{ or } (A \text{ and } B))$  after 8 ns ✓

Multiple changes are allowed in response to a single change:

$Z \leftarrow X$  after 4 ns, not  $X$  after 8 ns;

160

在整个时间轴

## # Generics

$Z \leftarrow X$  and  $Y$  after 5 ns; defines exactly delay for an AND gate. We could define different delay using parameter

entity And2 is

```
generic (delay : DELAY_LENGTH);
port (X, Y : in BIT; Z : out BIT);
end entity And2;
```

a value is passed,

g2: entity WORK.And2(ex2) generic map (5ns)  
port map (p, b, q);

architecture ex2 of And2 is

begin

$Z \leftarrow X$  and  $Y$  after delay;

end architecture ex2;

A component declaration must also include the generic declaration  
The nonpositional form can also be used.

It can be useful to specify a default value for a generic

generic (delay : DELAY\_LENGTH := 5 ns); ← A different value can override the default if it is explicitly stated.

If a component declaration is used, the default value can be specified in the entity declaration or the component declaration or in both. If different default values are specified in each declaration, the value in the component declaration will be used.

If no default value is given and the generic map part is omitted, delay would be undefined X.

If the component declaration does not include the generic part definition, the default value will automatically be used and the generic map part of the component statement in instantiation must be omitted

Finally, the default value of a generic will be used if the reserved word open is used:

g2: entity WORK.And2(ex2) generic map (open)  
port map (p, b, q);

## # Constant and open ports

There may be occasions on which not all the inputs or outputs of a component are needed.

E.g. 'universal' gate : invert=0  $\rightarrow$  AND , invert=1  $\rightarrow$  OR

entity universal is

```
port (x,y,invert : in BIT; a,o : out BIT);
end entity universal;
```

architecture univ of universal is

begin

```
a <= (y and (x xor invert) or (invert and not y));
o <= (not x and (y xor invert)) or (x and not invert);
```

end architecture univ;

Use this gate as an AND gate :

no : entity WORK.universal (univ) port map (x,y, '0', a, open);

Outputs can be left <sup>connected</sup> but inputs may be left open only if a default value has been specified.

## # Testbenches

entity TestAnd2 is  
end entity TestAnd2;

architecture io of TestAnd2 is  
signal a,b,c : BIT;

begin

```
g1: entity WORK.And2(ex2) port map (x=>, y=>, z=>c);
a <= '0', '1' after 100 ns;
b <= '0', '1' after 150 ns;
end architecture io;
```

## # Configuration

The architecture name was explicitly stated, but it can be omitted if an entity has only one architecture.

g1: entity WORK.Not1(ex1) port map (p,q);  
g1: entity WORK.Not1 port map (p,q);

Suppose that we wish to have more control over exactly which architecture to use. With direct instantiation, there is no difficulty.

With alternative style, however, there needs to be an explicit statement - the configuration specification

architecture alternate of TestAnd2 is

component A2 is

port (x, y : BIT; z : out BIT);

end component A2;

for all : A2 use entity WORK.And2(ex2);

signal a,b,c : BIT;

begin

g1: A2 port map (x=>a, y=>b, z=>c);

end architecture alternate;

For complex models, with several levels of hierarchy, it is often more appropriate to use a configuration unit.

configuration Tester1 of TestAnd2 is

```
for i0
    for g1 : And2
        use entity WORK.And2(ex1);
    end for;
end for;
end configuration Tester1;
```

A configuration declaration for the original testbench

The complete model consists of:

- ① entity and architecture of the And2 gate
- ② entity and architecture of the testbench
- ③ configuration

This style requires one configuration for the entire design

Another way to write configuration: (bound - socket - chip analogy) ↗ configuration is used to map between arbitrary internal and external names

architecture remapped of TestAnd2 is

```
component MyAnd2 is
    generic (dly : DELAY-LENGTH );
    port (in1, in2 : BIT; out1 : BIT);
end component MyAnd2;
signal a, b, c : BIT;
begin
    g1 : MyAnd2 generic map (dly) port map (in1, in2, out1);
end architecture remapped;
```

configuration Tester2 of TestAnd2 is

```
for remapped
    for g1 : MyAnd2
        use entity WORK.And2(ex2)
        generic map (delay => dly);
        port map (x => in1, y => in2, z => out1);
    end for;
end for;
end configuration Tester2;
```

A different style of configuration has one configuration for each per entity

configuration And2con of And2 is

```
for ex1
    end for;
end configuration And2con;
```

↑ This selects the architecture for an entity

This approach requires a great number of configuration units, but each unit is simpler.

For TestBench  $\Rightarrow$  configuration Tester3 of TestAnd2 is

```
for remapped
    for g1 : MyAnd2
        use configuration And2con;
    end for;
end for;
end configuration Tester3;
```

## Combinational building blocks

# Three state buffers. 高阻抗状态相当于隔断状态，对下一级无影响  
[Multi-valued logic]

Use switches to disconnect a signal from a wire, as being in a 'high-impedance' state.

Type BIT is no longer adequate to represent logical signal value { '0', '1' }

signal a, b, c: tri; -- tri is a new type

We need an AND operator described by the truth table

| AND | 0 | 1 | Z |
|-----|---|---|---|
| 0   | 0 | 0 | 0 |
| 1   | 0 | 1 | 1 |
| Z   | 0 | 1 | 1 |

In VHDL, functions and operators can be overloaded

Rewrite a new AND operator to take two operands of type tri

function "and" (Left, Right: tri) return tri is

type tri\_array is array (tri, tri) of tri;  
constant and\_table : tri\_array := ((0', 0', 0'),  
(0', 1', 1'),  
(0', 1', 1'));

Fail if input BIT and TRI  
as the operator has not been defined.

begin  
return and\_table (Left, Right);  
end function "and";

## [Standard logic package]

|     |                 |
|-----|-----------------|
| 'U' | Uninitialized   |
| 'X' | Forcing unknown |
| '0' | Forcing 0       |
| '1' | Forcing 1       |
| 'Z' | High impedance  |
| 'W' | Weak unknown    |
| 'L' | Weak '0'        |
| 'H' | Weak '1'        |
| '-' | Don't care      |

The standard logic type is defined by:

type std\_std\_logic is ('U', 'X', '0', '1', 'Z', 'W', 'L', 'H', '-');

If we write a model using signals of type BIT and/or std-logic, we must ensure  
that two models do not attempt to put a value onto the same signal.

std-logic allows more than one output to be connected to the same signal

subtype std-logic is resolved std-to std-logic;

|   |   |   |   |   |   |   |   |   |
|---|---|---|---|---|---|---|---|---|
| U | X | 0 | 1 | Z | W | L | H | - |
| U | U | V | V | V | V | V | V | V |
| X | U | X | X | X | X | X | X | X |
| 0 | V | X | O | O | O | O | X |   |
| 1 | V | X | X | I | I | I | I | X |
| Z | V | X | O | I | Z | W | W | H |
| W | V | X | O | I | W | W | W | W |
| L | V | X | O | I | L | W | W | W |
| H | V | X | O | I | H | W | W | H |
| - | U | X | X | X | X | X | X | X |

library IEEE;  
use IEEE.std\_logic\_1164.all;

## [When...else statement]

```
library IEEE;
use IEEE.std_logic_1164.all;

entity three_state is
    port (a, enable : in std_logic;
          z : out std_logic);
end entity three_state;
```

architecture when-else of three\_state is  
begin

$z \leftarrow a$  when enable = '1' else 'Z';

end architecture when-else;

$\rightarrow z \leftarrow a$  after 4 ns when enable = '1' else 'Z';

## # Decoders

### [2 to 4 decoder]

Vectors can be described using an array of Boolean signals:

**type** std-logic-vector **is array** (NATURAL range <>) **of** std-logic;  
 ↓                                                           ↓  
 Predefined subtype      Undefined  
 with integer values      range  
 from 0      to max

entity decoder is

port (a: in std\_logic\_vector (1 downto 0);

      z: out std\_logic\_vector (3 downto 0));

entity end entity decoder;

architecture when-else of decoder is

begin

$z \leftarrow "0001"$  when  $a = "00"$  else

$"0010"$  when  $a = "01"$  else

$"0100"$  when  $a = "10"$  else

$"1000"$  when  $a = "11"$  else

$"XXXX"$ ; -- Can be omitted

end architecture when-else;

### [With...select statement]

architecture with-select of decoder is

begin

with a select

$z \leftarrow "0001"$  when "00",

"0010" when "01",

"0100" when "10",

"1000" when "11",

"XXXX" when others;

end architecture with-select;

If one more than one pattern should give the same output,  
the pattern can be listed.

eg.

...     "0000000" when "1010" | "1011" | "1100",

...

## [n to $2^n$ decoder - shift operators]

```

library IEEE;
use IEEE.std_logic_1164.all;

entity decoder is
    generic (n: positive);          -- POSITIVE is a predefined subtype of INTEGER (1..MAX)
    port (a: in std_logic_vector (n-1 downto 0);
          z: out std_logic_vector (2**n-1 downto 0));   -- ** is the power operator
end entity decoder;

```

As the output value is always "00...01" rotated left by the number of places given by a, we can declare a vector of length  $2^n$  with all bits set to '0' other than bit '0' with a constant declaration:

```
constant z_out: std_logic_vector (2**n-1 downto 0) := (0 => '1', others => '0');
```

VHDL has six shift operators: sll, sla, rol, srl, sra, and ror.

sll:

sla:

rol:

srl:

sra:

ror:



G means select  
I is signal name



$\frac{D}{3}$  means 0-3



M: mode control (0 to 3)

Mode 1: shift right

Mode 2: shift left

1, 4D - in mode 1 D type behaviour when control signal 4 is asserted

## # State Machine



Two states : { { Major Road Red  
Minor Road Green }  
{ Major Road Green  
Minor Road Red }



## [ASM Symbol]



## # State

\* Output asserted for ONE CLOCK CYCLE

## [Synthesis from ASM]

| Present State            | CAR, TIMED |     |     |     |
|--------------------------|------------|-----|-----|-----|
|                          | 00         | 01  | 11  | 10  |
| G                        | G,0        | G,0 | R,1 | R,1 |
| R                        | R,0        | G,0 | G,0 | R,0 |
| Next state , START-TIMER |            |     |     |     |
| A                        | CAR, TIMED |     |     |     |
| 0                        | 0,0        | 0,0 | 1,1 | 1,1 |
| 1                        | 1,0        | 0,0 | 0,0 | 1,0 |

CAR, TIMED

|   |     |     |     |     |
|---|-----|-----|-----|-----|
| A | 00  | 01  | 11  | 10  |
| 0 | 0 0 | 0 0 | 1 1 | 1 1 |
| 1 | 1 0 | 0 0 | 0 0 | 1 0 |

$A^+ = \bar{A} \cdot CAR + A \cdot \overline{TIMED}$

CAR, TIMED

|   |     |     |     |     |
|---|-----|-----|-----|-----|
| A | 00  | 01  | 11  | 10  |
| 0 | 0 0 | 0 0 | 1 1 | 1 1 |
| 1 | 0 0 | 0 0 | 0 0 | 0 0 |

$START-TIMER = \bar{A} \cdot CAR$



## # SystemVerilog

initial → execute once and stop

## [Adder]

## Dataflow SystemVerilog

```

module full-adder (output logic sum, co,
                    input logic a, b, ci);
    always-comb
    begin
        sum = a ^ b ^ ci;
        co = (a & b) | (a & ci) | (b & ci);
    end
end-module

```

## Structural SystemVerilog

```

module full-adder (output logic sum, co,
                    input logic a, b, ci);
    logic i, j, k, l;
    xor g0 (i, a, b);
    xor g1 (sum, i, ci);
    and g2 (j, a, b);
    and g3 (k, a, ci);
    and g4 (l, b, ci);
    or g5 (co, j, k, l);
endmodule

```

ModelSim : force clk 0 0 , 110 -r 20  
set clk 0

Date

Arrays : logic [3:1] carry ; A vector value can be written as an integer, 4'b0101, <sup>✓ repeated value</sup>

[4 bit adder]

```
module four_bit_adder (output logic [3:0] sum, output logic co,
                      input logic [3:0] a, b, input logic ci);
```

logic [3:1] carry ;

```
full_adder fa0 (sum[0], carry[1], a[0], b[0], ci);
full_adder fa1 (sum[1], carry[2], a[1], b[1], carry[1]);
full_adder fa2 (sum[2], carry[3], a[2], b[2], carry[2]);
full_adder fa3 (sum[3], co, a[3], b[3], carry[3]);
```

endmodule

[Multiplexer]



G: Select

```
module mux2 (output logic y,
              input logic a, b, s);
    always_comb
        if (s)
            y = b;
        else
            y = a;
    endmodule
```

[4 input MUX]



$\frac{0}{3} : 0-3$

```
module mux (input logic a, b, c, d, s1, s0,
            output logic y);
    always_comb
        if (s0)
            if (s1)
                y = d;
            else
                y = c;
        else
            if (s1)
                y = b;
            else
                y = a;
    endmodule
```

[2-4 decoder]



```
module decoder (input logic [1:0] a,
                output logic [3:0] y);
    always_comb
        case (a)
            0: y = 1;
            1: y = 2;
            2: y = 3;
            3: y = 4;
            default: y = 'x;
        endcase
    endmodule
```

[Testbench] { tools → options  
assignment → setting → simulation }

module test\_decoder;

logic [1:0] a;

logic [3:0] y;

decoder do c.\*; // only works if internal signals have the same names

initial

begin

#10ns a=0;

#10ns a=1;

#10ns a=2;

#10ns a=3;

initial

begin

clock = 10;

forever #5ns Clock = ~Clock;

end

always

#5ns clock = 10;

#5ns clock = 1';

end

end

endmodule

SystemVerilog parameters

module decoder2

#(parameter N=3)

input logic [N-1:0] a,  
output logic [(1 << N)-1:0] y;

If  $S=2\text{b}0x$ , ? : evaluates both branches in case of  $x$

Thus ↴ return  $1\text{b}x$  [Three-state Buffer]

↓ from the tristate

mux

always\_comb  
y = 1b1 << a;

endmodule

module three-state\_mux  
(output wire y,  
input logic a,b,c,d,

assign y=(S==0)?a:1b2;

assign y=(S==1)?b:1b2;

assign y=(S==2)?c:1b2;

assign y=(S==3)?d:1b2;

endmodule



module three-state (input logic a, enable  
output logic y);

always\_comb  
y = enable ? a : 1b2;

end module

Logic vs Wire

logic is a variable, but no signal strength or signal resolution

If two signals drive the same physical hardware, use "wire" ← wire has 1b strengths (e.g. strong1, weak1, high1...)

The only occasion when it is permissible to connect Outputs of two or more gates together

Campus

**Unknowns**

- $==$  (or  $\neq$ ) treats x or z as don't care
- $\equiv$  (or  $\neq$ ) operator can return 0, 1, or x

### [Priority Encoder]

| Input                                                       | Output                              |
|-------------------------------------------------------------|-------------------------------------|
| A <sub>3</sub> A <sub>2</sub> A <sub>1</sub> A <sub>0</sub> | Y <sub>1</sub> Y <sub>0</sub> Valid |
| 0 0 0 0                                                     | 0 0 0                               |
| 0 0 0 1                                                     | 0 0 1                               |
| 0 0 1 -                                                     | 0 1 1                               |
| 0 1 - -                                                     | 1 0 1                               |
| 1 - - -                                                     | 1 1 1                               |

module encoder (output logic [1:0] y,

output logic valid,

input logic [3:0] a);

endmodule

unique checks for overlapping branches

always - comb

begin

valid = 1'b 1; // default value

→ unique casez (a) // treat z or ? as don't care

4'b1??: y=2'b11;

4'b0??: y=2'b10;

4'b001?: y=2'b01;

4'b0001: y=2'b00;

4'b0000: begin

y=2'b00;

valid = 1'b0;

end

This will  
check whether  
the case is  
unique

end case

end

### # Sequential Building Blocks

#### [D - latch]



module dlatch (output logic q,  
input logic d, en);  
always\_latch <-- indicate a latch  
if(en)  
q <= d; <-- use " $\leq$ " for  
sequential logic  
endmodule

#### [D - Flip-flop]



module dff (output logic q,

input logic d, clk);

always\_ff @ (posedge clk) <-- indicate a flip-flop  
q <= d; <-- activation list, sensitive to clocked

endmodule

No dependency on clock

#### [Flip-flop with asynchronous set/reset]



module dffr (output logic q,  
input logic d, clk, n-reset);  
always\_ff @ (posedge clk, negedge n-reset)  
if (!n-reset)  
q <= 1'b0;  
else  
q <= d;  
endmodule

#### [Flip-flop with clock enable]

module dffe (output logic q,  
input logic d, clk, enable);  
always\_ff @ (posedge clk)  
if(enable)  
q <= d;  
endmodule

#### [n-bit Register]

module dffn #(parameter N=8)  
(output logic [N-1:0] q,  
input logic [N-1:0] rd,  
input clk, n-reset, enable);  
always\_ff @ (posedge clk, negedge n-reset)  
if (!n-reset)  
q <= 10;  
else  
q <= d;  
endmodule

#### [Binary Counter]

module counter #(parameter = 8)  
(output logic [N-1:0] q,  
input logic clk, n-reset);  
always\_ff @ (posedge clk, negedge n-reset)  
if (!n-reset)  
q <= 10;  
else  
q <= q + 1;  
endmodule

#### [Universal Shift Register]

| Model Control | S <sub>1</sub> S <sub>0</sub> | Action           |
|---------------|-------------------------------|------------------|
|               | 0 0                           | Hold             |
|               | 0 1                           | Shift Right  lin |
|               | 1 0                           | Shift Left  lin  |
|               | 1 1                           | Parallel Load    |

modul usr #(parameter N=8)

(output logic [N-1:0] q,  
input logic lm, rin, clk, n-reset,  
input logic [N-1:0] s);  
always\_ff @ (posedge clk, negedge n-reset)  
if (!n-reset)  
q <= 10;  
else  
case (s)

2'b01: q <= a;  
2'b10: q <= {q[N-2:0], lm};  
2'b01: q <= {rin, q[N-1:1]};  
default:

endcase  
end module

and

3'b11D → load data

↓  
mode clock  
3

## [SystemVerilog Model of State Machine]

## \\$ Version 1

```

enum { G, R } present_state, next_state;
always @ (posedge clock, negedge n_reset)
begin : SEQ
    if (!n_reset)
        present_state <= G;
    else
        present_state <= next_state;
end

always - comb
begin : COM
    start_timer = '0;
    minor_green = '0;
    major_green = '0;
    next_state = present_state;
    case (present_state)
        G: begin
            major_green = '1;
            if (car)
                begin
                    start_timer = '1;
                    next_state = R;
                end
        end
        R: begin
            minor_green = '1;
            if (timed)
                next_state = G;
        end
    endcase
end

```

module traffic (output logic start\_timer, major\_green, minor\_green,  
input logic clock, n\_reset, timed, car);

## \\$ Version 2

```

enum { G, R } state;
always @ (posedge clock,
           negedge n_reset)
begin : SEQ
    if (!n_reset)
        state <= G;
    else
        case (state)
            G: if (car)
                state <= R;
            R: if (timed)
                state <= G;
        endcase
end

always - comb
begin : OP
    start_timer = '0;
    minor_green = '0;
    major_green = '0;
    case (state)
        G: begin
            major_green = '1;
            if (car)
                start_timer = '1;
        end
        R: begin
            minor_green = '1;
        end
    endcase
endmodule

```

## # Assert Statement

assert <condition> Message will be printed if condition is not true

## # Displaying signals

\\$display ("%2t ns a = %b, y = %b", \$time, a, y);

## # Datapath / Controller



## # Simple CPU



## [Control Signals.]

- ① ACC-bus: Drive bus with contents of ACC. (Enable three state output)
- ② load-ACC: Load ACC from bus
- ③ PC-bus: Drive bus with contents of PC
- ④ load-IR: Load IR from bus
- ⑤ load-MAR: Load MAR from bus
- ⑥ MDR-bus: Drive bus with contents of MDR
- ⑦ load-MDR: Load MDR from bus
- ⑧ ALU-ACC: Load ACC with result from ALU
- ⑨ INC-PC: Increment PC, and save the result in PC
- ⑩ Addr-bus: Drive bus with operand part of instruction held in IR
- ⑪ CS: Chip select. Use contents of MAR to set up memory address.
- ⑫ R-MW: Read, Not Write. When false, contents of MDR are stored
- ⑬ ALU-add: Performed an add operation in the ALU.
- ⑭ ALU-sub: Perform a subtract operation in the ALU.

BNE : Branch if not equal

→ LDA NumA  
CMP NumB  
BNE Differ

← Go to Differ if NumA ≠ NumB  
Otherwise, execution continues



```

module sequencer #(parameter OP_W=3) (input logic clock, n_reset, z_flag, input logic [OP_W-1:0] op,
                                         output logic ACC_bus, load_ACC, PC_bus, load_PC, load_IR, load_MAR, CS_R_N,
                                         MDR_bus, load_MDR, ALU_ACC, ALU_add, ALU_sub, INC_PC, Addr_bus);

`include "opcodes.v"
enum {S0,S1,S2,S3,S4,S5,S6,S7,S8,S9,S10} state;

#define LOAD 3'b000
#define STORE 3'b001
#define ADD 3'b010
#define SUB 3'b011
#define BNE 3'b100

always_comb
begin : comb
    ACC_bus = '0;
    load_ACC = '0;
    PC_bus = '0;
    load_PC = '0;
    load_IR = '0;
    load_MAR = '0;
    ALU_ACC = '0;
    ALU_add = '0;
    ALU_sub = '0;
    INC_PC = '0;
    Addr_bus = '0;
    CS = '0;
    R_NW = '0;

    case (state)
        S0: begin
            PC_bus = '1;
            load_MAR = '1;
            INC_PC = '1;
            load_PC = '1;
        end
        S1: begin
            CS = '1;
            R_NW = '1;
        end
        S2: begin
            MDR_bus = '1;
            load_IR = '1;
        end
        S3: begin
            Addr_bus = '1;
            load_MAR = '1;
        end
        S4: begin
            ACC_bus = '1;
            load_MDR = '1;
            CS = '1;
            R_NW = '1;
        end
        S5: begin
            MDR_bus = '1;
            load_PC = '1;
        end
        S6: begin
            endcase
        end
        S7: begin
            MDR_bus = '1;
            load_ACC = '1;
        end
        S8: begin
            MDR_bus = '1;
            ALU_ACC = '1;
            load_ACC = '1;
            if (op == 'ADD)
                ALU_add = '1;
            else if (op == 'SUB)
                ALU_sub = '1;
        end
        S9: begin
            MDR_bus = '1;
            load_PC = '1;
        end
        S10: ;
    endcase
end
endmodule

```

always\_ff @ (posedge clock, negedge n\_reset)
begin : seqz
 if (!n\_reset)
 state <= S0;
 else
 case (state)
 S0: State <= S1;
 S1: State <= S2;
 S2: State <= S3;
 S3: if (op == 'STORE)
 State <= S4;
 else
 State <= S6;
 S4: State <= S5;
 S5: State <= S0;
 S6: if (op == 'LOAD)
 State <= S7;
 else if (op == 'BNE)
 if (!z\_flag)
 State <= S9;
 else
 State <= S10;
 else
 State <= S8;
 S7: State <= S0;
 S8: State <= S0;
 S9: State <= S0;
 S10: State <= S0;
 endcase
 end
end



### # Synthesis Considerations

- ① Constraints (约束条件)
- ② Coding for synthesis
- ③ Hierarchical Designs
- ④ Technology Dependence
- ⑤ Design for Reliability

### # Area Constraints

Can define overall objective of synthesis to be area minimisation

`define_sharing full-adder on`

The compiler would attempt to share resources

### # State Assignment

`One_Hot` assignment - One flip-flop per state  
Good for fpga (lots of flip-flops)

### # State Encoding

`Simplify` uses one-hot encoding by default  
To force sequential encoding:  
`typedef enum {S0,S1,S2} State_type;`  
`(* syn_encoding = "Sequential") State_type state`

### # Timing Constraints

Clock Frequency is 20MHz → Clock period is 50ns  
Delay through ff is 1ns  
External Input and Output delays can be specified:  
`define_input_delay In1 10.0`  
→ The maximum delay thing, the input logic is 30ns  
 $50 - 10 - 1 = 39$

### # Design for Reliability

unique case (present\_state)
A: begin
 ready = '1;
 if (twenty)
 next\_state = D;
 ...

I: begin
 ret = '1;
 next\_state = A;
 end

default: next\_state = A;
enclose

When state goes to an unused state because of noise, could return to a known state or the system could switch between the unused states (known as "parasitic" state mea-

模块接口 已有的模块 模块接口 模块接口  
e.g. counter C.C.osc-CLK (clock), .clock (slow-clock);

# Testing Digital Systems

## # The need for testing

- Real time systems may have manufacturing defects
  - { ① Short circuits      ② Damaged components
  - { ③ Missing components
- Need to know if a system (board, IC, whole system) has a defect (and there doesn't work). Don't want to sell bad systems
- ↳ Need for testing is economic
  - Don't want to reject good systems

## # Two approach

Functional testing - does system work correctly? ← can be simply a long and difficult task

Structural testing - does system contain a fault?

## # Fault Models

- What defects occur?
- How can they be modelled?
- Why do models imply?

## # PCB Defects

- Break in connections
  - bad etching (蚀刻)
  - stress
- Short Circuits
  - Solder flow (焊锡流)
- Bad solder joints (焊点)

## # IC Defects

- Open circuit
- Electromigration (電迁移)
- Current overstress
- corrosion
- Short Circuit
- "Latch-up" - transient currents
- short Circuit

- Incorrect Transistor Action
- Silicon or oxide defects
- Mask misalignment
- Impurities
- Gamma radiation
- Data Corruption { Alpha particles  
EMI }

## # How do defects manifest themselves?

- Static failure (50%)
  - shorts, breaks etc
- Dynamic failure (49%)
  - out of spec components
  - timing failures
- Intermittent failures (1%)
  - environmental

## # Fault Model

- Physical defect manifests itself as a logical fault
  - Applies only to digital circuits
  - No analogue fault modeling
- Stuck Fault Model
  - Many physical defects can be modelled as a circuit node being:
    - Stuck at 1 (S-a-1)
    - Stuck at 0 (S-a-0)

## # Single-Stuck Fault Model

- Assumption
- The fault directly affects only one node
- The node is stuck at either 0 or 1

## # Validity of Single-Stuck Fault Model?

- Multiple faults do occur
- Can multiple faults mask each other?
- Number of faults might rise with complexity
- Can all defects be modelled with stuck model method?
- The model appears to be valid most of the time
- Almost all test pattern generation relies on this model
- Multiple faults are found with test patterns for single faults

## # Fault-oriented Test Pattern Generation

• Prepare a fault list (e.g. all nodes stuck-at 0 & stuck-at 1)

repeat

write a test  
check fault cover (one test may cover > 1 fault)  
(delete covered faults from list)  
until fault cover target is reached

## # Test Generation

1. Test pattern generation (writing a test) may be random or optimised. Ideally we want a minimum number of tests
2. One test may cover more than one fault, often faults are indistinguishable
3. If we simply want a pass/fail test, we can remove faults from further consideration, once we have found a test for a fault.  
If we want to diagnose a fault (for subsequent repair) we probably want to find all tests for a fault to deduce where the fault occurs
4. Fault cover target may be less than 100%. The higher the cover, the greater the number of tests and hence the cost of applying the test

cheaper to apply

## # Testability

How testable is a system?

- a. Controllability - can we control all the nodes to establish if there is a fault?
- b. Observability - can we observe and distinguish between a faulty node and a correct node?

## # Sensitive Path Algorithm



To test for  $a/0$ , need to set  $a$  to 1 (fault-free condition)  
Need to observe existence or otherwise of fault at  $z$   
If  $b$  is 0,  $e$  is 1 irrespective of  $a$ ,  $\therefore b$  must be 1

Similarly if  $f$  is 1,  $z$  is 1, irrespective of  $e$ ,  $\therefore f$  must be 0  
Thus we are establishing a sensitive path from  $a$  to  $z$ .  
To force  $f$  to 0, either  $c$  or  $d$  or both must be 0

- 7 nodes,  $\therefore 14$  stuck faults:
  - $a/0, a/1, b/0, b/1, \dots, z/0, z/1$
  - "a stuck-at-0" etc

This can be expressed as  
 $1101/0$

If the fault  $a/0$  exists,  $e$  is 1,  $z$  is 1. If the fault does not exist,  $e$  is 0,  $z$  is 0.  
We can conclude from this that a test for  $a/0$  is  $a=1, b=1, c=0, d=1$  for which the fault-free output is  $z=0$ .  
Other tests are  $1110/0$  and  $1100/0$ . Therefore, there is more than one test for the fault  $a/0$ .

To test for  $e/1$ , requires that  $f=0$  to make  $e$  visible at  $z$ .  $\therefore c$  or  $d$  or both must be 0  
To make  $e=0$  requires that  $a=b=1$ . So a test for  $e/1$  is  $1101/0$ .

Unstable FAULT ARE ALWAYS DUE TO REDUNDANCY  
Untestable

## # Algorithm

- Select a fault
- Setup input to force the node to a fixed value
- Set up input to transmit node value to an output
- Check for consistency
- Check for coverage of other faults.

| a | b | c | d | e | f | z |                                |
|---|---|---|---|---|---|---|--------------------------------|
| 1 | 1 | 0 | 1 | 0 | 0 | 0 | $a/0, z/1, e/1, f/1, b/0, c/1$ |
| 0 | 1 | 0 | 1 | 1 | 0 | 1 | $a/1, z/0, e/0$                |
| 1 | 0 | 0 | 1 | 1 | 0 | 1 | $b/1, z/0, e/0$                |
| 1 | 1 | 1 | 1 | 0 | 1 | 1 | $c/0, z/0, f/0, d/0$           |
| 1 | 1 | 1 | 0 | 0 | 0 | 0 | $d/1$                          |

## # Testing Sequential Circuits

Testing combinational circuits is relatively easy, provided, there is no redundancy in the circuit.

Number of test vectors  $\ll 2^{n \text{ of inputs}}$

Testing sequential circuits is difficult because circuits have states. May require long sequences of inputs to reach states. Some faults may be unstable, because certain states cannot be reached.

N.B. All sequential circuits MUST have a set or reset to initialise all flip-flops or testing is nearly impossible

## # Sequential ATPG

Combinational ATPG uses sensitive path algorithm. i.e. sensitise a path from a potential fault site to a PO (or in the case of SISO, a pseudo PO - a scan register input).

Can model a sequential circuit.



As an iterative array, each element represents a clock cycle.



For each ~~state~~ SSF in C

$r=1; q=0;$

repeat

build model with  $r+q$  time frames;

ignore POs in 1st  $r+q-1$  time frames;

generate test (D algorithm) for all fault as if all time frames represent a combinational circuit. PIs can be set up in all time frames; only POs in last time frame observed.

if Success return Success!;

else increase increment  $r$  or  $q$ ;

until  $r+q = f_{max}$ ;

return failure!;

## Design For Test

特别的

### # Ad Hoc Design for Test Method

- ① Controllability - ability to control logic value of an internal node from PI
- ② Observability - ability to observe logic value of internal node at PO

### # Testability can be enhanced by:

{ ad hoc design guidelines  
structured design methodology

Things to watch:

- ① Redundant Logic - undetectable faults
- ② Asynchronous sequential systems
  - difficult to synchronize with tester
  - if necessary, confine to independent blocks
- ③ Nonestables
  - difficult to control

### # Improve Test Access

Use test points to enhance controllability and observability



Original Circuit



Circuit modified to provide CP and OP

### # Initialisation

Initialisation must be provided for all sequential circuit elements



### # SISO

SDI: Serial Data In  
SDO: Serial Data Out  
Primary Output



- Modify all flip-flops to include MUX
- Make every state controllable and observable
- Sequential test problem reduce to combinational test

#### ② For each combinational test

- Set M=1 to set the state of the flip-flops after n clock cycles by shifting a pattern in through SDI
- Set M=0. Set up the primary inputs. Collect the values of the primary outputs. Apply 1 clock cycle to load the state outputs into the flip-flops
- Set M=1 to shift the flip-flop contents to SDO after n-1 clock cycles.

### # SISO Test Generation

- Generate tests for the combinational part using the sensitive path algorithm (or a variant) as before.
- For each test, need to distinguish between inputs that are applied through the scan path, and those that are applied through the primary inputs.
- Some with outputs

### # Running a test

① Set M=1 and test the flip-flops as a shift register. If a sequence of 1s and 0s is fed to SDI, we would expect the same sequence to emerge from SDO delayed by the number of clock cycles equal to the length of the shift register (n). A useful test sequence would be 00110... which tests all transitions and whether the flip-flops are stable

### # Costs of enhanced testability

- Extra I/O pins
- Includes interface etc
- Extra components (MUXs), extra wiring
- Degradation of performance because of extra gates in signal paths

More things to go wrong  
But the circuit will be easier to test  
Inserting a scan path is easy  
Ad hoc methods are difficult to automate

Scan path can be inserted before or after layout. Ordering is therefore arbitrary.

Test vectors need to be sorted - a job for a computer

### # Reducing the Cost

- Share SDI with PI (MUX)
- Share SDO with PO
- Do all flip-flops need to be in the scan path

• Build MUXes into next state logic

$$S^t = \bar{S} \cdot T + S \cdot \bar{T} \cdot A$$

$$T^t = \bar{S} \cdot T \cdot A + S \cdot T \cdot \bar{A}$$

Becomes

$$S^t = (\bar{S} \cdot T + S \cdot \bar{T} \cdot A) \cdot \bar{M} + T \cdot M$$

$$T^t = (\bar{S} \cdot T + S \cdot T \cdot \bar{A}) \cdot \bar{M} + SDI \cdot M$$

• Reduces delay

### # Partial Scan



Don't put every flip-flop in the scan path

C1/R1/R2/C2 form a balanced structure  
Similarly C2/R4/R1 are balanced

We can make R3 and R4 part of the scan path (or R1, R2, R3)

## Boundary Scan (JTAG)

### # Boundary Scan

- IEEE 1149.1 Also known as JTAG
- Not possible to test mounted ICs (the pins may be connected together);
- PCBs now often have more than 20 layers of metal, so deep layers cannot be reached;
- Density of components on a PCB is increasing. Multi-chip modules (MCMs) take the chip/board concept further and have unpackaged integrated circuits mounted directly on a silicon substrate.

### # Backdriving problem



### # off chip faults



### # Principle of JTAG



### # Test Architecture



### # TAP Controller



### # BS Cell



## # Modes of Operation

### 1. Normal mode

- Normal system data flows from IN to OUT.

### 2. Scan mode

- Shift DR selects the SCAN-IN input, clockDR clocks the scan path
- Shift DR is derived from the similarly named state in the TAP controller
- ClockDR is asserted when the TAP controller is in state Capture-DR or Shift-DR
- (Hence, the Boundary Scan architecture is not truly synchronous)

### 3. Capture mode (捕捉)

- Shift DR selects the IN input, data is clocked into the scan path register with ClockDR to take a snapshot of the system

### 4. Update mode

- After a capture or scan, data from the left flip-flop is sent OUT by applying 1 clock edge to UpdateDR
- Again, the clock signal comes from the TAP controller when it is in state Update-DR. The TAP controller then enters the Run Test state and MODE-CONTROL is set as appropriate according to the instruction held in the instruction register.

## # Instructions

### • EXTEST (Mandatory) 外测试

- This instruction performs a test of the system, external to the core logic of particular devices.
- Data is sent from the output boundary scan cells of one device, through the pads and pins of that device, along the interconnect wiring, through the pins and pads of a second device and into the input boundary scan cells of that second device
- Hence a complete test of the interconnect from one IC core to another is performed.

### • SAMPLE/PRELOAD (Mandatory) 顶加载

- This instruction is executed before and after the EXTEST and INTEST instructions to set up pin outputs and to capture pin inputs.

### • BYPASS (Mandatory)

- This instruction selects the Bypass register, to shorten the scan path.

### • RUNBIST (Optional) 自测试

- Runs a built-in self-test on a component

### • IDCODE, USERCODE (Optional)

- These instructions return the identification of the device (and the user identification for a programmable logic device). The code is put into the scan path

### • INTEST 内部测试 (Optional)

- This instruction uses the boundary scan register to test the internal circuitry of an IC

- Although such a test would normally be performed before a component is mounted on a PCB, it might be desirable to check that the process of soldering the component onto the board has not damaged it

- Note that the internal logic is disconnected from the pins, so if pins have been connected together on the board, that will have no effect on the standard test.

### • CONFIGURE (Optional)

- An SRAM-based FPGA needs to be configured each time power is applied.

- The configuration of the FPGA is held in registers. These registers can be linked to the TAP interface.

- This clearly saves pins as the configuration and test interfaces are shared.

## # Tristates and Bidirectionals.



## Built-in self-test

### # Why Built-in Test

- Economic justification
  - Simplifies test equipment
  - Simplifies TPG
  - Allow easy field test
- Increases user confidence

### # Principles of BIST



### # How to generate test vectors

- Store pre-generated vectors in ROM
  - Possibly a very large number
  - Problem with sequential logic
- Exhaustive test
  - Use binary counter to generate all test vectors
  - Separate combinational logic from registers

Example:  $n=4$



| Q1 | Q0 |
|----|----|
| 1  | 1  |
| 2  | 0  |
| 3  | 0  |
| 4  | 0  |
| 5  | 0  |
| 6  | 1  |
| 7  | 0  |
| 8  | 1  |
| 9  | 1  |
| 10 | 0  |
| 11 | 0  |
| 12 | 0  |
| 13 | 0  |
| 14 | 1  |
| 15 | 1  |
| 16 | 1  |

### # Linear Feedback Shift Register (LFSR)

$\Rightarrow 2^n - 1$  patterns, if feedback chosen correctly

Easy to build, 1-3 XOR gates

Pseudo-random sequence - can have implications for power dissipation. If  $n$  is large ( $> 32$ ), time taken for sequence may be unacceptable. All 0s state excluded.

### # LFSR - Feedback connections

- |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |       |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|
| <ul style="list-style-type: none"> <li><math>n=2</math>      <math>D_1 = Q_1 \text{ XOR } Q_0</math></li> <li><math>n=4</math>      <math>D_3 = Q_1 \text{ XOR } Q_0</math></li> <li><math>n=8</math>      <math>D_7 = Q_3 \text{ XOR } Q_0</math></li> <li><math>n=16</math>      <math>D_{15} = Q_5 \text{ XOR } Q_4 \text{ XOR } Q_3 \text{ XOR } Q_0</math></li> <li><math>n=24</math>      <math>D_{23} = Q_7 \text{ XOR } Q_6 \text{ XOR } Q_5 \text{ XOR } Q_0</math></li> </ul> | $Q_0$ |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-------|

Not more than 24 feedback connections are needed to generate  $2^n - 1$  patterns

For  $n=24$ , > 250,000 possible feedback connections  
Circuit can be modified to generate all 0s state ( $2^n$  patterns)

### # Check Responses

- Look-up table of correct responses
  - Potentially very large
- Signature Analysis
- Data compression

### # BIST Example

This circuit consists of:

- a 3 stage LFSR
  - a 3 stage MISR
  - a circuit under test
- $$\begin{cases} X = A \text{ XOR } B \text{ XOR } C \\ Y = A \cdot B + A \cdot C + B \cdot C \\ Z = A \cdot B + A \cdot C + B \cdot C \end{cases}$$



LFSR:  $\begin{cases} A^+ = B \text{ XOR } C \\ B^+ = A \\ C^+ = B \end{cases}$

MISR:  $\begin{cases} P^+ = X \text{ XOR } (Q \text{ XOR } R) \\ Q^+ = Y \text{ XOR } P \\ R^+ = Z \text{ XOR } Q \end{cases}$

### # Multiple Input Signature Register

Compact response on chip



Probability of a fault in the CUT giving the fault-free signature  $\rightarrow 2^{-n}$  (aliasing)  
Eg to build

### # Fault Free Simulation

Both the LFSR and MISR are initialised to the 000 state. By simulation, the sequence of states is found to be:

LFSR   CUT   MISR

| a b c | x y z | CUT        | P Q R | z | PQR   |
|-------|-------|------------|-------|---|-------|
| 0 1 1 | 0 1 1 | Under test | 1 0 0 | 2 | 1 0 0 |
| 0 0 1 | 1 0 0 |            | 0 0 1 | 3 | 0 0 1 |
| 1 0 0 | 1 0 0 |            | 0 0 1 | 4 | 0 0 1 |
| 0 1 0 | 0 1 1 |            | 0 0 1 | 5 | 0 1 0 |
| 1 0 1 | 0 1 0 |            | 0 0 1 | 6 | 1 0 1 |
| 1 1 0 | 0 1 0 |            | 1 0 1 | 7 | 1 0 1 |
| 1 1 1 | 1 1 1 |            | 1 1 1 | 8 | 1 1 1 |

### # Simulation of a fault

Suppose a is 5-a-0

abc xyz

|   |   |   |   |   |   |   |   |
|---|---|---|---|---|---|---|---|
| 1 | 1 | 1 | 0 | 1 | 1 | 1 | 1 |
| 0 | 1 | 1 | 0 | 1 | 0 | 0 | 2 |
| 0 | 0 | 1 | 1 | 0 | 1 | 1 | 3 |
| 1 | 0 | 0 | 1 | 0 | 0 | 0 | 4 |
| 0 | 1 | 0 | 1 | 0 | 1 | 0 | 5 |
| 1 | 0 | 1 | 0 | 1 | 0 | 0 | 6 |
| 1 | 1 | 0 | 1 | 1 | 0 | 1 | 7 |
| 1 | 1 | 1 | 0 | 1 | 1 | 1 | 8 |

[Note] If a is 5-a-1, the signature 000 is also generated. This is an example of aliasing

signature  
of  
fault-free circuit

Signature of  
faulty circuit

#BIBO



Combines LFSR and MISR into one

Typically 5 modes:

- ① normal
- ② scan
- ③ synchronous reset
- ④ MISR
- ⑤ LFSR

# BIBO Example



- To test C1, R1 is LFSR, R2 is MISR
- To test C2, R1 is LFSR, R2 is MISR
- Scan signatures in R1, R2 to some comparator
- NB 2<sup>n</sup> test sessions, each of  $2^{n-1}$  cycles + scan + decision

# STUMPS



A 32-bit LFSR would take about 2s to go through the entire sequence at 2 GHz  
STUMPS avoids this problem using a single LFSR and MISR to generate patterns for all scan paths.

Technique adopted by Logic Vision  
Disadvantage - same input to each scan path, delayed by a cycle each time

## System Verilog Simulation

### # SystemVerilog Simulation

语义学

- Semantics (meaning) of SystemVerilog is defined by the simulation model
- Want to write SystemVerilog that synthesises correctly and that behaves in the same way before and after synthesis
- Difficult simulators may produce different behaviour with poorly written model

### # Simulation Time

- The simulator models time - this is called simulation time.
- Simulation time is an integral multiple of the resolution limit
- The simulator cannot measure time delays less than the resolution limit.
- For gate-level simulations the resolution limit may be quite fine, possibly down to 1ps.
- For RTL (Register Transfer Level) simulators, there is no need to specify a fine resolution since we are only interested in clock-cycle by clock-cycle behaviour and these transfer functions are described with zero delay or unit delay
- In this case, a resolution limit of 1ns is usually used.

### # Verilog Simulation

- A new event may be one of five types. The LRM talks of a stratified event queue, with five regions as follows:

  - ① Active events. These occur at the present time and can be processed in any order.
  - ② Inactive events. These occur at the present time and are processed after all the active events have been processed.
  - ③ Nonblocking assign update events
  - ④ Monitor events
  - ⑤ Father events

### # Event Regions

- Prepared - sample stable values for later checking
- Active - blocking assignments
- Inactive - zero delay assignments here
- MBA - nonblocking assignments
- Observe - Evaluate assertions
- Reactive - Execute programs in testbenches
- Postponed - \$store and \$monitor print routines

### # Event Driven Simulation

- The basis of SystemVerilog simulation is event processing. SystemVerilog simulation simulators are event driven simulators.
- There are 3 essential concepts to event-driven simulation:
  - simulation time
  - simulation regions
  - simulation event processing
- The VHDL model is different

### # Simulation Cycle

- Process Execution
  - each process is recalculated only if its sensitivity list has an event
  - this creates a new value for each target signal and a time at which the signal gets the value. This value/time pair is a new event and is scheduled for the current or a future time.

### # Event Queue



### # SystemVerilog Event Regions



## # Simulation Cycle

```

while (there are events) {
    if (no active events)
        if (there are events in the other lists (in order))
            activate all events in the next list;
    else
        advance T to the next event time;
    E = any active event;
    if (E is an update event) {
        update the modified event-object;
        add evaluation events for sensitive processes to event
    }
    else {
        evaluate the process;
        add update events to the event queue;
    }
}

```

Reactive events are generated by assignments in program blocks.

A program block is like a module, but

- Can only be instantiated (instantiated) in a testbench
- Can only contain initial procedural blocks

Introduced in SystemVerilog to remove testbench problems

Every Events may be processed from the active event list in any order (or to think of it in another way, events can be added to the events list in any order)

This is the fundamental cause of non-determinism

In Verilog

We can be sure of only two things:

- Statements between begin and end will be executed in the order stated
- Nonblocking assignments will be performed after active events.

Everything else is indeterminate.

Moreover, the execution of a procedural block can be interrupted to allow another procedural block to be executed.

Therefore the skill in writing Verilog code is therefore to ensure that this indeterminism does not matter.

If the code is badly written, a race condition is likely to result - that is a situation where the procedure writing a value and the procedure reading that value are racing each other.

Depending which completes first, either the original or the updated value may be read.

## # Avoiding Races.

Don't assign <sup>to</sup> the same variable from two or more procedures. Not only is contention liable, but multiple assignment blocks. Part of the problem is the keyword "initial". Procedures defined with the initial keyword are executed once; they are not initializations. Use nonblocking assignments for modelling sequential logic.

The example, above, evaluates correctly if the two assignments are made nonblocking, irrespective of the order of evaluation. This is because nonblocking assignments are evaluated last and cannot influence each other. Use blocking assignments to evaluate combinational logic.

Assignments made in combinational logic models take immediate effect. The use of nonblocking assignments would often be confusing. Do not mix blocking and nonblocking assignments in the same procedure.

Don't use zero delays (#0). They are not necessary and will cause confusion.

(and wrong)

The list of active events is one of the lists, Inactive, Nonblocking, Reactive, that has been created during some previous simulation cycle, together with any (active) events that are generated during the current cycle.

Nonblocking assign update events are created by nonblocking assignments ( $\leftarrow$ )

The evaluation of the right hand side of all nonblocking assignments is always done before any nonblocking assign updates are done.

This is important as it allows sequential systems to be modelled correctly.

Inactive events are those events that are due to occur at the current time but that have been explicitly delayed,

In practice, this can be done with a zero delay (#0).

A zero delay doesn't represent real hardware (nor a useful testbench construct). Therefore you are simply trying to fool the simulator.

Unless you know exactly what you are doing, it will probably

### # Race 1

```

assign p = q;
initial
begin
    q = 1;
    # q = 0;
    $display(p);
end

```



Because the execution of the initial procedure can be interleaved (交叉执行) with the assign statement, the value of p that is displayed could be 1 or 0. Either is "correct" and different simulators may give different results.

### # Race 2

```

always @ (posedge clock)
    b = c;
always @ (posedge clock)
    a = b;

```

This is supposed to model two flip-flops connected in series. If the procedures are evaluated in the order written, at a rising edge the value of c will be copied to b.

That same value is copied to a. This is clearly not the intended behaviour.

If the procedures are evaluated in the opposite order, the correct behaviour is modelled.

can cause ambiguous behaviour

.....

.....

.....

.....

## # Different ways to model delays.

- Left-hand side (LHS) of blocking assignment:  
 $\#5 a = b;$  Wait then do assignment. Used for generating waveforms in a testbench
  - Right-hand side (RHS) of blocking assignment;  
 $a = \#5 b;$  Pure delay, e.g. a transmission line
  - LHS of nonblocking assignments:  
 $\#5 a <= b;$  No real difference to first form
  - RHS of nonblocking assignments:  
 $a <= \#5 b;$  Pure delay. Useful to model delays in flip-flops
  - LHS of continuous assignments:  
 $assign \#5 a = b;$  Inertial delay in combinational logic  
Inertial delay means that any transition is delayed by 5 units AND any pulse less than 5 units made is suppressed.
- Can get weird behavior if you use the wrong form  
Can get differences between simulators  
Can specify zero delays - but don't

## # VHDL Simulation

- VHDL uses a different terminology to Verilog
  - transaction generation
  - event processing
- Additionally it has a different mechanism for 0 delays:
  - delta time

## # VHDL Simulation Cycle

Simulation alternates between two modes:

### ① Statement Execution

- each statement is recalculated only if its sensitivity list has an event
- this creates a new value for each target signal and a time at which the signal gets the value. This value/time pair is called a transaction and is added to a transaction queue.
- target signals are not updated during statement execution.

### ② Event Processing

- transactions for the current time are taken off the queue.
- if a transaction represents a change in the value of the signal it becomes an event, causing the signal to be updated with new value
- Signals are updated only at the next wait statement.
  - A process with a sensitivity list has an implicit wait at the end of the process.
  - Don't mix sensitivity lists and wait statements - must have one or the other.

## # Zero-Delay Transaction

- It is also possible to have a zero-delay assignment  
 $a <= '0';$
- The signal is still not updated immediately
- But, a transaction still needs to be scheduled.
  - If the transaction were scheduled at the current time:
    - It could be put at the start of the transaction queue
    - It could be put at the end of the transaction queue (easier to program)
  - Therefore the exact behaviour would depend on the simulator (not good - but see Verilog!)

## # Delta Time

- Zero-delay transactions are a problem - how many do you handle a transaction for the current time?
- Answer: don't! Schedule them for one delta time later.
  - a delta time is infinitesimally small



Each event generation phase consumes one delta time-setup transactions

## # Transaction Queue



## SystemVerilog Assertions

### # Verification

- Simulation is the main method for verifying designs
- Inspecting waveform is difficult and error-prone (易見難い)
- The principle of Assertions Based verification is:
  - Describe what you expect to see, and when you expect to see it
  - The simulator will tell you if this doesn't happen

### # Writing data

- Verilog has simple output tasks
  - \$display - writes data
  - \$store - writes data after all events have been processed
  - \$monitor - continuously monitors variables
- Use formatting commands like C
- \$display ("Output = %b", %);

### # Condition



- Reference could be behavioural (simulation) model
- ABV combines reference and comparison.

### # EXAMPLE - Lift Controller

Toy lift - Three floors

- Sensors - top, bottom, middle\_plus, middle\_minus
- Call buttons
- Indicator lamps
- All signals active high
- Rising edge of clock is active (posedge in verilog)
- Direction - 1 is up, 0 is down
- Seven states

### # SystemVerilog Assertions

- Introduction - not a full tutorial
- SVAs can be used with VHDL / Verilog models and testbenches.

### # Invariant

• An invariant is a property that is always true.

always @\*  
assert (!top || !bottom) top & bottom

else \$error ("At top and bottom!");

- @\* is a default sensitivity list
- assertion is evaluated whenever top or bottom changes
- Assertion that one or both of top and bottom is true
  - i.e. that both cannot be 1 at the same time
- Is this really useful?
- Better to synchronise with clock!

### # Property

property NotIn2Places;  
 (!top || !bottom);

endproperty

Assert property at clock edge

↓

assert property (@(posedge clock) NotIn2Places);

### # Temporal Properties

- So far, no better than Verilog
- Want to write properties that hold over time:
  - property (@posedge clk) a  $\rightarrow$  b;
  - b is true in the cycle after a is true
- Or we can write sequences:
  - a #1 b #1 c  $\rightarrow$  d
  - #1 means one clock cycle
  - a #1:4 b one to four clock cycles

If the lift is at the bottom and the call1 button is pressed, we expect that the lift will go up in the next clock cycle

property Call1;  
 (bottom && call1  $\rightarrow$  enable && direction);

endproperty

~~•  $\rightarrow$  is non-overlapping implication~~

## # Complex Properties

- So, it's possible to write complex properties like
  - "If  $a$  is true for 3 cycles,  $b$  will be true for at least 2 cycles, not more than 2 cycles later"
- In other words, this is the reference model against which we check the behaviour of the system.
- As noted, this reference model can be used in simulations and by model checkers.

## # Clocking

- All assertions are tested on rising edge of the clock,
- We can make this a default condition

default clocking clock\_block

```
@(posedge clock);
endclocking
```

- Hence  
assert property (call1);

*Non-overlapping  
Nonblocking occur next CLK*

*Overlapping  
Blocking occur within CLK*

## # Overlapping implication

If the ground floor indicator is lit, the lift should be going down

```
property Indicator;
  (indicator 0 !-> !direction);
endproperty
```

✗ •  $\rightarrow$  Overlapping implication - same at clock cycle.

## # Sequences

If the top indicator is 0 and then goes to 1, the lift should be at the top in the next dark cycle.

```
property GetTo2;
  (indicator2 ##!1 ! indicator 2 !-> top);
endproperty
```

- $##$  refers to the clocking method
- Could use an overlapping implication here

## # Liveness

```
property Eventually2;
  (bottom && call2 !-> ##[1:$] top);
endproperty
```

- $[1:$]$  means a range of 1 to infinite number of clock cycles.
- If the lift is at the bottom and the top call button is pressed, the lift goes to the top floor eventually

## # Liveness &amp; Safety

- "Liveness" means that good things happen eventually.
- "Safety" means that bad things never happen
- Liveness properties are not a good idea
  - What does eventually mean?
  - Properties can pass vacuously, so how can we know it's failed?

## # How do we know the property is correct?

- We don't
- The properties have to be written and debugged at the same time as the HDL model
- But not by the same engineer.

## # States

```
property StateChange;
  (state == floor 0) && call 1-> (state == upTo1);
endproperty
```

- Mapping states is a little convoluted, but possible

复杂的

## # System Verilog Assertions Ext

Assertions are primarily used to validate the behaviour of a design ("Is it working correctly?"). They may also be used to provide functional coverage information for a design ("How good is the design?"). Assertion can be checked dynamically by simulation or statically by a separate property check tool - i.e. a formal verification tool that provides proves whether or not a design meets its specification. Such tools may require certain assumption about design's behaviour to be specified.

In systemVerilog there are two kinds of assertions: **immediate (assert)** and **concurrent (assert property)**. Coverage statements (**cover property**) are concurrent and have the same syntax (`assert`) as concurrent assertions, as do **assume property** statements.

Another similar statement - **expect** - is used in testbenches; it is a procedural statement that checks that some specified activity occurs. The three types of concurrent assertion statement and the expect statement make use of sequences and properties that describe the design's temporal behaviour - i.e. behaviour over time, as defined by one or more clocks.

### # Immediate Assertions

Immediate assertions are procedural statements <sup>and</sup> are mainly used in simulation. An assertion is basically a statement that something must be true, similar to **if** statement. The difference is that an **if** statement does not assert that an expression is true, it simply checks that it is true, e.g.:

```
if (A==B) // Simply checks if A equals B
assert (A==B) // Asserts that A equals B, if not, an error is generated.
```

If the condition expression of the immediate assert evaluates to X, Z or 0, then the assertion fails and the simulator writes an error message.

An immediate assertion may include a pass statement and/or a fail statement. In our example the pass statement is omitted, so **[no action is taken]** when **assert** expression is true. If the pass statement exists:

```
assert (A==B) $display ("OK. A equals B");
```

it is executed immediately after the evaluation of the **assert** expression. The statement associated with an **else** is called a fail statement and is **executed if the assertion fails**:

```
assert (A==B) $display ("OK. A equals B");
else $error ("It's gone wrong");
```

Note that you can **omit the pass statement** and still have a fail statement

```
assert (A==B) else $error ("It's gone wrong");
```

The failure of an assertion has a severity associated with it. There are three severity system tasks that can be included in the fail statement to specify a severity level: **\$fatal**, **\$error** (the default severity) and **\$warning**. In addition, the system task **\$info** indicates that the assertion failure carries no specific severity. Here are some examples:

```
ReadCheck: assert (data == correct_data)
            else $error ("memory read error");
```

```
Igt10: assert (I>10)
        else $warning ("I is less than or equal to 10");
```

The pass and fail statements can be any legal SystemVerilog procedural statement. They can be used, for example, to write out a message, set an error flag, increment a count of errors, or signal a failure to another part of the testbench.

```
AeqB: assert (a == b)
        else begin error_count++; $error ("A should equal B"); end
```

### # Concurrent Assertions

The behaviour of a design may be specified using statements similar to these: "The Read and Write signals should never be asserted together." "A Request should be followed by an Acknowledgement no more than two clocks after the Request is asserted."

Concurrent assertions are used to check behaviour such as this. There are statements that assert that specified properties must be true. E.g. **assert property (! (Read && Write))**; assert property that the expression **Read && Write** is never true at any point during simulation.

Properties are built using sequences. For example,

**assert property (@(posedge clock) Req |> ##[1:2] Ack);** where **Req** is a simple sequence (just a boolean expression) and **##[1:2] Ack** is a more complex sequence expression, meaning that **Ack** is true on the next clock, or on the one following (or both). **|>** is the implication operator, so this assertion checks that whenever **Req** is asserted, **Ack** must be asserted on the next clock, or the following clock.

Concurrent assertions like these are checked throughout simulation. They usually appear outside any initial or always blocks in modules, interfaces and programs. Concurrent assertions may also be used as statements in initial or always blocks. A concurrent assertion in an initial block is only tested on the first clk

The first assertion example above does not contain a clock. Therefore it is checked at every point in the simulation. The second assertion is only checked when a rising clock edge has occurred; the values of **Req** and **Ack** are sampled on the rising edge of **clock**.

$$G \rightarrow b$$

|   |   |   |
|---|---|---|
| a | b | c |
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 0 | 0 | 0 |
| 1 | 1 | 0 |

## # Implication

The implication construct ( $\Rightarrow$ ) allows a user to monitor sequences based on satisfying some criteria, e.g. attach a precondition to a sequence and evaluate the sequence only if the condition is successful. The left-hand side operand of the implication is called the antecedent sequence expression, while the right-hand side is called the consequent sequence expression.

If this is no match of the antecedent sequence expression, implication succeeds vacuously by returning true. If there is a match, for each successful match of the antecedent sequence expression, the consequent sequence expression is separately evaluated, beginning at the end point of the match. There are two forms of implication: ① overlapping overlapped  $\Rightarrow$  ② non-overlapped  $\Rightarrow$  (overlap  $\Rightarrow$ )

For the overlapped implication, if there is a match for the antecedent sequence expression, then the first element of the consequent sequence expression is evaluated on the same clock tick,

$S1 \Rightarrow S2;$

In the example above, if the sequence  $S1$  matches, then Sequence  $S2$  must also match. If sequence  $S1$  does not match, the result is true.

For non-overlapped implication, this first element of the consequent sequence expression is evaluated on the next clock tick.

$S1 \Rightarrow S2;$

The expression above is basically equivalent to:

'define true 1  
 $S1 \#\# 1`true 1 \Rightarrow S2;$

where 'true' is a boolean expression, used for visual clarity, that always evaluates to true.

## # Properties and Sequences

In these examples, we have been using, the properties being asserted are specified in the assert property statements themselves. Properties may also be declared separately, for example:

property not\_read\_and\_write;

not (Read && Write);

endproperty assert property (not\_read\_and\_write);

Complex properties are often built using sequence. Sequences, too, may be declared separately:

assert sequence request sequence acknowledge

Req;

## [1:2] Ack;

property handshake;

@(posedge Clock) request |> acknowledge;

assert property (handshake);

endsequence

endsequence

endproperty

## # Assertion Clocking

Concurrent assertions (assert property and cover property statements) use a generalised model of a clock and are only evaluated when a clock tick occurs. (In fact the values of the variables in the property are sampled right at the end of the previous time step.) Everything in between clock ticks is ignored. This model of execution corresponds to the way a RTL description of a design is interpreted after synthesis.

A clock tick is an atomic moment in time and a clock ticks only once at any simulation time. The clock can actually be a single signal, a gated clock (e.g. `(clk && GatingSig)`) or other more complex expression. When monitoring asynchronous signals, a simulation time step corresponds to a clock tick.

The clock for a property can be specified in several ways:

① Explicitly specified in a sequence : ② Explicitly specified in the concurrent assertion:

sequence s;

@(posedge clk) a ##1 b;

endsequence

property P

a |> s;

endproperty

assert property (P);

③ Explicitly specified in the property

property P;

@(posedge clk) a ##1 b;

endproperty

assert property (P);

④ From a clocking block

clocking cb @ (posedge clk);

property p;

a ##1 b;

endproperty

endclocking

assert property (cb.p);

assert property (@(posedge clk) a ##1 b);

⑤ Inferred from a procedural block

property P;

a ##1 b;

endproperty

always @(posedge clk) assert property (P);

⑥ From a default clock

default clk cb;

## # Handling Asynchronous Reset

In the following example, the `disable iff` clause allows an asynchronous reset to be specified.  
property p1;

```
@(posedge clk) disable iff (Reset) not b ## 1 ## c;
```

endproperty

assert property (p1);

The `not` negates the result of the sequence following it. So, this assertion means that if `Reset` becomes true at any time during the evaluation of the sequence, then the attempt for `p1` is a success. otherwise, the sequence `b ## 1 c` must never evaluate to the time

## # Sequence

A `sequence` is a list of boolean in a linear order of increasing time. The `sequence` is time over time if boolean expressions are true at the specific clock ticks. The expressions used in sequences are interpreted in the same way as the condition of a procedural `if` statement

The `##` operator delays execution by the specified number of clocking events, or clock cycles.

`a ## 1 b` // a must be true on the current clock tick and b on the next clock tick

`a ## N b` // check b on the Nth clock tick after a

`a ##[1:4] b` // a must be true on the current clock tick and b on some clock tick between the first and fourth after the current clock tick

The `*` operator is used to specify a consecutive repetition of the left-hand side operand

`a ## 1 b [3] ## 1 c` // Equiv. to `a ## 1 b ## 1 b ## 1 b ## 1 c`

`(a ## 2 b) [2]` // Equiv. to `(a ## 2 b ## 1 a ## 2 b)`

`(a ## 2 b) [1:3]` // Equiv. to `(a ## 2 b) or (a ## 2 b ## a ## 2 b) or (a ## 2 b ## 1 a ## 2 b ## 1 a ## 2 b)`

The `$` operator can be used to extend a time window to a finite, but unbounded range.

`a ## b [1:$] ## 1 c` // E.g. a b b b b c

The `[> or goto` repetition operator specifies a non-consecutive sequence.

`a ## 1 b [1:3] ## 1 c` // E.g. a ! b b b ! b ! b c

This means a is followed by any number of clocks where c is false, and b is true between 1 and three times, the last time being the clock before c is true.

The `[=` or non-consecutive repetition operator is similar to goto repetition, but the expression (b in this example) need not be true in the clock cycle before c is true.

`a ## 1 b [=1:3] c` // E.g. a ! b b b ! b ! b b ! b ! b c

## SystemVerilog Simulation Extra

### # Event-driven Simulation

The objective of event-driven simulation is to minimize the work done by the simulator. Therefore, the state of the circuit is only evaluated when a change occurs in the circuit.

It is possible to predict when the output of an element might change because we know that such a change can only occur after an input changes. If we only monitor the inputs to element, we can only know that an output might change; the logic function of the element determines whether or not a change actually occurs. As we also know the delays through the element, we know when the output might change. Thus an element only needs to be evaluated when it is known that its output might change but not otherwise. Nevertheless, even by predicting a possible changes, it is only necessary to re-evaluate elements when the possible change occurs.

By following the possible events through the circuits we can minimize the computation done by the simulator. Only elements that change need to be evaluated.

The delays through elements are defined in terms of integer times. The units of time might be nanoseconds or picoseconds. As the time is incremented in discrete intervals, it is likely that, for any reasonably large circuit, more than one element will be evaluated at any one time. Equally, there may be times at which no elements are due for evaluation. This implies a form of time step control. As each element is evaluated, any change in its output will cause inputs to further elements to change and hence the outputs of those elements may subsequently change. Clearly, it is necessary to maintain a list of which signals change and when. An event is therefore a new value of a signal, together with the time at which the signal changes. If the event list is ordered in time, it should be easy to add new events at the correct place in the further.



A suitable data structure for the event list

When an event is predicted, it is added to the list of events at the predicted time. When an event is processed it is removed from the list. When all the events at a particular time have been processed, that time can be removed.

#### Zero-width spike

An event is only scheduled if the new value is different from the value that has previously been scheduled for that signal. If two or more events occur on input signals to an element, more than one event may be scheduled for an output signal. It is thus important to know that the new value is not merely different from the present value but also from a value that might be already be scheduled to be set in the future. This algorithm therefore has a disadvantage as it stands because an element is evaluated whenever an event occurs at an input. It is quite possible that two events might be scheduled for the same gate at the same time. This could lead to a zero-width spike (零宽度脉冲) being scheduled one delay later. Even worse, if the delay for rising and falling output differ, the presence or absence of an output pulse would depend on the order in which the input events were processed.

If zero-width pulses are to be suppressed, they can be considered as a special case of the inertial delay model.

Include a PULSE CANCELLATION if a pulse is less than the permitted minimum width.

- If an event is predicted at a time less than the inertial delay after the previous event for that node, this new event is not set and the previous event is also removed.

### # SystemVerilog simulation

The SystemVerilog simulation model is based upon the selective true algorithm. The LRM (Language Reference Manual) describes a stratified event queue in which the event list is divided into a number of regions.

① Prepared - sample stable values for later checking

② Active - blocking assignments

③ Inactive - zero delay assignments

④ MBA - Nonblocking assignments [any non-blocking assign updates are done]

[The evaluation of the right hand side of all nonblocking assignment is always done before] ↗

ALLOW sequential systems to be modeled correctly

⑤ Observe - Evaluate assertions

⑥ Reactive - Execute programs in testbenches → Can create new active events at the current simulation time

⑦ Postpond - \$strobe and \$monitor/print宏



Assertion

property Bill;

X reusable

```
  @ (posedge clock) ready && (twenty || ten) |=> bill;
```

endproperty

Symbol  $|=>$  is a non-overlapping implication. In other words, the condition on the left implies that in the next clock cycle the condition on the right is expected to become true.

Symbol  $|=>$  is an overlapping implication. The condition on the left implies that the condition on the right is true in the same clock cycle.

property BillNotDispense;

If the vending machine is waiting for bill, it is not dispensing a ticket at the same time

bill |=> ! dispense;

endproperty

The implication operator means that property fails if bill is true and dispense is also true.

The property passes if bill is true and dispense is false.

The property also passes if bill is false. This is known as a vacuous pass. In other words, the property passes because it is never really tested.

Potential source of false optimism → Just because none of the assertions used to verify a design fails, it does not follow that the design has been fully tested. Every assertion might have passed vacuously.

→ One way to check the properties are fully tested is to use a cover statement

⇒ COVER property (BillNotDispense)

After a simulation, the number of times the property was checked will be reported, together with the number of passes.

property BillDispenseReady;

bill ### ## 1 dispense |=> ready;

endproperty

The symbol  $\#\#$  is used to indicate clock cycles

$\#\# [1:3]$  means "between 1 and 3 clock cycles"

property Eventually;

ready && twenty |=> ###[1:\$] dispense;

In general, liveness properties should be avoided. "Eventually" can be a very long time and the property might never fail. The property can also pass vacuously so in practice, this property tells us nothing, while appearing to say a lot.

\* Property StateAtoD

(state == A) && twenty |=> (state == D);

endproperty

Default parameter

With /Prt that into synthesis and see the structure.

If we don't have parameter, you cannot synthesize it.

See what Quartus does to it.

Use default

Setup time violation

Design for Reliability

one-hot scheme

① put a default line there to go back a normal state

unique check whether all states are covered

unique case (~)

② Simplify (Directive in Quartus)

Synthesiz Tool we to design the actual circuit

cost (double the size, more complicated)

EXTRA

Add more flip-flops doing Hamming code

## # Modular testbench example

Simple example to illustrate modularity



## # Top level

```

module testbooth;
parameter N=16;
logic signed [N-1:0] ain, bin;
logic signed [2*N-1:0] qout;
logic clk, load, n-reset;
logic ready;
clock_gen #(10.0, 10.0) c0(.*);
stimulus s0(.*);
boot #(N, N, 2*N) b0(.*);
verify v0(.*);
endmodule
  
```

```

module clock_gen #(parameter ClockFreq_MHz = 100.0, ResetWidth = 10.0)
(output logic clk, n-reset);
timeunit 1ms;
timeprecision loops;
parameter ClockHigh = (500.0)/ClockFreq_MHz;
initial
begin
n_reset = 1;
clk = 10;
#ResetWidth n_reset = 10;
#ResetWidth n_reset = 1;
forever #ClockHigh clk = ~clk;
end
endmodule
  
```

## # Stimulus

```

program stimulus #(parameter NA=16, NB=16)
(output logic signed [NA-1:0] ah,
output logic signed [NB-1:0] bh,
output logic load);
initial
begin
#30ns ah = 12; bh = 110;
#10ns load = 11;
#10ns load = 10;
end
endprogram
  
```

## # Verify Assertions

```

module verify #(parameter AL=8, BL=8, QL=AL+BL)
(input logic signed [QL-1:0] qout,
input logic ready,
input logic signed [AL-1:0] ain,
input logic signed [BL-1:0] bin,
input logic clk, load);
  
```

```

default clocking clock_clock
@ (posedge clk);
end clocking
  
```

```

property LoadReady;
load l=> ##1 AL ready;
endproperty
  
```

```

property Product;
load l=> ##1 AL(qout == ah*bin);
endproperty
  
```

```

assert property (Load)
else $error ("%d * %d gives %d", ah, bin, qout);
  
```

```

endmodule
  
```



a, b two numbers      8bit  $\times$  8bit

Shift bit of b is 0  $\rightarrow$  Done dd  
always-comb

$$\{C_{in}, g_{in}\} \leftarrow a + b$$

always-if

~~$\{C_{in}, g_{in}\}$~~

$$\{C, g, b\} \leftarrow \{C_{in}, g_{in}, b\}_{[i-1, i]}$$

$$A \# \# S = (A + S)/2$$

A  
17 \$?

## Review

[Q] Explain the difference between synchronous and asynchronous inputs to a sequential system.

Synchronous inputs will affect a sequential system outputs only when a clock edge comes (usually a rising edge). In other words, the inputs are synchronized with the clock.

Asynchronous inputs are independent of the clock, they are dominant over clk and will affect a sequential system outputs immediately.

(Pros): High speeds can be achieved, as the data path is independent of reset signal

The circuit can be reset with or without a clock present

No work around is required for logic synthesis

The reset signal does not need to be long enough in order to be captured at rising clock edge.

(Cons): Reset line is sensitive to glitches

May have metastability issues

[Q] The circuit might ever enter an unused state because of a set-up time violation. How can this be avoided?

- ① Use unique case to cover all unused states and use default to set the next state when the circuit goes to a unused state.
- ② self-correcting circuit
- ③ Add more flip-flops using Hamming code.

[Q] Layered Testbench



[Q] Benefits of using Assertion Based Verification

① Providing internal test point in the design

② Simplifying the diagnosis and detection of bugs by localizing the occurrence of a suspected bug to an assertion monitor which is then checked

③ Allowing designers to verify the same assertions using both simulation and formal verification

④ Increasing the observability when simulation is used.

⑤ Increasing both controllability and the observability when formal verification is used.

Assertions help to detect more functional bugs, detect them earlier in the process and detect them closer to their original cause

Formulating and writing assertions can give the designer a better understanding of the design and hence uncover bugs in the specification or else avoid introducing bugs into the design in the first place

High verification efficiency

[Q] Principle of Assertion-based verification

? Describe what you expect to see, and when you expect to see it. The simulator will tell you if this doesn't happen

"An assertion is an expression that is if it is false, indicates an error. Assertions are used for debugging by catching can't happen errors. Within hardware description language designs, an assertion is a conditional statement that checks for specific behaviour and displays a message if it occurs. Assertions are generally used as monitors looking for bad behaviour, but may be used to create an alert for desired behaviour as well."

For our purposes, an assertion is a statement about a specific functional characteristic or property that is expected to hold for a design.

[Q] Principle of Scan-in, Scan-out (SISO)

SISO makes the state variables directly accessible by connecting all the state registers as a shift register. The shifter register has a mode control input M. when M=0, it's in operational mode. when M=1, it's in scan mode, the flip-flop performs a shift register with the input to the shift register being the scan data in (SDI) pin and the output being the scan data out (SDO) pin.

[Q] What features distinguish a SystemVerilog "testbench" from other SystemVerilog modules

① A testbench has no inputs or outputs

② As testbench is never synthesized, we can use the entire SystemVerilog language

③ Can use initial assignment which indicates a procedure that is executed once

Primitive 是针对器件特征开发的一系列常用模块的名字，可以将其看成一个库函数。

## UG578 Notes

The GTx Transceivers support line rate from 500 Mb/s to 32.75 Gb/s in UltraScale + FPGAs.

Each GTx transceiver channel in a Quad has six clock inputs available: ① Two clock local clock reference clock pin pairs, GTR/GTREFCLK0/1

② Two reference clock pin pairs from the Quads above, GT\_SOUTH\_REFCLK0/1 ③ from Quads below, GT\_NORTH\_REFCLK0 or GT\_NORTH\_REFCLK1

TXSYS CLK SEL : Selects the PLL reference clock source to drive the TXOUT CLK

TXPLLCLK SEL : Selects the PLL to drive the TX datapath

{ 00: CPLL

{ 10: QPLL1

{ 11: QPLL0

QPLL0/1 Nominal VCO Operating Range  
 { QPLL0 9.8~13.35  
 { QPLL1 8.0~13

$$f_{\text{PLLCLKout}} = f_{\text{PLLCLKin}} \times \frac{N \cdot \text{FractionalPart}}{M \times \text{QPLL\_CLKOUTRATE}}$$

$$f_{\text{LineRate}} = \frac{\text{Fractional} \times 2}{D}$$

$$\text{Fractionalpart} = \frac{\text{SPMDATA}}{2^{\text{SPMWIDTH}}}$$

PCS & PMA

The physical coding sublayer (PCS) is a networking protocol sublayer, it resides at the top of the physical layer (PHY), and provides an interface between the physical medium attachment (PMA) sublayer and the media independent

DRP Dynamical Reconfiguration Port

IBERT: Integrated Bit Error Ratio Test



# PRBS

$$\text{PRBS7} = x^7 + x^6 + 1$$

$$\text{PRBS9} = x^9 + x^5 + 1$$

$$\text{PRBS11} = x^{11} + x^9 + 1$$

$$\text{PRBS15} = x^{15} + x^{14} + 1$$

$$\text{PRBS20} = x^{20} + x^3 + 1$$

$$\text{PRBS23} = x^{23} + x^{18} + 1$$

$$\text{PRBS31} = x^{31} + x^{26} + 1$$

e.g.: PRBS: [ ] 1 2 3 4 5 6 7

$2^7 - 1$ : 127-bit long

reg [1:8] prbs = {8'b10101010};

Xilinx Code:

reg [1: POLY\_LENGTH] prbs\_reg;

↑ poly-tap  
poly-length

check\_mode == 0 ?

stimulus: data\_in = 10

checking: data\_in == rxdata

[64:1] prbs\_msb

1 2 3 4 5 6 7 / [1:7] prbs\_reg

$$\text{prbs}[0] = \text{prbs\_reg} \oplus \text{prbs}[1]$$

(>>)  $\text{prbs}[i] = \{\text{prbs\_msb}[i+1], \text{prbs}[i:1:\text{POLY\_LENGTH}-1]\}$

$$\text{xor\_a}[0] = \text{prbs}[0][6] \wedge \text{prbs}[0][7]$$

输入 RLFSR 的 seed 为 111，MBITS 设置每周期产生的 PRBS 的 bit 数，并通过 DATA\_OUT 输出 MBITS PRBS bits。  
 每周期后，Seed 会更新 (prbs-reg 的值)



## [Verilog Generate]

```
genvar I;
generate for (I=0; I<NBITS; I=I+1)
begin : g1
    ...
end
endgenerate
```

## ② NRZ (Non Return zero)

最常见, HIGH -1, LOW -0 e.g. UART

## [Coding]

## ① RZ (Return to Zero)

在一个周期内, 用二进制传输数据, 在数据位脉冲结束后, 需维持一段时间的低电平



同时传递时钟信号和数据信号

归零码高占用一部分带宽, 效率低

## ③ NRZI (Non Return zero Inverted)

既能准时钟信号, 又能尽量不损失系统带宽

信号电平翻转表示 0, 信号电平不变表示 1



## Bit

Stuffing  $\Rightarrow$  例如数据包中, USB 2.0 协议中规定每 7 个 1 后在数据中插入一个 0

## ④ Manchester

利用信号的跳变方向来决定数据

一位位中间, 信号由高向低跳变表示 0, 由低向高跳变表示 1



NRZ 和 NRZI 都没有自同步特性, 但是有一些特殊技巧, 比如发送一个同步头 (010101) 的方波, 让接收者通过这个同步头计算着发送者的频率, 再利用这个频率来解之后的数据

## [8b/10b]

# DC bias: the DC bias, DC component or DC coefficient is the mean value of the waveform

If the mean amplitude is zero, there is no DC offset. A waveform without a DC component is known as DC-balanced or DC-free

# 8b/10b encoding: In telecommunication, 8b/10b is a code that maps 8-bit words to 10-bit symbols to achieve DC-balance and bounded disparity and yet provide enough state changes to allow reasonable clock recovery. This means that the difference between the counts of ones and zeros in a string of at least 20 bits is no more than two, and there are not more than five ones or zeros in a row. This helps to reduce the demand for the lower bandwidth limit of the channel necessary to transfer the signal.

## [Phase-locked Loop]

锁相环是一个相位反馈自动控制系统

① 压控振荡器的输出经过采样并分频

② 和基准信号同时输入鉴相器

③ 鉴相器通过比较上述两个信号的频率差, 输出一个直流脉冲电压

④ 控制 VCO, 使它的频率改变

⑤ 经过一个很短的时间, VCO 输出锁定于某一频率值。

## # Dynamic Reconfigurable Port

The DRP allows the dynamic change of parameters of the GTV primitives

The DRP interface:

① Address bus (DRPADDR)

② Separated data buses for {reading (DRPDDO)  
writing (DRPDII)}

③ Enable signal (DRPEN)

④ Read/write signal (DRPWE) implement read/write operations.

⑤ Ready/valid signal (DRPRDY) indicate operation completion/ availability of the data

|   |               |                                                                                                  |
|---|---------------|--------------------------------------------------------------------------------------------------|
| I | DRPADDR [9:0] | DRP Address Bits                                                                                 |
| I | DRPCLK        | DRP Interface Clock                                                                              |
| I | DRDEN         | DRP Enable signal (0: No read or write operation performed)                                      |
| I | DRPDI [5:0]   | Data bus for writing configuration data from the interconnect logic resources to the transceiver |
| O | DRPRDY        | Indicates operation is complete for write operations and data is valid for read operations       |
| O | DRPDO [5:0]   | Data bus for reading config data from the GTV transceiver to the interconnect logic resources    |
| I | DRPWE         | DRP write enable (0: read operation when DRP is 1)                                               |

Campus 无线装订本  
B5 40页

WCN-CNB1430



6 937748 309932  
MADE IN CHINA

国誉商业(上海)有限公司 ·  
<http://www.kokuyo.cn/st/>

上海市奉贤区人杰路128号

TEL : 400-820-0798 FAX : 021-3255-8508

产地: 上海市 QB/T1438-2007 合格

B5 252×179mm

● 采用日本进口纸张，纸质细滑，牢固不掉页。