

# Hardware Modelling Using Verilog

5/9/22

Moore's Law : Every 18 months, no. of transistors in chip will double



We need to use Computer Aided Design (CAD)

- Based on Hardware Descriptive Language
- HDLs provide formats for esp. outputs of various design steps
- ⇒ CAD tool transforms this HDL input into HDL output that contains more detailed info about hardware.
  - Behavioural level to Register Transfer Level (MUX, Adder)
  - RT level to Gate level
  - Gate level to Transistor level
  - Transistor level to Layout level.

## Steps in Design Flow

### Behavioural Design

- Specify functionality of design.  
(Boolean expression, Finite state machine behaviour, high-level algorithm)
- Needs to be synth. into more detailed specifications for hardware realisations.

### Data Path Design

Generate a netlist of register transfer level components, like registers, adders, multipliers, multiplexers, decoders,



- Data • A netlist is directed graph, where vertices indicate components and edges indicate interconnections
- Netlist specification = Structural Design
  - may be specified at various levels, comp. may be functional modules, gates or transistors.
  - Sys. transformed from one level to next.

### Netlist



- Logic Design → Generate a netlist of gates / flip-flop or standard cells.
- Standard cell → pre designed circuit module (gates, flip-flops, multiplexers etc) at layout level
- Various logic optimisation tech. are used to obtain cost eff. design.
- Regr. due to optimization :- →
  - (i) minimum no. of gates
  - (ii) mini. no. of gate delays / levels (delay)
  - (iii) minimise signal transmission activities (ie dynamic power)
- Physical Design → generate final layout that can be sent for fabrication

(Additional) Final target may be Field Programmable Gate Array (FPGA)  
where technology mapping from gate level netlist is used

# ~~Know your Arduino~~

## Lecture - 2 Design Representation

3 pts of view  $\rightarrow$  i) behavioural

ii) Structural

iii) Physical

conveniently expressed by Y diagrams



Behavioral  $\rightarrow$  specifies how a particular design should respond to a given set of inputs

specified by  $\rightarrow$  i) Boolean Equations  
ii) Algos written in Verilog or VHDL

Structural spec.  $\rightarrow$  specifies how components are connected  
List of modules and their specifications. interconnections  
- called netlist

Physical rep.  $\rightarrow$  lowest level of physical specific.  
(photo-mask info req. by various focus + step)

- At module level, physical layout for 4-bit address may be defined by rectangle or polygon, and collection of ports
- has number of rectangles or polygons

Lecture 3

ASIC  $\rightarrow$  Application Specific Integrated Circuit

Used when high performance and high packing density is required

- ② When it has to be mass produced

FPGA  $\rightarrow$  Field programmable Gate Array

- Fast turnaround time is reqd. to validate design
- Trade off in performance

~~Code Example~~

~~module~~

Lecture 4 FPGA [Read on web]

- programmable
- Array of logic cells connected via routing channels
- Diff types of cells
  - Input / Output Cells
  - Logic cells
- Interconnect b/w cells  $\rightarrow$  SRAM, anti fuses

Fundamental Building block inside of FPGA are flip flops and look up table. (LUT)

CLB  $\rightarrow$  Combinational Configurational logic blocks

Lecture 5 Gat. Array

done with metal mask design and processes

4 eq. 2 step manufac. process

## Standard Cell Based Design

- also called semi custom design style
- Requires developing full custom mask set
- Commonly used logic cells are developed and stored in standard cell library.
- Typical library may contain a few hundred cells

### Characteristics

- ① made of fixed height

To enable automated placement of cells, and routing of inter-cell connections

- + number of cells can be abutted side-by side to form rows.

- o Power and ground rails ~~are~~ typically run parallel to upper and lower boundaries of cell

Neighbouring cells share a common power and ground bus.

The input and output pins are located on upper and lower boundaries of cell.



## Custom Design

Entire mask design is done from scratch

Design reuse is

economical

Full custom design is used for memory cell

Week 2

Lecture 6

## VHDL Language Features Part - 1

```
module name( ports);  
    input/output declarations  
    local net declaration  
    parallel statements  
end module
```

module cannot contain  
def. of other modules  
(Same as functions in C++)  
Can be instantiated  
within another module  
First → hierarchy feature

Module



assign statement → represents continuous assignment,  
variable gets updated as RHS changes.

assign variable = expression;

LHS must be "net" variable ("wire")

RHS can contain both "register" & "net" type  
variables

assign → models behavioral design style →  
used to model combinational circuit

Ex assign  $t_1 = a \xrightarrow{\text{(and)}} b$ ;  $a \rightarrow t_1 \rightarrow$  continuously driven

## Data Types in Verilog

### Net

- must be continuously driven
- cannot be used to store value
- used to model connections b/w continuous assignments and instantiations.

### Register

- retains last value assigned to it.
- used to represent storage elements

connection b/w hardware elements

1 bit values unless they are declared explicitly as vectors

Net data types  $\rightarrow$  wire, wor, wond, tri, supply0, supply1, etc.

(wire, tri)  $\rightarrow$  equivalent. When there are multiple drivers driving them, driver outputs are shorted together?

wor, wond  $\rightarrow$  OR, AND at junction

Supply0, Supply1  $\rightarrow$  power connections

```

module use-wire(f, A,B,C,D);
    input A,B,C,D;
    output f;
    wire f;
    assign f = A & B;
    assign f = C | D;
end

```



Can be either indeterminant

if  $w$  and  $f$ ;  $f = (A \& B) \Delta (C | D)$

Supply 0, supply 1  $\rightarrow$  greatest signal strength

Eg

```

Supply00  gnd ;
Supply1  vdd ;

```

## Data Values and Strengths of Signals

4 value levels, 8 strength levels

$\hookrightarrow$  used to resolve conflicts between signal drivers of diff strengths in real circuits

All unconnected net  $\rightarrow z$   
Register variables  $\rightarrow x$

| Value Level |                |
|-------------|----------------|
| 0           | logic 0        |
| 1           | logic 1        |
| x           | unknown        |
| z           | high impedance |

| Strength | Type           |
|----------|----------------|
| Supply   | Driving        |
| strong   | Driving        |
| full     | Driving        |
| large    | Storage        |
| weak     | Driving        |
| medium   | Storage        |
| small    | Storage        |
| High z   | High Impedance |

If two signals of unequal strengths get delivered to wire, stronger signal will prevail

Register Data Type  $\rightarrow$  Var that can hold a value

$\rightarrow$  Not same as "Register" in IRL

### Types

- reg  $\rightarrow$  used most

- integer  $\rightarrow$  loop counting

- real store floating point numbers

- time : keep track of simulation time  
(not used in synthesis)

"reg"  $\rightarrow$  default val = "x"

can be assigned a value in synchronism with clock or otherwise.

- declaration specifies size ~~itself~~ explicit (default - 1 bit)

```
reg n,y;           // 1 bit
```

```
reg [15:0] bus;  // 16 bit bus
```

$\rightarrow$  Treated as unsigned number in arithmetic expressions

$\rightarrow$  for used for sequential hardware (counters etc.)

(clock counter code ??)

"integer"

: general purpose for manipulating quantities

- more convenient to use in situations like loop counting

- treated as 2's compliment signed integer in arithmetic expressions

- default size  $\rightarrow$  32 bits

Synthesis tool takes

using data flow analysis to determine size

"real" store floating point numbers

When real val assigned to integer, real num rounded off to nearest integer.

time → Sim. is carried out using logical clock called simulation time.

"time" used to store simulation time.

System function ~~"\$time"~~ "\$time" gives current simulation time

Vectors "net", "reg" types can be declared as vectors, of multiple bit widths

• Declared by specifying range [range 1 : range 2]

where range 1 most significant bit  
range 2 least

wire [7:0] sum; // MSB = sum[7], LSB is sum[0]



individual bits can be  
~~be~~ accessed  
[7] → sum[7] →

(Array types)

reg [1:10] data;



parts of a vector can be addressed and used in expression.

Eg

`reg[31:0] IR;`

`reg [5:0] op_code;`

`reg [4:0] reg1, reg2, reg3;`

`reg [10:0] offset;`

|    |      |      |      |     |    |    |    |
|----|------|------|------|-----|----|----|----|
| 31 | 26   | 21   | 20   | 16  | 15 | 11 | 10 |
| op | reg1 | reg2 | reg3 | off |    |    |    |

LR

Can also use

$$\text{sum} = \text{IR}[25:21] + \text{IR}[20:16]$$

// Adding 2-4 bits and storing in sum var

### Multi Dimensional Arrays and Memories

• multi dimensional arrays of any size can be done in Verilog

Example `reg [31:0] register_bank[15:0];` // 16 32-bit registers

• Memories can be modelled in Verilog as 1D array of registers  
Each element of array addressed by single array index.

### Specifying Constant Values

Specified in either sized or unsized form

Syntax:

'<size>' <base><numbers>

`4'b0101` // 4 bit = 0101

`1'b0` // Logic 0

`12'hB3C` // 12 bit number 1011 0011 10

`2'B` // Signed number in 32 bits

## Parameters

constant with a given name

- cannot specify size of parameter. Size decided from const. value itself.

parameter HI = 25, LO = 0;

## VHDL Language Features (Part 3)

### Predefined Logic Gates in VHDL

- can be instantiated within a module to create a structured design
- Gates respond to logic values (0, 1,  $\alpha$ , or Z) in logical way



$$f = t_1 + t_2$$

$Z \rightarrow$  High impedance State

### Primitive Gates

buffer      buf(out,in)



if  $ctrl=0$ ,  $out=in$   
 $=1$ ,  $out=Z$



if  $ctrl=1$ ,  $out=in$   
~~else~~  $=0$ ,  $out=Z$



Some restrictions when instantiating primitive gates:

- output port must be connected to net (eg wire)  
(wire by default)
- Inputs ports may be net or register type
- Single output but can have any number of i/p (except NOT + BUF)
- When instantiating a gate, an optional delay may be specified.
  - Used for simulation
  - logic synthesis tool ignore the time delays

Eg and #5 G1( F, A, B);  
↓  
delay

### Time Scale Directive

delay values in one module need to be specified in terms of some time unit, while these in some other other modules need to be specified in terms of other time unit

timescale <reference time unit> / <time precision>

unit of measurement  
for time

precision  
to which  
delays are  
rounded off  
during simulation

Valid values → 1, 10, 100

For: 10 ns

write

s, ms, us, ps, fs

# 5 → 50 ns

↓  
microsecond

## Specifying Connectivity during Instantiation

- o When module is instantiated within another module,  
2 ways to specify connectivity of signal lines b/w 2 modules.
- a) Positional association  $\Rightarrow$  Parameters of module being instantiated are listed in same ~~and~~ order as in original module description.
- b) Explicit association  
 $\begin{array}{l} \text{(Review after testbench)} \\ \text{(in testbench } \cdot \text{OUT}(Y))} \end{array}$ 
  - parameters of module instant in arbitrary order
  - less chance of errors

## Hardware Modelling Issues

- o In terms of hardware realization, value computed can be assigned to :-
  - A "wire", - + "flip-flop" (edge-triggered storage cell)
  - A "latch" (level triggered storage cell)
- "Net" data type always map to "wire" during synthesis
- "Register" maps either to "wire" or "storage cell" depending upon the context under which value is assigned.

## Lecture-9 Verilog operators

### Arithmetic operator

| Unary    | Binary  |                        |
|----------|---------|------------------------|
| $+ A$    | $A + B$ | $\rightarrow$ multiply |
| $- B$    | $A - B$ | $/$ divide             |
| $-(A+B)$ | $A * B$ | $\%$ modulus           |
|          |         | $**$ exponentiation    |

### Logical Operators (return either TRUE or FALSE)

! - logical negation

|| logical OR

&& logical AND

Value 0 treated FALSE, Any non-zero value TRUE

### Relational operators

$!=$ ,  $\leq$ ,  $\geq$

(operate on numbers, and return a Boolean val.)

$=$

$\sim$  Bitwise NOT

$\wedge$  Bitwise AND

$|$  OR

$\wedge \vee$  XOR

$\sim \wedge$  XNOR

### Reduction operators

$\rightarrow$  accept a single word operand and produce a single bit as output.

Operates on all the bits within the word.

$\sim \&$  NAND

$\vee |$  NOR

Ex wire [3:0] x; wire y;

assign y = &x;

Red. operator



Shift operators:

>> shift right

<< shift left

>>> arithmetic shift right



>>> :: 2's compliment num sys.

|   |   |   |   |   |   |   |
|---|---|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 0 | 1 | 0 |
|---|---|---|---|---|---|---|

  
1 (-ve)  
0 (+ve)

shift right : /2  
shift left : \*2

-ve no.: shift in 1

+ve no.: shift in 0

Conditional operator:

cond-exp? true-exp: false-exp

is true  
↓  
if operation ↓  
else

Concatenation operator : Joins together bits from 2 or more comma separated expression.

Replication operator  
{ n } ?

Reps n copies of expression m.

operator precedence  $\Rightarrow$  check web  
associated left to right

\* Presence of 'z' or 'x' in reg or wire being used in an arithmetic expression results in whole expression being unknown (x)

Boolean false = 1'b0  
True : 1'b1

## Lectures 10 VHDL Modelling Examples



Example structural hierarchical description of 16 to 1 MUX.

- a) pure behavioural modelling
- (a) structural modelling in 4-1 multiplexer model
  - (i) Make start model of 4-1 one multiplexer using behavioural modelling of 2-to-1 multiplexer
  - (ii) make structural gate level modelling of 2-1 multiplexer
  - (iii) have complete structural hierarchical description

## Example Behavioral modelling of 4 to 1 MUX + Structural modelling of 16 to 1 MUX

module mux4to1 (in, sel, out);

input [3:0] in;

input [1:0] sel;

output out;

assign out = in[sel];

endmodule

module mux16to1 (in, sel, out);

input [15:0] in;

input [3:0] sel;

output out;

wire [3:0] t;

mux4to1 M0 ( in[3:0], sel[1:0], t[0]);

mux4to1 M1 ( in[7:4], sel[1:0], t[1]);

mux4to1 M2 ( in[11:8], sel[1:0], t[2]);

mux4to1 M3 ( in[15:12], sel[1:0], t[3]);

mux4to1 M4 ( t, sel[3:2], out);

endmodule



## Example 2

### Section 1: Behavioral desc. of 16 bit adder

#### • Generation of status flags:

- **Sign** : whether sum is +ve or -ve

- **Zero** : sum  $\Rightarrow 2^{16}0$

- **Carry** : whether there is carry out of last step

- **Parity** : number of 1s in sum are even or odd

- **Overflow** : whether sum cannot fit 16 bits



## lec 12 Description Styles in Verilog

- Data Flow
  - continuous assignment Using assignment statements
- Behavioral
  - procedural assignment Using procedural statements
    - Blocking
    - Non-Blocking
  - Similar to a program in high level language

### Continuous Assignment

- keyword "assign"
- assign  $a = b + c;$
- Forms a static binding b/w:
  - "net" being LHS (wire)
- RHS can be "net" or "register" type variable

Assignment  $\rightarrow$  continuously active

$\hookrightarrow$  used for combinational circuits

$\hookrightarrow$  Also for some sequ. circuit

```
module
  [ signal declaration ]
  [ assign statements ]
  {
    =
    -
    :
  }
endmodule
```

NOTE whenever there is array reference on RHS with variable index, a MUX is generated by synthesis tool  
assign = data[sel] ;

$\rightarrow$  If index constant, wire will be created

Conditional operator also generates a MUX  
assign  $f = \text{sel} ? a : b$ ;  $\Rightarrow$  For a bus, an array of  $2 \times 1$  MUX will be generated

if  $\text{sel} == 1$ , choose  $a$   
else choose  $b$

Example assign  $f = (a == 0) ? (c+d) : (c-d)$ ;

if  $a == 0 \Rightarrow c+d$   
else  $\Rightarrow c-d$



module generate\_decoder(out, in, select);

input in;

input [0:1] select;

output [0:3] out;

assign out[select] = in;

endmodule

Non constant index in LHS generates decoder

Whenever a synthesis tool detects a variable index in LHS, a decoder is generated

MUX  $\underline{\text{out}} = \underline{\text{in}[sel]}$

## Lecture 13: Procedural Assignment

- Two kinds supported
  - The "initial" block
    - Executed once at begin
    - Used only in testbenches
  - "Always" block
    - Continuous loop that never terminates

Procedural block defines

- A region of code containing sequential statements
- Statements execute in order they are written



Used in testbenches for simulation:-

- specifies stimulus to be applied to design-under-test (DUT)
- specifies how DUT outputs are to be displayed / handled
- specifies file where waveform info is to be dumped

### Always block

Starts at  $t = 0$ , executes statements inside block repeatedly and never stops.

Used to model a block of activity that is repeated indefinitely in digital circuit.

Clear signal  $\Rightarrow$  example.



Granville

```

module generating_clock;
    output reg clk;
initial
    clk = 1'b0; // initialised to 0 at time 0
always
    #5clk = ~clk; // Toggle after time 5 units
initial
    # 500 $finish; // Finish at time 500 unit
endmodule

```

(Now what is "net" and "reg")

All these blocks will execute concurrently

Basic syntax of always block:

@(event-expression) block seq. for both combinational and sequential circuit descriptions.

Only "reg" can be used assigned within an "initial" or "always" block.

Reason:-

Sequential block "always" block executes only when event expression triggers

At other times, block is doing nothing.

Object being assigned to must remember the last value assigned (not continuously driven).

Any kind of variable may appear in event expression (reg, wire, etc)

reg → hardware register  
→ combinational circuit

always @ (event-expression);

begin

end

→ This block will be executed only if event-exp is triggered

## Sequential statements in Verilog

one or more sequential statements can be present inside an "initial" or "always" block.

Statements executed sequentially

Multiple assign statements inside begin--end may either execute sequentially or concurrently depending upon type of assign.

Types of assign → blocking ( $a = b + c$ )

→ non blocking ( $a <= b + c$ )

### Sequential suite

(a) begin -- end

begin

seq. stat 1;

  || 2;

  |  
  n;  
  |

If  $n = 1$ , begin--end not seq.

end

(b) if - else

if (<exp 1>)

  seq - stat ;

else if (<exp 2>)

  seq stat ;

  |

else default stat - w ;

also

these can be grouped

with begin--end

(i) 'case'

```
case(<expression>)
expr 1: seq - statement;
expr 2: seq - --;
```

default: default stat.

endcase

→ Can replace complex if--else  
Statement for multiway  
branching

Expression compared to alter-  
natives in order they are written.

If none of alternative matched,  
default is executed

two variations: "casez" and "casex"

✓ treats all  
'z' values as  
don't cares

↓ treats all "x" and "z"  
as don't cares

(d) 'while' loop

while (<expression>) → executes until <exp> not true

Sequential-statement; → can be single or grp within "begin--end"

e) 'for' loop

for( expr1; expr2; expr3)

Sequential-statement

executes as long as expr2 is true

→ can be single or  
grp.

a) initial condition expr1

b) terminal condition (expr2)

c) procedural assignment to change the value of control variable

"for" can be conveniently used to initialise an array  
or memory

(f) "repeat" loop → execute fixed number of times

repeat (<expression>)

sequential statement;

cannot be used to loop on a general logical expression like "while"

it can be const, var, or signal val

- if it is variable or signal val, it is evaluated only when loop starts and not during execution of loop.

Example 100 clk pulses generated.

reg clock;

initial

begin

clock = 1'b0;

repeat (100) →

#5 clock = ~clock;

end



exactly 100 clk pulses generated

(g) 'forever' loop

forever

sequential-statement;

forever construct does not use any expression and executes forever until \$finish is encountered in test bench.

Used with timing specified

- If delay not specified, simulator would execute this indefinitely without advancing \$time.

- Rest of design will never be executed (if time not specified)

// clock generation using "forever" construct

reg clk;

initial

begin

clk = 1'bo;

forever #5 clk = ~clk; // launit clk period

end

Other constructs available .

#(time-value) → makes a block suspend for 'time-value' units of time  
→ time unit can be specified using timescale command

@(event-expression) → makes a block suspend until "event-expression" triggers

• keywords associated with e-e ahead

① @(event-expression)

specifies the event that is required to resume execution of the procedural block.

• The event can be any one of following:-

a) change of signal value

b) positive or neg. edge occurring on signal (posedge or negedge)

c) list of above mentioned events separated

by "or" or comma

posedge → transition from {0, x, z} to 1 and

from {1, 0} to {z, x, y}



negedge →



## Example

$\text{@}(\text{iu})$  // "u" change  
 $\text{@}(a \text{ or } b \text{ or } c)$

-  $\text{@}(\text{posedge clk})$   
// positive edge of  
'clk'

-  $\text{@}(+)$   $\Rightarrow$  // any variable change

// flip flop with synchronous set and reset

module dff( q, qbar, d, set, reset, clk );

input d, set, reset, clk;

output reg q; output qbar;

assign qbar =  $\sim q$ ;

always  $\text{@}(\text{posedge clk})$

begin

if ( reset == 0 )  $q <= 0$ ;

else if ( set == 0 )  $q <= 1$ ;

else  $q <= d$ ;

end

endmodule

for ~~as~~ asynchronous

modify always  $\text{@}(\text{posedge clk or negedge set or negedge reset})$

## Lecture - 15

### Procedural Assignment (Examples)

① module max21 ( in1, in0, s, f );  
 input in1, in0, s;  
 output reg f;

always @ ( in1 or in0 or s ) → commas can be used also  
 if ( s )

f = in1;

else

f = in0;

endmodule

② D flip flop

module dff\_nedge ( D, clock, Q, Qbar );  
 input D, clock;  
 output reg Q, Qbar;  
 always @ ( negedge clock )  
 begin  
 Q = D;  
 Qbar = ~D;  
 end

endmodule

When a "case" statement is incompletely decoded (specified), the synthesis tool will infer the need for a latch to hold the residual output when the select bits take the unspecified value.

module xyz ( f, a, b, c );  
 input a, b, c;  
 output reg f;

always @ ( + )

if ( a == 1 ) f = b & c;

endmodule



→ for a = 0, value of f is unspecified

```

module xyz(if, a, b, c);
    input a, b, c;
    output neg f;
    always @(*)
        begin
            f = c;
            if (a == 0) f = b & c;
        end
endmodule

```



## Working of Priority Encoder ???

### 1) 2 bit comparator

```

module compare (A1, A0, B1, B0, lt, gt, eq);
    input A1, A0, B1, B0;
    output neg lt, gt, eq;
    always @ (A1, A0, B1, B0)
        begin
            lt = (A1, A0 < B1, B0);
            gt = (A1, A0 > B1, B0);
            eq = (A1, A0 == B1, B0);
        end
endmodule

```

~~continuous assignment~~



## hedure 17

# Procedural Assignment

- used to update variables of types "real", "integer", "real", "time"
  - value assigned to variable remains unchanged until another procedural assign statement assigns new value to variable

(Different from continuous assignment (using "assign") that results in expression on RHS to continuously drive "net" type variable on left.

Types of procedural assignment statements

- a) Blocking (denoted by " $=$ ")
  - b) Non-blocking (denoted by " $\leq$ ")

Left hand side of a procedural assignment statement can be one of:

- a) register type var ("eeg", "integer", "real" or "time")
  - b) bit select of these variables (a[5:1])
  - c) part select (IR[31:26])
  - d) concatenation

RHS can be any expression consisting of "net" and "register" that evaluate to value.

Procedural assignment statement can only appear within procedural blocks ("initial" or "always")

$$\{a[5], b[3:2], c\} = \frac{x+y}{\downarrow}$$

## Blocking Assignment (=)

variable-name = [delay-or-event-control] expression;

### blocking assignment

Executed in order they are specified in procedural block.

- target of assignments get updated before next seq. statement in procedural block is executed.
- they do not block execution of statements in other procedural blocks.

Recommended style for mod combinational logic

## Non Blocking Assignment (<=)

variable-name <= [delay-or-eventexpression] expression;

Allows scheduling of assignments without blocking execution of statements that follow within procedural block.

- Assignment to target gets scheduled for end of simulation cycle (at end of procedural block)
- Statements subsequent to instruction under consideration are not blocked by assignment
- allows concurrent procedural assignment, suitable for sequential logic

Recommended style for modelling sequential logic.

- ~~several~~ "reg" type variables can be assigned synchronously under the control of common clock.

```
integer a, b, c;  
initial  
begin  
  a = 10; b = 20; c = 15;  
end
```

( Good for synch  
seq - circuit )

```
initial  
begin  
  a <= #5 b + c;  
  b <= #5 a + 5;  
  c <= #5 a - b;  
end
```

→ They will be evaluated together

initially  $a = 10, b = 20, c = 15;$

$a = 35$  at  $t = 5$

$b = 15$  at  $t = 5$

$c = -10$  at  $t = 5$

Example Swapping "a" and "b"

always @ (posedge clk)

$a <= b;$

always @ (posedge clk)

$b <= a;$

Some Rules to be followed

- ° blocking and non-blocking not mixed in "always" block.
- ° Verilog synthesiser ignores the delays specified in procedural assignment statements.  
→ May lead to functional mismatch.

Not Memorable →

$x = x + 5;$   
 $x <= y;$

## (lect 17 Part - 2)

more examples      // up-down counter (synchronous clear)

```
module counter ( mode, clk, ld, d_in, clk, count);  
    input mode, clk, ld, d_in;      ↳ load input  
    input [0:7] d_in;  
    output reg [0:7] count;  
  
    always @ (posedge clk)  
        if (ld)      count <= d_in;      // if ld = 1, Then -  
        else if (clr) count <= 0;      // if clr = 1, Then -  
        else if (mode) count <= count + 1;  
        else            count <= count - 1;  
endmodule
```

// Parameterised design : N bit counter

```
module counter ( clear, clock, count);
```

```
    parameter N = 7;
```

```
    input clear, clock
```

```
    output reg [0:N] count;
```

```
    always @ (negedge clock)
```

```
        if (clear)
```

```
            count <= 0;
```

```
        else
```

```
            count <= count + 1;
```

```
endmodule
```

// using counter

module ring\_counter ( clk, init, count );

input clk, init;

output reg [7:0] count;

always @ (posedge clk)

begin

if ( init ) count = 8'b10000000;

else begin

count <= count << 1;

count [0] <= count [7];

end

shift left by 1

both will work simulation.

end

endmodule

or use count = { count [6:0], count [7] };

Next is

always @ (posedge clk)

begin

q1 = a;

q2 = q1;

end

a



Now

q1 <= a;

q2 <= q1;

a



lec 19 pt - 4

Mixing blocking / nonblocking  
not recommended:

### Basic Idea

- possible to combine both blocking & nonblocking assignments in same procedural block (viz. "always")
  - Simulator or synthesis tool supports this.
- interpretation not straightforward - see not nice for designness

### Example 1

always@(\*)

```
begin  
x = 10;  
x = 20;  
y = x;  
#10;  
end
```

```
begin  
x = 10;  
x = 20;  
y <= x;  
#10;  
end
```



Same result, no delay in blocking assign

so  $x = 20$  at  $t = 0$

Then  $y = 20$

### Example 2

always @(\*)

```
begin  
x = 10;  
y = x;  
z = 20;  
#10;  
end
```

```
begin  
x = 10;  
y <= x;  
z = 20  
#10;  
end
```

→ Same result

## Generate Block "generate" "endgenerate"

generate statement allows Verilog code to be generated dynamically before the simulation or synthesis begins.

- convenient for parameterized module description

generate instantiations can be used for various Verilog blocks

- modules, user-defined primitives, gates, continuous assignment, 'initial', 'always' blocks etc

- Generated instances have unique identifier names and can be referenced hierarchically.

Special "genvar" variables:

- used to declare variables that are used only in evaluation of generate block.
- These variables do not exist during simulation or synthesis
- value of 'genvar' can be defined only in generate loop.
- Every generate loop is assigned a name, so that variables inside the generate loop can be referenced hierarchically

Example Bit wise XOR ( f, a, b);

param module xor\_bitwise ( f, a, b);

parameter N=16;

input [N-1:0] a, b;

output [N-1:0] f;

genvar p;

generate for ( p=0; p< N; p=p+1)

begin real norlp

xor XG1( f[p], a[p], b[N]);

end

endgenerate

endmodule

this name was given to generate loop

(?)

→ relative hierarchical names will be

norlp[0].XG1, -- xorlp[15].xo

## Topic User Defined primitives (UDP)

used to define custom Verilog primitive by use of look up table  
They can specify:

- Truth table for combinational function
- State table for seq. func.
- Dont care, rising and falling edges can be specified.

For combinational func., TT entries defit

$\langle \text{input}_1 \rangle \langle \text{input}_2 \rangle \dots \langle \text{input}_N \rangle : \langle \text{output} \rangle$

For sequential state table entries

$\langle \text{input}_1 \rangle \langle \text{input}_2 \rangle \dots \langle \text{input}_N \rangle : \langle \text{present-state} \rangle : \langle \text{next-state} \rangle$

### Rules

Input terminals to UDP can only be scalar variables

- multiple i/p terminals can be used
- input terminals are declared as "input",
- input entries in table must be in same order as "input" terminal list.

only one scalar output terminal must be used

- output terminal must appear in beginning of terminal list
- For combinational UDPs, o/p terminal declared as "output"
- For sequential UDPs, o/p terminal declared as "reg".

\* For sequential UDPs, state can be initialised with "initial" statement (optional)

'input a,b,c;

a b c : <output>

Guideline • VDPs model functionality only  
— Do not model timing or process tech.

- Functional block can be modeled as VDP only if it has exactly one output
  - If block has more than one op, it has to be modeled as a module.
  - Alternative, multiple VDPs can be used, one per output.
- Inside simulator, a VDP is typically implemented as a lookup table in memory
- VDP state tables should be specified as completely as possible.
  - For unspecified cases, output set to 'x'.

Example "Full address sum generation using VDP

primitive udp-sum( sum, a,b,c);

'input a,b,c;

output sum;

table

| a | b | c | sum |
|---|---|---|-----|
| 0 | 0 | 0 | 0   |
| 0 | 0 | 1 | 1   |
| 1 | 1 | 0 | 1   |
| 1 | 1 | 1 | 1   |

We can also specify  
don't care input combina-  
tions as "?".

end-table

end primitive

## II Instantiation UDPs.

```
module full-adder (sum, cout, a, b, c);
    input a, b, c;
    output sum, cout;
    udf-sum SUM (sum, a, b, c);
    udf-cy CARRY (cout, a, b, c);
endmodule
```

A 4 to 1 multiplexer

```
primitive adp-mx41(f, s0, s1, x2 i0, i1, i2, i3);
```

## Table

4 se sl lo i / i2 i3 : f

Q. Q. 1. 5. 7.

10

1 0

4 /

1

( )

end table

end primitive

## // A T flip-flop

primitive TFF( q, clk, dcl);

(-)

↳ keep the previous state

input clk, dcl;

output reg q;

table

// clk dcl q q-new

? 1 : ? : 0 ;

? (10) : ? : - ; // ignore -ve edge of "dcl"

(10) 0 : 1 : 0 ;

(10) 0 : 0 : 1 ;

(0?) 0 : ? : - ; // ignore true edge of "clk"

endtable

endprimitive

X → invalid

## Some rules to follow

- "?" cannot be specified in d/p field (Don't care)
- "-> No change (only in d/p)
- "s" same as (101) → rising edge
- "f" same as (10) → falling edge
- "//" → any value change in signal

## Lecture 21    Vivilog Test Bench

- procedural block that executes only once
  - used for simulation.
- generates clk, reset, seq. test vectors for a given design-under-test (DUT)
- can monitor DUT outputs and present them in a way as specified by creator.
  - Print values of single lines
  - Dump values in file from where waveforms can be viewed
- inputs & outputs of DUT must be connected to test bench.
- Test bench uses "initial" procedural block that executes only once
  - use "always" to generate test inputs, like clk signals

### Test Bench



### Description

```
module example (A,B,C,D,E,F,Y);  
    input A,B,C,D,E,F;  
    output Y;  
    wire t1,t2,t3;  
  
    nand #1 G1 (t1,A,B);  
    and #2 G2 (t2,C,~B,D);  
    nor #1 G3 (t3,E,F);  
    nand #1 G4 (Y,t1,t2,t3);  
  
endmodule
```

```

module testbench;
reg A,B,C,D,E,F; wire Y;
example DUT (A,B,C,D,E,F,Y); % b → binary
initial
begin
$monitor ($time, "A=%b, B=%b, C=%b, D=%b, E=%b,
          F=%b, Y=%b", A,B,C,D,E,F,Y);
#5 A=1; B=0; C=0; D=1; E=0; F=0;
#5 A=0; B=0; C=1; D=1; E=0; F=0;
#5 A=1; C=0; ( // other var not changing)
#5 $finish;
end
endmodule

```

How to write • create dummy template.

— declare inputs to design-under-test (DUT)  
as "reg" and outputs as "wire".

Because we have to initialise DUT inputs inside procedural  
blocks typically "initial", where only "reg" can be  
assigned.

- Instantiate DUT
- Assign known val to DUT inputs
- Monitor DUT ops for functional verification

For synchronous seq. circuits → need clock generation logic.

Simulator directives: \$display, \$monitor, \$dumpfile, \$dumpvar,  
\$finish etc.

→ D out need testbench for synthesis

- `$display ("<format>", expr1, expr2, --);`  
— print immediate values of text or variables to `stdout`  
format specifies "b" (binary), "h" (hexa)
- `$monitor ("<format>", var1, var2, --);`  
~~Show~~ only print values whenever value of some variable in given list changes.  
(functionality of event-driven input)

~~Start~~ `$finish` → terminate the simulation

#100 `$finish` // After `$time = 100` then stop simulation.

- `$dumpfile (<filename>);`  
↳ file for storing values of selected vars  
extension .vcd (Value change Dump)  
contains info about any value changes on selected variable.

`$dumpoff` → stop dumping of variables

All variables dumped with "`x`" and next change of variables will not be dumped.

`$dumpon;`

↳ Starts previously stopped dumping of variables

\$dumpvars ( level , list\_of\_vars );

- specifies variables to be dumped

- ~~\*~~ level = 0  $\rightarrow$  all var from modules will be dumped  
all module instances which have variables  
will also be dumped.

- level = 1  $\rightarrow$  only listed variables and variables of listed  
module will be dumped

\$dumpall

$\hookrightarrow$  current values of all variables will be written.  
instead of change.

• \$dumplimit ( filesize );  $\rightarrow$  used to set max size of .vcd  
file

---

\$dumpvars ( 0, module1 , module2 );  $\Rightarrow$  all vars in 1,2  
will be dumped

( 1, a, b, c )

$\downarrow$  only these

## Lecture 22    Writing Test benches

module testbench;

reg a, b, c; wire sum, cout;

full-adder FA( sum, cout, a, b, c);

initial

begin

\$monitor( \$time, "a=%b, b=%b, c=%b, sum=%b, cout=%b",  
              a, b, c, sum, cout);

#5 a=0; b=0; c=1;

#5 b=1;

#5 a=1;

#5 a=0; b=0; c=0;

#5 \$finish;

end

endmodule

integer i;

initial

begin

for (i=0; i<8; i=i+1)

begin

{a, b, c} = i; #5;

\$display ("T=%2d, a=%b, b=%b,

c=%b, d=%b, sum=%b,

cout=%b", \$time, a, b, c,  
sum, cout);

end

#5 \$finish;

end

endmodule

## Lecture 33 MODELLING FINITE STATE MACHINES

FSM  $\rightarrow$  can be represented in form of state table or in form of state transition diagram.

Mealy and Moore FSM types

a deterministic FSM can be mathematically defined as a 5-tuple  
 $(\Sigma, \Gamma, S, s_0, \delta, \omega)$

$\Sigma \rightarrow$  set of i/p comb.,  $\Gamma \rightarrow$  set of o/p comb

$S \rightarrow$  finite set of states,  $s_0 \in S \rightarrow$  initial state,  $\delta$  state transition function,  $\omega \rightarrow$  output function.

Here,  $s: S \times \Sigma \rightarrow S$

For Mealy machine,  $\omega: S \times \Sigma \rightarrow \Gamma$  (output depends on state + inputs)

Moore,  $\omega: S \rightarrow \Gamma$  (depends only on state)

module cyclic-lamp (clock, light);

input clock;

output reg [0:2] light;

parameter  $s_0 = 0, s_1 = 1, s_2 = 2;$

parameter RED = 3'b100, GREEN = 3'b010, YELLOW = 3'b001;

reg [0:17] state;



always @ (posedge clock)

case (state)

$s_0$ : begin

light <= GREEN; state <=  $s_1$ ;

end

$s_1$ : begin

light <= YELLOW; state <=  $s_2$ ;

end

$s_2$ : begin

light <= RED; state <=  $s_0$ ;

end

default: begin  
light <= RED;  
state <=  $s_0$ ;  
end  
end case  
end module

5 ff generated 2 for state 13 for lights  
due to non-blocking assignments

but actually we do not need separate ff for outputs,  
as outputs can be directly generated from state.

How to achieve this?

- o Make assignment states to light in separate "always"
- o block.
- o Use blocking assignment triggered by state change, not by clock

---

Example? 1-#t

- o Design of a serial parity detector
  - A continuous stream of bits is fed to circuit in synchronism with clock.

Circuit will generate bit stream as output, where  
0 indicate even no. of ones and  $1 >$  odd no.  
Write the code yourself

### Sequence Detector

Detect pattern 0110  $\rightarrow$  gives output 1 else 0  
or rules considered



```
module detect ( clk, n, reset, z);
    input n, clk, reset;
    output reg z;
    parameter S0=0, S1=1, S2=2, S3=3;
    reg [0:17] PS, NS;
```

```
always @ (posedge clk orposedge reset)
    if (reset) PS <= S0;
    else PS <= NS;
```

```
always @ (PS, n)
```

```
case (PS)
```

```
S0: begin
```

```
    z = n ? 0 : 0;
```

```
    NS = n ? S0 : S1;
```

```
end
```

```
S1: begin
```

```
    z = n ? 0 : 0;
```

```
    NS = n ? S2 : S1;
```

```
end
```

```
S2: begin
```

```
    z = n ? 0 : 0;
```

```
    NS = n ? S3 : S1;
```

```
end
```

```
S3: begin
```

```
    z = n ? 0 : 1;
```

```
    NS = n ? S0 : S1;
```

```
end
```

```
endcase
```

```
endmodule
```

## 1-25 Data Path and Controller Design (Part 1)

In complex digital system, the hardware is typically partitioned into two parts:

- a) Data path / which consists of functional units where all computations are carried out [Registers, multiplexers, bus, address, multipliers, counters]
- b) Control path, implements FSM, provides control signals to data path in proper sequence.

In response to control signals, various operations are carried out by data path.

Also takes inputs from data path regarding various status info.

### Example 1 Multiplication by Repeated Addition

We identify functional blocks required in data path, and corresponding control signals

Then we design FSM to implement multiplication algo using data path



DATA PATH



Part 2 28 Dec

better style of modelling Data/ control path

- only trigger the state change in clock activated "always" block
- In separate "always" block using blocking assignments compute next state
- As in previous example, in separate "always" block, generate control signals for data path

GCD Computation → simple algo using repeated subtraction



$$26 \ ; \ 65$$

$$\begin{matrix} 26 \\ A \end{matrix} < \begin{matrix} 65 \\ B \end{matrix} \Rightarrow$$

$$B = 39$$

$$\begin{matrix} B = 39 \\ B = 39 - 26 \\ B = 13 \end{matrix}$$

$$A = 26 - B$$

$$A = 13$$

④ identify functional blocks required in data path, and corresponding control signals

⑤ Design FSM to implement GCD computation algorithm using data path



### CONTROL PATH



Modelling control path using alternative approach

→ always @ (posedge)

we define next-state → change next state  $\leq$  next st.;

→ always @ (state)

case (state)

in this we will assign blocking assign next state

## lec 27 Part 3 Booth's multiplication Algorithm

Booth's algorithm is improvement whereby we can avoid the additions whenever consecutive 0's or 1's are detected in multiplier.

o We inspect two bits of multiplier  $(Q_i, Q_{i-1})$  at time if bits are same (00 or 11), we only shift the partial product.

— 01 → addition then shift

— 10 → sub then shift

$Q_{-1}$  → assumed zero

START

$A=0, Q_{-1}=0$

Count = n

M = multiplicand

Q = multiplier

M: n bit multiplicand

Q: n bit multiplier

A: n bit temporary register

$Q_{-1} \rightarrow 1$  bit ff.





Example  $(-10) \times (13)$

$$M: (10110)_2$$

$$-M: (01010)_2$$

$$Q: (01101)_2$$

$$\text{product: } -130$$

$$= (110111110)_2$$

| A     | Q     | Q <sub>-1</sub> |             |
|-------|-------|-----------------|-------------|
| 00000 | 01101 | 0               |             |
| 01010 | 01101 | 0               | sub shift ① |
| 00101 | 00110 | 1               | add         |
| 11011 | 00110 | 1               | shift ②     |
| 11101 | 10011 | 0               | sub         |
| 00111 | 10011 | 0               | sub         |
| 00011 | 11001 | 1               | shift ③     |
| 00001 | 11100 | 1               | shift ④     |
| 10111 | 11100 | 1               | add         |
| 11011 | 11110 | 0               | shift ⑤     |

ANS =  $\boxed{11011 \quad 11110}$

count = 0

DATA PATH



## CONTROL path



lec 28    Synthesizable Verilog  $\rightarrow$  language subset that can be synthesized.

### Rules for combinational

- Avoid technology delay
- Avoid feedback connections
- Give ~~all~~ outputs for all cases (if not, inference a latch)  
(if-else, "case" constructs)

### Styles for synthesizable Combi. logic

- Netlist of Verilog built-in primitives like gate instances (AND, OR, NAND)
- Combinational UDP (not ~~with~~ supported by all)
- Continuous assign
- Functions ↓
- Behavioral statements
- Tasks without event or delay control

### Functions in Verilog

```
module fa( s, cout, a,b, cin);  
    input a,b,cin;  
    output s, cout;  
    assign s = sum( a, b, cin);  
    assign cout = carry( a, b, cin);  
endmodule
```

```
function sum;  
    input x,y,z;  
    begin  
        sum = x ^ y ^ z;  
    end  
function carry;  
    input x,y,z;  
    begin  
        carry = (x & y) | (y & z)  
            | (z & x);  
    end
```

⇒ Input arguments appear in same order

Task  $\rightarrow$  can be used anywhere

arguments must be specified in same order as they appear in task declaration. More than one output value can be returned.

```

module fulladder( s, cout, a, b, cin);
    input a, b, cin;
    output reg s, cout;
    always @ (a, b, cin)
        FA ( s, cout, a, b, cin);

task FA;
    output sum, carry;
    input A, B, C;
    begin
        #2 sum = A & B & C || (A & B) | (B & C) | (C & A);
        carry = (A & B) | (B & C) | (C & A);
    end
endtask
endmodule

```

Diff b/w function & task.

| <u>Function</u>                                                    | <u>Task</u>                                                                          |
|--------------------------------------------------------------------|--------------------------------------------------------------------------------------|
| func can call another func<br>but not another task                 | task can call other tasks & functions                                                |
| func executes in 0 simulation time                                 | task may execute in nonzero simulation time                                          |
| func cannot contain any delay, event or timing control statements. | task can contain delay, event, or timing control statements.                         |
| returns single val                                                 | can pass multiple <del>task</del> values through "output" and "input" type arguments |
| func must have at least one "input" argument                       | can have 3rd or more arguments of type "input", "output",                            |

Construct to Avoid in Combinational

- fork - join, "wait", "disable"
- delay loop
- Data dependent loops
- feedback loop
- sequential user defined primitives

Summary: Synthesizable Verilog construct

- "module" . . . endmodule"
- "parameter"
- Instantiation of synthesizable module
- "ifuc" and "task"
- "always" . . . assign
- "for"
- built-in primitives
- "block" + New blocks
- select of vectors.
- "if-else", "case", "casen", "casez"
- UDPs - for Combinational  
only

## Lecture 30 Modelling Memory (Study Memory part also)

Memory typically included by instantiating a pre-designed module from design library.

Modelled using 2-dimensional arrays.

- array of register variables (behavioral model)
- used for synthesis of small size memories
- or for synthesis

Example

module memory - (-)

  reg [7:0] mem [0:1023];

  endmodule

Each memory word  
is of type [7:0]

Access memory words  
at mem[0], mem[1],  
- mem[023].

### How to initialise memory?

• By reading memory data patterns from a specified disk file  
- used for simulation & test benches

• Two verilog functions can be used:

\$readmem (filename, memname, startaddr, stopaddr)  
    ↳ read in binary form

\$readmemh  
    ↳ read in hexadecimal form

omission of  
these 2  
leads to  
reading the  
entire memory.

Single-port RAM with synchronous Read / write

module ram\_1 (addr, data, clk, rd, wr, cs);

input [9:0] addr; input clk, rd, wr, cs;

inout [7:0] data;

reg [7:0] mem[1023:0]; reg [7:0] d\_out;

assign data = (cs && rd) ? d\_out : 8'bz;

always @ (posedge clk)

if (cs && wr && !rd) mem[addr] = data;

always @ (posedge clk)

if (cs && rd && !wr) d\_out = mem[addr];

endmodule.

ROM Example



Important pt to Note (Regarding "inout")

- ① Some simulation or synthesis tools give inconsistent behaviour when using the "inout" data type. (Should be avoided)
- ② Better way to design → keep data i/p and data o/p bus lines separate.



module ram\_3 (data\_out, data\_in, addr, wr, cs);  
input data\_in;  
parameter addr\_size = 10, word\_size = 8,  
memory\_size = 1024;  
input [addr\_size - 1:0] addr;  
input [word\_size - 1:0] data\_in;  
input wr, cs;  
output [word\_size - 1:0] data\_out;  
reg [memory\_size - 1:0] mem [memory\_size - 1:0];  
assign data\_out = mem[addr];  
always @ (wr or cs)  
if (wr) mem[addr] = data\_in;  
endmodule



## Q31 modelling Register Banks

A register bank or register file is a group of registers, any of which can be randomly accessed.

- Commonly used in computers to store the user-accessible registers
- For example, in the MIPS32 processor, there are 32 32bit registers, referred to as R0, R1, R2, ..., R31.
- o Can be implemented in Verilog as independent registers, or as an array of registers similar to memory
- o Register banks often allow concurrent accesses
  - MIPS32 allows 2 reg reads and 1 reg write every clk cycle

Assumption  $\rightarrow$  read + write from same register



432x32 eeg file  
 module eegbank ( wdData1, wdData2, wrData, sr1, sr2, dr,  
 write, clk );  
 input clk, write;  
 input [4:0] sr1, sr2, dr; // source, destination registers  
 input [31:0] wrData;  
 output [31:0] rdData1, rdData2;  
 reg [31:0] regfile [0:31];  
 assign rdData1 = regfile [sr1];  
 assign rdData2 = regfile [sr2];  
 always @ (posedge clk)  
 if (write) regfile [dr] <= wrData;

end module

(refer vids for testbenches)

### lc 32 Basic Pipelining Concepts

Mechanism for overlapped execution of several I/P acts by further partitioning some computation into a set of k-sub-computations (or stages)

- Very nominal inc in cost of implementation
- Very significant speedup (ideally, k).
- Where is pipelining used in computer system?
  - Instruction execution
  - Arithmetic computation
  - Memory access  $\rightarrow$  reads to consecutive locations

### Example



while one machine is running, the other below  
is free for more items.

How does pipeline work?



### Processor pipeline

- To attain k times speed up for computation,
- Split computation into k stages
- Need for buffering:-

In hardware pipeline, we need a latch b/w successive stages to hold the intermediate results temporarily.

Model of a Synchronous k-stage Pipeline.



latches made with master-slave flip flops, and serve the purpose of isolating i/p's from o/p's  
pipeline stages typically combinational

When it comes → all latches transfer data to next stage simultaneously.

## Reservation Table

data structure that represents the utilization pattern of successive stages in a synchronous pipeline.

- Basically a space-time diagram of pipeline that shows precedence relationships among pipeline stages

X-axis  $\rightarrow$  time steps      Y axis - ~~step~~ stages

No. of columns  $\Rightarrow$  Evaluation time



## Characteristics

- ① multiple X's in row : repeated use of same stage in different cycles.
- ② Contiguous X's in row : extended use of a stage over one time period cycle
- ③ Multiple X in column : multiple stages used in parallel during a clock cycle

## Speedup & Efficiency

Some notations :  $T$  :: clock period of pipeline

$t_i$  : time delay of circuitry in stage  $s_i$

$d_i$  : delay of latch.

max stage delay:  $T_m = \max \{t_i\}$

Thus  $\tau = T_m + d_2$

Pipeline freq  $f = 1/\tau$

If one result is expected to come out of pipeline every clock cycle,  $f$  will represent the max throughput of pipeline.

Total time to process  $N$  data sets

$$T_k = [(k-1) + N]\tau$$

$(k-1)\tau$  time req to fill pipeline

1 result every  $\tau$  time after that  $\rightarrow$  total  $N\tau$ .

For equiv non pipelined processor

$$T_1 = Nk\tau$$

$$\text{Speed up } S_k = \frac{T_1}{T_k} = \frac{Nk\tau}{k\tau + (N-1)\tau} = \frac{Nk}{k + (N-1)}$$

As  $N \rightarrow \infty$ ,  $S_k \rightarrow k$

Pipeline Efficiency  $E_k = \frac{S_k}{k} = \frac{N}{k + (N-1)}$   
closeness of performance to ideal val

Pipeline throughput: No. of operations completed per unit time

$$H_k = \frac{N}{T_k} = \frac{N}{[k + (N-1)]\tau}$$

## Clock Skew / Jitter / Setup Time

• The minimum clock period of pipeline must satisfy the inequality:

$$t \geq t_{\text{skew+jitter}} + t_{\text{logic+setup}}$$

Skew: max delay difference b/w arrival of clock signals at the stage latches

Jitter: max delay diff b/w ~~arrive~~ arrival of clock signal at same latch. (Noise in environment)

Logic delay: max delay of slowest delay of in pipeline

Setup time: minimum time a signal needs to be stable at input of latch before it can be captured

## PIPELINE MODELLING

Example 4 N bit unsigned integers A, B, C, D as i/p  
N bit unsigned integer F as ~~output~~ o/p

$$S_1: x_1 = A + B; \quad x_2 = C - D;$$

$$S_2: x_3 = x_1 + x_2;$$

$$S_3: F = x_3 \times D$$

Same D is to be used in computation.

D must be forwarded to S2 and then to S3



## Lecture 34

Consider a pipeline that carries out the following stage wise operations:

- Inputs : 3 reg addresses ( $rs_1, rs_2, rd$ ), ALU func. (func)  
memory addr. (addr)
- Stage 1 : Read two 16 bit num from reg specified by " $rs_1$ " and " $rs_2$ " and store them in A and B
- Stage 2 : Perform ALU operation on A and B specified by "func" and store in Z.
- Stage 3 : write val of Z in reg specified by " $rd$ ".
- Stage 4 : write val of Z in memory location "addr".



### Clocking Issue in Pipeline

Important that consecutive stages be applied suitable clock for correct operation

Two options :  
 ① make clock to stop race condition  
 ② non overlapping 2<sup>th</sup> clk for consecutive pipeline stages



# SWITCH LEVEL MODELLING

(Page 35)

- Computerise of netlist of MOS transistors
- not very common
- Verilog provides ability to model digital circuits at Mos level
  - Transistor func. as switches → either conduct (ON) or open (OFF)

For Types of switches : ideal or resistive



Switch primitives

— nmos, — pmos, cmos  
anmoe → resistive

— supply1, supply0

(a) NMOS and PMOS

nmos inst-name ( o/p, i/p, control);      Tcontrol  
optional

cmos in ( o/p, i/p, ncontrol, pcontrol);

↓  
better switch



pullup, pulldown

pullup(a);

supply1 = vdd

x  
pullup

only

stronger

signal available



supply0

pulldown

## (c) Bidirectional Switches

conduct in both directions



## Part 2 Lecture 36

$$\begin{array}{c} \text{D} \\ \text{D} \\ = \end{array} \begin{array}{c} \text{D} \\ \text{D} \\ \text{D} \end{array}$$



## Path values & Signal Strengths



Path value  
inc

- supply
- strong
- pull
- large
- weak
- medium
- small
- highz

~~conduction~~  
~~2 signals drive~~