

27/4/22

Physical Design Workshop → VBD

① installation snap.

Day 1Theory

4: ivenilog and testbench.

simulator → to check RTL design's spec

Design Testbench → setup to apply stimulus to the design.Simulator working → change in ip  $\Rightarrow$  op is evaluated

output of a simulator → vcd file

↳ value change dump format.

↳ gtkwave → to view waveform.

Lab: ① VLSI → git clone from kunal's git page

Lab: ② verilog files → to load a file into ivenilog  
design file = tb-<design name>.v

command- ivenilog good-mux.v tb-good-mux.v

[•/a.out] gives vcd file

↳ to execute.

vut → unit under

test

cmd: gtkwave tb-good-mux.vcd

drag & drop inputs &  
ops.

zoom fit

→ symbol and arrow to move.

\$dumpfile

\$dumpvars

look into files → using gvim  
to open multiple files → use -o.Intro to yosys

synthesizer → RTL to netlist

tool used:

DESIGN

.lib

yosys

netlist file

representation of design in  
the form of standard cells.

command: `read_verilog` → to read the design  
`read.liberty` → to read the library files  
`write_verilog` → to obtain netlist

How to verify synthesis:



lib  
 ↓  
 not exhaustive  
 but such  
 enough to  
 implement  
 design

Why different flavours of the same gate?

$$T_{clk} > T_{CQ\_A} + T_{COMBI} + \dots$$



ip -> op  
 should be  
 one clock  
 cycle

Are fast cells sufficient? or why do we need slow cells?

$$T_{HOLD\_B} < T_{CQ\_A} + T_{COMBI}.$$

Requirement: ip of B should change after this window.

edge of clk, after  
 which data  
 shouldn't  
 change



sol: to guarantee a  
 minimum delay.  
 ↓

B shouldn't capture here. (for the same cycle)

Fast cells to meet setup criteria }  
Slow cells to meet hold criteria } thus variety of cells  
are reqd.

Fast cells vs slow cells → digital logic circuit

load of DLC → capacitance

fast changing and discharging of C → less delay.  
↓

transistors capable  
of sourcing more current.  
↓

Wide → less delay → Area & Power ↑

Narrow → more delay → Area &  
Power ↓

wide transistors

selection of cells → 'constraints'

↳ guidance to pick correct set  
of cells.

Synthesis illustration.

Lab:

Introduction to yosys synthesizer.

commands: yosys → to invoke synthesizer.

read-liberty -lib .. / my-lib / <filename>

flop  
↓  
another  
and to  
be executi

has significance  
read-verilog good-mux.v

synth -top good-mux

↳ module name.

abc-liberty .. /

↳ path to lib file

RTL to

netlist command.

ABC report -x.x

↳ note signals & cells.

show

↳ graphical version of logic realized.

4 cm  
3 ad ant.

## Gate level interpretation

O<sub>2</sub> air → orange



→ to generate netlist file

cmd: write\_verilog good mux  
netlist.v

modify the switch?

cmd: write-verilog -noather good-mux\_netslist.v

Day 3

Lab: ④

## Intro to timing lib file

$t\bar{t} \rightarrow$  typical

→ open library file in gvim.

$$025\text{C} \rightarrow 25^{\circ}\text{C}$$

Name of the library file → sky130\_

Any lib file

Process → variations due to fabrication. (bread toasting example)

semiconductors are sensitive to temp. conditions

Process,  
voltage,  
temperature.

∫ 1V80 → 1.8 Volts.

## technology

delay derodel  $\rightarrow$  IUT.

want unit of time, voltage, current, power, res. & cap.  
• tab file operating conditions

↳ contains definitions of different cells.

↓ and delay.  
feature → leakage power mentioned for

cells can be better understood

by viewing its equivalent  
verilog model

all possible ip combs.

\* area

\* power port info

\* details wst each pin

Lab : ⑤

## Hierarchical and Flat Synthesis

cmd: `gvim multiple_modules.v`



multiple module

- \* launch yosys      \* synth-top multiple-modules.v
  - \* read liberty file      \* generate netlist (abc)
  - \* read verilog file      \* show
- output: instances shown and not the gate

↳ hierarchical design.

- \* write netlist      → hierarchy preserved.
- \* view gvim

OR gate realized in terms of NAND. → classical action  
of DeMorgan's.

Why?

cmd: flatten

NAND implementation

cmd: write\_verilog multiple\_modules-flat.v

is preferred over

\* view the netlist

NOR

↓

Stacking PMOS  
is bad.

Task: given multiple modules, how to synthesize

mention submodule only submodule?

change in synth-top command.

↓

because more  
width reqd for

good logical  
effort.

Why do we need to do submodule synthesis?

① If same module is replicated n times in  
a time, it can be synthesized once and the result  
can be replicated & put

② Divide and conquer method

together.

↳ for massive designs.

Lab 6 files: dff-async-set  
dff-asyncreset  
dff-asyncreset-sync

dff-sync.

## Flops and Flop coding style.

Propagation delay in comb. circuits  $\rightarrow$  glitches @ output.

Example:



using timing diagram  $\rightarrow$  glitch was explained.



chain of combinational circuits  $\rightarrow$  output(s) will continuously be glitchy.

Sol: D FF.

Flops provide shielding.  $\rightarrow$  Main purpose of using flops in digital circuits.

Reset / Set pin  $\rightarrow$  to initialise a flop

Asynchronous.

Synchronous

async  $\rightarrow$  irrespective of clk.

sync  $\Rightarrow$  D pin of the flop.

depends on  
clock.



same flop can have sync & async reset.

set and reset together can cause race conditions.

Lab 6:

Synthesis and simulation of flops:

\* Simulation of all types of flops given.  $\rightarrow$  snaps of op.

Sync reset given preference

CO<sub>2</sub> standard  
cells & ffs are  
kept separately  
@ time

dfflib map

$\rightarrow$  after the 'abc' step.

Lab 7:

## Interesting optimisations

two modules  $\xrightarrow{\text{mult-2.v}}$   
 $\xrightarrow{\text{mult-8.v}}$

mult-2  $\rightarrow$  truth table observation.

$a \times 2 \Rightarrow a$  appended with 0  $\rightarrow$  in binary

Take example of  $a \times 9$ )

$$\hookrightarrow a * [8+1] \Rightarrow a * 8 + a * 1.$$

6 bits  
a followed by three zeros.  $\Rightarrow y$  will be  $\boxed{ap}$   
3 bits 3 bits.

Day 3

## Combinational and sequential Optimisation

Introduction to optimisation  $\rightarrow$  logic

Combinational  $\rightarrow$  Why?  $\rightarrow$  to squeeze the logic to get the most optimised design.

Techniques:

$\hookrightarrow$  Area & Power.

\* Constant Propagation  $\rightarrow$  Direct.  $\rightarrow$  eg =  $y = (AB+C)'$   
where A always 0

\* Boolean Logic Optimisation

$\downarrow$

$\rightarrow \bar{a}\bar{c} + ac$ .

given expr.

assign  $y = a ? (b ? c : (c ? a : d)) : (\neg c)$

Simplified expr  $y = a \oplus c$

## Sequential

\* Basic - sequential constant propagation

\* Advanced

state  
optimisation

Retiming

$\downarrow$   
condensation

splitting logic  
equally for  
imp. sorting max  
op. frequency.

Sequential logic Cloning  
(Floor Plan Aware Synthesis)



$\downarrow$  minimize  
routing delay.  
(huge tree slack  
prerequisite)

'de-asserted'.

sequential constant propagation

example:



$$\bar{a} \cdot 0 + ab$$

$y$  is always 1

irrespective of  $D$ ,  $clk$ ,  $A$  and  
 $rst$ .

same example with `rst` option

↳ constant propagation cannot be applied.

Lab 08:

combinational logic optimisation part 1.

(6 as per the course)

files → all opt files.

cmd: opt-clean -purge

opt-check

opt-check2

opt-check3

for multiple modules:

first flatten and then optimise.

Lab 09:

sequential logic optimisation

files → dff-conc1..

const425 → task.

(7 as per the course)

sequential optimisation of unused outputs.

counter-opt.v → synthesized module.



modify counter and observe op.

$$q_1 = (\text{count}[2:0] == 3'b100)$$

## Day 4

### gate level simulation (GLS)

- running TB with netlist as DUT.
- ↓
- Why ?
- to verify logical correctness
  - to ensure timing (run with delay annotation) to be
- logically same as RTL code

### GLS using ivinilog



### Synthesis - simulation mismatch

- \* Missing sensitivity list
- \* Blocking vs non-blocking assignments
- \* Non standard verilog coding .

comes into picture only inside the 'always' block .

- = → blocking → sequential execution
- <= → non blocking → concurrent execution .  
↳ RHS evaluated first .

caveats with blocking statements

↓  
SR example

\* bad-mux  
\* good-mux

Lab (10): cmd to read verilog models.

(8 as per course)

iverilog -I my-lib/ ~~mylib~~ verilog-models/primitives.v  
sky130\_fd\_sc\_hd.v  
and then netlist and test bench.

Lab (11): Mismatch due to blocking statements.

(9 as per course)

file blocking-caveat.v

### Day 5

If case constructs

if → to create priority logic

if <cond1>

:  
else if <cond2>

:  
else if <cond3>

:  
else : }'E

H/W



Danger / caution → inferred latches.

↳ Bad coding style

incomplete if → latch →

↳ effect of incomplete if statement.

required @ timee

ex: counter.

no inferred latches in combinational circuits.

variable to which the value is assigned inside a if or case statement should be a `reg`!



for - generate → to replicate hardware

genvar i;

generate

for (i=0; i<8; i=i+1)

and u-and (.a(in1[i]), .b(in2[i]), .y(y[i]));

end

end generate

// creates 8 instances of and gate.

Example: Ripple Carry Adder. (RCA)

if generate can also  
be used

~~outside the~~ always block.

~~Lab 14~~

Lab(14):

mux - generate.v

rca.v

01

demux - case.v

fa.v

sel[1] & sel[0]

demux - generate.v

Rule for addition

N-Bit + N-Bit number → N+1 output

N-Bit + M Bit → max(N, M) + 1 bit output.