

# Digital Design and Computer Architecture (DDCA)

Managing Complexity:

Flavian Kaufmann  
flkaufmann@ethz.ch  
github.com/flavian112/ethz\_ddca

Abstraction:

| Level                                        | Example                 |
|----------------------------------------------|-------------------------|
| Application Software                         | Programs                |
| Operating Systems                            | Device Drivers          |
| Architecture                                 | Instructions, Registers |
| Microarchitecture                            | Datapath, Controllers   |
| Logic                                        | Address, Memories       |
| Digital Circuits                             | Gates                   |
| Analog Circuits                              | Amplifiers              |
| Devices                                      | Transistors, Diodes     |
| Physics                                      | Electrons               |
| → Hiding details when they are not important |                         |

Discipline:  
→ Intentionally restricting design choices for greater productivity at higher abstraction level.

Three Y's:

- Hierarchy
  - A system is divided into modules of smaller complexity
- Modularity
  - Having well defined functions and interfaces
- Regularity
  - Encouraging uniformity, so modules can be easily reused

Circuit Design Goals:

- Optimize...
- Area (proportional to cost)
- Speed/Throughput
- Power/Energy
- Design Time

Principles:

- Lazy, be creative
- Why?, take nothing for granted
- Engineering is not religion, use what works best for you.
- KISS (Keep it simple and stupid), manage complexity.

Moore's Law:

Gordon Moore (1965): Number of transistors that can be manufactured doubles roughly every 18 months.

## Binary Numbers



| Decimal | Binary | Hexadecimal | n  | 2^n   | IEC       | x Si          |
|---------|--------|-------------|----|-------|-----------|---------------|
| 0       | 0000   | 0           | 0  | 1     |           |               |
| 1       | 0001   | 1           | 1  | 2     |           |               |
| 2       | 0010   | 2           | 2  | 4     |           |               |
| 3       | 0011   | 3           | 3  | 8     |           |               |
| 4       | 0100   | 4           | 4  | 16    |           |               |
| 5       | 0101   | 5           | 5  | 32    |           |               |
| 6       | 0110   | 6           | 6  | 64    |           |               |
| 7       | 0111   | 7           | 7  | 128   |           |               |
| 8       | 1000   | 8           | 8  | 256   |           |               |
| 9       | 1001   | 9           | 9  | 512   |           |               |
| 10      | 1010   | A           | 10 | 1024  | Ki (kibi) | 10^3 k (kilo) |
| 11      | 1011   | B           | 11 | 2048  |           |               |
| 12      | 1100   | C           | 12 | 4096  |           |               |
| 13      | 1101   | D           | 13 | 8192  |           |               |
| 14      | 1110   | E           | 14 | 16384 |           |               |
| 15      | 1111   | F           | 15 | 32768 |           |               |
|         | 16     | 100000      | 16 | 65536 |           |               |
|         | 20     | Mi (mibi)   | 20 | 10^6  | M (mega)  |               |
|         | 30     | Gi (gibi)   | 30 | 10^9  | G (giga)  |               |
|         | 40     | Ti (tebi)   | 40 | 10^12 | T (tera)  |               |
|         | 50     | Pi (petbi)  | 50 | 10^15 | P (peta)  |               |

Sign/Magnitude:

-x = 2^n - x

Range for Nbit number:  $-2^{N-1} \dots 2^{N-1} - 1$

Two zeroes, addition doesn't work.

-x = x̄

Range for Nbit number:  $-2^{N-1} \dots 2^{N-1} - 1$

Two zeroes, addition by end-around carry.

One's complement:

-x = x̄

Range for Nbit number:  $-2^{N-1} \dots 2^{N-1} - 1$

Two zeroes, addition by end-around carry.

Two's complement:

-x = x̄ + 1, ignoring overflow

Range for Nbit number:  $-2^{N-1} \dots 2^{N-1} - 1$

One zero, addition like unsigned addition

## IEEE-754 Floating Point:

S E F

sign exponent fraction

(-1)^S  $\cdot$  1.F  $\cdot$  2^E-bias  
Mantissa

| Name             | Size (bits) | Mantissa (bits) | Exponent (bits) | Exponent Bias | Max                  | Min (pos)              |
|------------------|-------------|-----------------|-----------------|---------------|----------------------|------------------------|
| Half precision   | 16          | 10.1            | 5               | 15            | $6.55 \cdot 10^4$    | $6.4 \cdot 10^{-6}$    |
| Single precision | 32          | 23.1            | 8               | 127           | $3.4 \cdot 10^{38}$  | $1.48 \cdot 10^{-38}$  |
| Double precision | 64          | 52.1            | 11              | 1023          | $1.8 \cdot 10^{308}$ | $2.23 \cdot 10^{-308}$ |

### Special Values

- ±0: S=0, E=0, M=0
- ±Inf: S=0/1, E=1..1, M=0
- NaN: E=1..1, M≠0

### Fixed Point:

- m integer bits, k fraction bits

$$b_{m-1} \dots b_0.b_{k-1} \dots b_0$$

### Addition:

- Extract exponent and fraction bits
- Prepend leading 1 to form mantissa
- Compose exponent
- Shift smaller mantissa if necessary
- Add mantissas
- Normalize mantissa and adjust exponent if necessary
- Round result
- Assemble back into floating point format

## Combinational Circuits

- A circuit is combinational if it consists of interconnected circuit elements such that:
- Every circuit element is itself combinational.
  - Every node of the circuit is either designated as an input to the circuit or connects to exactly one output terminal of a circuit element.
  - The circuit contains no cyclic paths: every path through the circuit visits each circuit node at most once



Arbitrary combinational logic circuit with n inputs and m outputs



Each output  $y_j$  for  $j \in \{0, \dots, m-1\}$  is a function  $f_j$  of the inputs  $a_i$  for  $i \in \{0, \dots, n-1\}$ :

$$y_j = f_j(a_0, \dots, a_{n-1})$$

Each output can also be represented as a truth table.

## Timing

A → Y  
t<sub>cd</sub>: contamination delay  
t<sub>pd</sub>: propagation delay



## Truth Table

| A | B | C | Y |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 1 | 0 | 0 |
| 1 | 0 | 0 | 0 |
| 1 | 1 | 0 | 1 |

A truth table (n inputs) has 1 output and  $2^n$  rows.

## Terminology

| Name              | Description                             | Examples                              |
|-------------------|-----------------------------------------|---------------------------------------|
| Literal           | A variable or its complement            | A, B̄                                 |
| Product/Implicant | And composition of literals             | A, B̄A, ĀB̄ĀC                       |
| Minterm           | Product that contains all of the inputs | A, B, C, ĀB, ĀC, B̄C, ĀB̄C, ĀB̄C̄ |
| Sum               | Or composition of literals              | Bv, Ā, ĀB̄v                         |
| Maxterm           | Sum that contains all of the inputs     | ĀBvB̄C, ĀB̄vB̄C̄                    |

## Sum-of-Products Canonical Form

## Product-of-Sums Canonical Form

The SOP canonical form is the sum of all minterms where the output Y is 1.

The POS canonical form is the product of all maxterms where the output Y is 0.

$$Y = (\bar{A}\bar{B}) \vee (\bar{A}B) \vee (A\bar{B}) \vee (AB)$$

Sigma notation:  $Y = F(A, B) = \sum(m_0, m_1, m_2, m_3)$

$$Y = (\bar{A}B) \wedge (\bar{A}B) \wedge (A\bar{B}) \wedge (AB)$$

Pi notation:  $Y = F(A, B) = \prod(M_0, M_1, M_2, M_3)$

## Boolean Algebra

| Axiom                | Dual                  | Name         |
|----------------------|-----------------------|--------------|
| A1 A + 0 = A = 0     | A1' A + 0 = A = 1     | Binary Field |
| A2 0 + 1 = 1         | A2' 1 + 0 = 0         | Not          |
| A3 0 · 0 = 0         | A3' 1 · 1 = 1         | And/Or       |
| A4 1 · 1 = 1         | A4' 0 · 0 = 0         | And/Or       |
| A5 0 · 1 = 1 · 0 = 0 | A5' 0 · 1 = 1 · 0 = 1 | And/Or       |

| Theorem       | Dual           | Name         |
|---------------|----------------|--------------|
| T1 A · A = A  | T1' A + A = 1  | Identity     |
| T2 A · 0 = 0  | T2' A + 1 = 1  | Null element |
| T3 A · A = A  | T3' A + A = A  | Idempotency  |
| T4 Ā · A = 0 | T4' A + Ā = 1 | Involutive   |
| T5 A · Ā = 0 | T5' A + Ā = 1 | Complements  |

| Theorem (multi variable)                             | Dual                                                 | Name           |
|------------------------------------------------------|------------------------------------------------------|----------------|
| T6' A v B = B v A                                    | T6 A · B = B · A                                     | Commutativity  |
| T7' (A v B) v C = A v (B v C)                        | T7 (A · B) · C = A · (B · C)                         | Associativity  |
| T8' (A v B) · (A v C) = A v (B · C)                  | T8' (A · B) v (A · C) = A · (B v C)                  | Distributivity |
| T9' A · A = A                                        | T9' A v A = A                                        | Covering       |
| T10' (A · B) v (A · B̄) = A                          | T10' (A v B) · (A v B̄) = A                          | Combining      |
| T11' (A · B) v (A · C) v (B · C) = (A · B) v (B · C) | T11' (A v B) · (A v C) · (B v C) = (A v B) · (B v C) | Consensus      |
| T12' (A · B̄) = A · B̄                               | T12' (A v B̄) = A v B̄                               | de Morgan      |



- Use the fewest circles necessary to cover all the 1/0's.
- All squares in each circle must only contain 1/0's or X's.
- Each circle must span a rectangular block with side lengths that are a power of two.
- Each circle should be as large as possible.
- A circle may wrap around the edges of a K-map.
- A 1/0 in a K-map may be circled multiple times if doing so allows fewer circles to be used.
- Circles of 1s are products and circles of 0s are sums.

## Logic Gates



## Multiplexer (MUX):

| S | D <sub>0</sub> | D <sub>1</sub> | Y |
|---|----------------|----------------|---|
| 0 | 0              | 0              | 0 |
| 0 | 0              | 1              | 1 |
| 0 | 1              | 0              | 0 |
| 1 | 0              | 0              | 0 |
| 1 | 0              | 1              | 0 |
| 1 | 1              | 0              | 1 |
| 1 | 1              | 1              | 1 |



In general, wider ( $2^n : 1$ ) multiplexers can be achieved by two level logic, an array of tri-state buffers with a ( $n : 2^n$ ) decoder, or in a hierarchical design.

## AND:



## OR:



## XOR:



## Tri-State Buffer:



## Combinational Building Blocks

### Decoder (DEC):

|                |                |                |                |                |                |
|----------------|----------------|----------------|----------------|----------------|----------------|
| A <sub>1</sub> | A <sub>0</sub> | Y <sub>3</sub> | Y <sub>2</sub> | Y <sub>1</sub> | Y <sub>0</sub> |
| 0              | 0              | 0              | 0              | 1              | 0              |
| 0              | 1              | 0              | 0              | 1              | 0              |
| 1              | 0              | 0              | 1              | 0              | 0              |
| 1              | 1              | 1              | 0              | 0              | 0              |

Decoder 10-2: 00, 01, 10, 11

The outputs of a decoder are also the corresponding minterms.



## Arithmetic

### Half Adder:



### Full Adder:



### Carry Propagation Adders (CPA):



### (Parallel) Prefix Adder (PPA):



$$G_{i-1} = Cin \\ P_{i-1} = 0$$

$$t_{PPA} = t_{PGT} + \log_2(N) t_{PG} \text{Prefix} + t_{XOR}$$

### Carry Save Adder (CSA):



### Ripple Carry Adder:



### Carry Lookahead Adder:



For  $N > 16$  a CLA is generally faster, but still scales linearly with respect to propagation delay.

### Comparator:



### Subtractor:



### Left Shift:



### Right Shift:



### Arithmetic Right Shift:



## Parallel Multiplier:



$$\begin{array}{r}
 1 & 0 & 1 & 0 \\
 \times & 0 & 1 & 0 & 1 \\
 \hline
 1 & 0 & 1 & 0 \\
 0 & 0 & 0 & 0 \\
 1 & 0 & 1 & 0 & \text{Partial} \\
 + & 0 & 0 & 0 & 0 & \text{Products} \\
 \hline
 0 & 1 & 1 & 0 & 0 & 1 & 0
 \end{array}$$

## Synchronous Sequential Logic

- Synchronous sequential logic circuits consist of interconnected circuit elements such that:
- Every circuit element is either a register or a combinational circuit.
  - At least one circuit element is a register.
  - All registers receive the same clock signal.
  - Every cyclic path contains at least one register.

## Sequential Building Blocks

### SR Latch:



### D Latch:



### D Flip-Flop:



A register is an array of D Flip-Flops that all have the same clock signal.

## Finite State Machines

### Mealy Machine:

- $(S, S_0, \Sigma, \Lambda, T, G)$
- $S$ : Finite set of states
  - $S_0 \in S$ : Initial State
  - $\Sigma$ : Input Alphabet
  - $\Lambda$ : Output Alphabet
  - $T: S \times \Sigma \rightarrow S$ : Transition Function
  - $G: S \times \Sigma \rightarrow \Lambda$ : Output Function



### Moore Machine:

- $(S, S_0, \Sigma, \Lambda, T, G)$
- $S$ : Finite set of states
  - $S_0 \in S$ : Initial State
  - $\Sigma$ : Input Alphabet
  - $\Lambda$ : Output Alphabet
  - $T: S \times \Sigma \rightarrow S$ : Transition Function
  - $G: S \rightarrow \Lambda$ : Output Function



### Timing:



## Transistors

MOSFETs (Metal Oxide Semiconductor Field Effect Transistor):



NOT:



## Logic Level Noise Margins



## Memories

- Flip-Flops / Latches (Registers)**
  - Very fast, parallel access
  - Expensive (2+ transistors per bit)
- SRAM (Static RAM)**
  - relatively fast, only one data word at a time
  - less expensive (6 transistors per bit)
- DRAM (Dynamic RAM)**
  - slower, reading destroys content (refresh), one data word at a time
  - cheaper (4 transistors per bit)
- Other (HDD, Flash, ...)**
  - much slower
  - no transistors involved, non-volatile



Transmission Gate: Tri-State Buffer (inverted): General:



## MIPS

Instruction Formats:

R-Type:

| opcode            | rs                   | rt                   | rd                   | shamt                | funct              |
|-------------------|----------------------|----------------------|----------------------|----------------------|--------------------|
| 31 ... (6bits)... | 26 25 ... (5bits)... | 21 20 ... (5bits)... | 16 15 ... (5bits)... | 11 10 ... (5bits)... | 6 5 ... (6bits)... |

I-Type:

| opcode            | rs                   | rt                   | imm                    |
|-------------------|----------------------|----------------------|------------------------|
| 31 ... (6bits)... | 26 25 ... (5bits)... | 21 20 ... (5bits)... | 16 15 ... (16 bits)... |

J-Type:

| opcode            | addr                  |
|-------------------|-----------------------|
| 31 ... (6bits)... | 26 25 ... (26bits)... |

Instructions:

| Mnemonic | Name                       | Type | Operation                                            | Op | Ums | Fmt | Notes                             |
|----------|----------------------------|------|------------------------------------------------------|----|-----|-----|-----------------------------------|
| add      | Add                        | R    | $rd = rs + rt$                                       | ✓  | 0   | 20  |                                   |
| addu     | Add Unsigned               | R    | $rd = rs + rt$                                       | 0  | 21  |     | May generate overflow exception.  |
| addi     | Add Imm                    | I    | $rt = rs + \text{SignExtImm}$                        | ✓  | 8   |     | Operands are considered unsigned. |
| addiu    | Add Imm Unsigned           | I    | $rt = rs + \text{SignExtImm}$                        | -  | 9   |     |                                   |
| sub      | Subtract                   | R    | $rd = rs - rt$                                       | ✓  | 0   | 22  |                                   |
| subu     | Subtract Unsigned          | R    | $rd = rs - rt$                                       | 0  | 23  |     |                                   |
| and      | And                        | R    | $rd = rs \& rt$                                      | 0  | 24  |     |                                   |
| andi     | And Imm                    | I    | $rt = rs \& \text{ZeroExtImm}$                       | C  | -   |     |                                   |
| or       | Or                         | R    | $rd = rs \vee rt$                                    | 0  | 25  |     |                                   |
| ori      | Or Imm                     | I    | $rt = rs \vee \text{ZeroExtImm}$                     | D  | -   |     |                                   |
| nor      | Nor                        | R    | $rd = \sim(rs \vee rt)$                              | 0  | 27  |     |                                   |
| sll      | Shift Left Logic           | R    | $rd = rt \ll \text{shamt}$                           | 0  | 0   |     |                                   |
| srl      | Shift Right Logic          | R    | $rd = rt \gg \text{shamt}$                           | 0  | 2   |     |                                   |
| sra      | Shift Right Arithmetic     | R    | $rd = rt \ggg \text{shamt}$                          | 0  | 3   |     |                                   |
| slt      | Set Less Than              | R    | $rd = rs < rt ? 1 : 0$                               | 0  | 2A  |     |                                   |
| slti     | Set Less Than Imm          | I    | $rt = rs < \text{SignExtImm} ? 1 : 0$                | A  | -   |     |                                   |
| sltu     | Set Less Than Unsigned     | R    | $rd = rs < rt : PC = 4 + \text{BranchAddr}$          | ✓  | 0   | 2B  |                                   |
| sltiu    | Set Less Than Imm Unsigned | I    | $rt = rs < \text{SignExtImm} ? 1 : 0$                | B  | -   |     |                                   |
| beq      | Branch on eq               | I    | $\text{if } rs == rt : PC = 4 + \text{BranchAddr}$   | 4  | -   |     |                                   |
| bne      | Branch on not eq           | I    | $\text{if } rs \neq rt : PC = 4 + \text{BranchAddr}$ | 5  | -   |     |                                   |
| jr       | Jump                       | J    | $PC = \text{JumpAddr}$                               | 2  | -   |     |                                   |
| jal      | Jump and Link              | J    | $PC = \text{JumpAddr}$                               | 3  | -   |     |                                   |
| jr       | Jump Register              | R    | $PC = rs$                                            | 0  | 8   |     |                                   |
| lw       | Load Word                  | I    | $rt = M[rs + \text{SignExtImm}]$                     | 23 | -   |     |                                   |
| lhu      | Load Half Unsigned         | I    | $rt = M[rs + \text{SignExtImm}] \& 15:0$             | 25 | -   |     |                                   |
| lbu      | Load Byte Unsigned         | I    | $rt = M[rs + \text{SignExtImm}] \& 1:0$              | 24 | -   |     |                                   |
| lui      | Load Upper Imm             | I    | $rt = \$imm, 16:0$                                   | F  | -   |     |                                   |
| sw       | Store Word                 | I    | $M[rs + \text{SignExtImm}] = rt$                     | 28 | -   |     |                                   |
| sh       | Store Half                 | I    | $M[rs + \text{SignExtImm}] \& 15:0] = rt[15:0]$      | 29 | -   |     |                                   |
| sb       | Store Byte                 | I    | $M[rs + \text{SignExtImm}] \& 1:0] = rt[1:0]$        | 28 | -   |     |                                   |
| mfhi     | Move from Hi               | R    | $rd = hi$                                            | 0  | 40  |     |                                   |
| mflo     | Move from Lo               | R    | $rd = lo$                                            | 0  | 42  |     |                                   |
| mult     | Multiply                   | R    | $hi, lo_3 = rs * rt$                                 | 0  | 18  |     |                                   |
| multu    | Multiply Unsigned          | R    | $hi, lo_3 = rs * rt$                                 | ✓  | 0   | 43  |                                   |

## Assembly

.data staticVar: .word 0 // 4 bytes

.text .global main

```
main:
    addi $s0 $t0 0 // 1st arg
    addi $s1 $t1 0 // 2nd arg
    addi $s2 $t2 0 // 3rd arg
    addi $s3 $t3 0 // 4th arg
    addi $sp $sp -24 // increase stack for args
    sw $t4 16($sp) // 5th arg
    sw $t5 20($sp) // 6th arg
    jal func // call function
    addi $sp $sp 24 // decrease stack
    addi $s6 $v0 0 // return value
end:
```

j end // halt loop

```
func:
    addi $sp $sp -36 // push stack frame // prologue
    sw $ra 60($sp) // save registers
    sw $s0 24($sp)
    :
    sw $s7 52($sp)
    sw $fp 56($sp)
    addi $fp $sp 36 // set frame ptr
    addi $s0 $fp 0 // 1st arg
    addi $s1 $fp 0 // 2nd arg
    addi $s2 $fp 0 // 3rd arg
    addi $s3 $fp 0 // 4th arg
    lw $t4 4($fp) // 5th arg
    lw $t5 8($fp) // 6th arg
    // body
    addi $s0 $t6 0 // return value
    lw $fp 56($sp) // restore registers
    lw $s7 52($sp)
```

```
// body
    addi $s0 $t6 0 // return value
    lw $fp 56($sp) // restore registers
    lw $s7 52($sp)
    lw $s0 24($sp)
    lw $sra 60($sp)
    addi $sp $sp 36 // pop stack frame
    jr $sra // return to caller
```

Stack frame:



Memory Layout:



## Single Cycle Microarchitecture



| Instruction | Op     | RegWrite | RegDst | ALUSrc | Branch | MemWrite | MemToReg | ALUOp | Jump |
|-------------|--------|----------|--------|--------|--------|----------|----------|-------|------|
| R-Type      | 000000 | 1        | 1      | 0      | 0      | 0        | 0        | 10    | 0    |
| lw          | 100011 | 1        | 0      | 1      | 0      | 0        | 1        | 00    | 0    |
| sw          | 101011 | 0        | X      | 1      | 0      | 1        | X        | 00    | 0    |
| beq         | 000100 | 0        | X      | 0      | 1      | 0        | 0        | 01    | 0    |
| addi        | 001000 | 1        | 0      | 1      | 0      | 0        | 0        | 00    | 0    |
| j           | 000010 | 0        | X      | X      | X      | 0        | X        | XX    | 1    |



Critical Path:  
 $T_c = t_{pcq\_PC} + t_{mem} + \max(t_{RF\_read}, t_{ext}, t_{mux}) + t_{ALU} + t_{mem} + 2t_{mux} + t_{RF\_setup}$   
 Time to execute N Instructions =  $N \cdot \frac{CPT}{1 - \frac{1}{f_c}}$

## Pipelined Architecture



$T_c > \max($

$t_{pcq} + t_{mem} + t_{setup},$   
 $2(t_{RF\_read} + t_{mem} + t_{ext} + t_{RF\_mux} + t_{setup}),$   
 $t_{pcq} + t_{mem} + t_{ALU} + t_{setup},$   
 $t_{pcq} + t_{memwrite} + t_{setup},$   
 $2(t_{pcq} + t_{mem} + t_{RF\_write})$

fetch  
decode  
execute  
memory  
writeback

//Forwarding Logic:

```
assign ForwardAD = (RD1==RLE) & (RD2==RLE) & RegWriteM
assign ForwardBD = (RD1==RLE) & (RD2==RLE) & RegWriteM & RegWriteE
```

//Stalling Logic:

```
assign ldstall = ((RD == RLE) | (RLE == RD)) & MemToRegE
assign branchstall = (BranchD & RegWriteE & (WriteRegE == RD) | WriteRegE == RLD) |
    (BranchD & MemToRegM & (WriteRegM == RD) | WriteRegM == RLD)
```

//Stall signals:

```
assign StallF, StallD, FlushE = ldstall | branchstall
```

## Hazards

- Data Hazard: Data for next instruction is not ready
- Control Hazard: Next Instruction is unknown

## Pipeline:



## Branch Resolution:



## Data Forwarding:



## Pipeline Stalling:



## Verilog:

Comments:  
// One-liner  
// Multiple  
lines //

Numeric Constants:  
8'b\_0110\_1010 // binary  
8'd152 // octal  
8'd106 // decimal  
8'h6A // hexadecimal  
"j" // ASCII  
7'b2 // High Impedance

Nets and Variables:  
wire [3:0] w; // assign outside always  
reg [3:0] r; // assign inside always  
reg [2:0] men [3:0];

Parameters:  
parameter N=8;  
localparam S=2'd3;

Assignments:  
assign Output = A\*B;  
assign Zc,D3 = 2'd15&[1:2],C[1:2],E[3];

Operators:  
A[D:1][1:D] // Select  
&A, ~&A, IA, ~IA, IA, ^A // Reduction  
!A, ~A // Complement  
+A, -A // Unary  
{A, ..., B} // Concatenate  
ENLA3 // Replicate  
A\*B, A/B, A%B, A+B, A-B // Arithmetic  
A<B, A>B, A>>B // Shift  
A>B, A>=B, A<=B, A=B // Relation  
A&B, A|B, A^B, A~B // Bitwise  
A&B, A|B // Logical  
A?B:C // Conditional

## Module:

```
module MyModule
#(parameter N=8) // optional parameter
(input rst, clk,
output [N-1:0] out);
  //impl
endmodule
```

## Case:

```
always @(*) begin
  case (num)
    2'b00: A = 8'd3; //blocking
    2'b01,
    2'b10: A = 8'd10;
    2'b11: A = 8'd2;
  endcase
end
```

## always @(\*):

```
begin
  casex (dec)
    4'b0xxx: enc = 2'b00;
    4'b00xx: enc = 2'b01;
    4'b001x: enc = 2'b10;
    4'b0001: enc = 2'b11;
    default: enc = 2'b00;
  endcase
end
```

## Generate:

```
genvar j;
wire [1:0]out [3:0];
generate
  for(j=0; j<20; j=j+1)
    begin: Gen_Modules
      MyModule #(N(1)) mod(
        .rst(RST), .clk(CLK),
        .out(out[j]));
    end
  endgenerate
end
```

## Synchronous:

```
always @ (posedge clk) begin
  if(rst) B <= 0; // (non-blocking)
  else B <= B + 1'b1;
end
```

## Loop:

```
always @ (n) begin
  count = 0;
  for(j=0; j<8; j=j+1)
    count = count + input[i];
end
```

## Function:

```
function [2:0]F;
  input [3:0]A;
  input [3:0]B;
  begin
    F = {A+1'b1,B+2'd2};
  end
endfunction
```

## State Machine:

```
reg [3:0]state;
parameter S0 = 2'b00,
S1 = 2'b01,
S2 = 2'b10,
S3 = 2'b11;
```

## always @ (posedge clk):

```
if (rst) state = S0;
else case (state)
  S0: state = S1;
  S1: state = S2;
  S2: state = S3;
  S3: state = S0;
endcase
```

## end

## Testbenches:

```
'timescale 1ns/1ps //unit time>/resolution
initial
always
#10 // delay by 10ns
$display ("%b bin,%h hex,%d dec,%o oct,%c ASCII,%s str,%t time",
...,$time)
$readmemh("filename", testbuf)
$finish
```

```
module d_latch(input d,en,rstn,
                  output reg q);
  always @ (en or rstn or d)
    if (rstn) q <= 0;
    else if (en) q <= d;
  endmodule
```

```
module d_flipflop(input d,rstn,clk,
                   output reg q);
  always @ (posedge clk or negedge rstn)
    if (rstn) q <= 0;
    else q=d;
  endmodule
```

## Functional Verification:

