

# Data Representation - CS 132

## Decimal to Binary Conversion

Repeatedly divide by 2 to get remainders  $r_0, r_1, \dots, r_n$

then binary representation is  $r_n r_{n-1} \dots r_1 r_0$

## Decimal to Octal or Hex

Same as binary but dividing by 8 and 16 respectively.

### Addition in binary

$$\begin{array}{r} 100 4 \\ 010 2 \\ \hline 110 6 \end{array} \quad \begin{array}{r} 111 7 \\ 001 1 \\ \hline 0110 \end{array}$$

Overflow

### Subtraction:

To subtract we add a number to the other numbers negated value

### Signed Magnitude

$$\begin{array}{cccc} | & | & 0 & | \\ 2^2 & 2^1 & 2^0 & \\ \text{negative?} & & & \end{array}$$

$$-1 (2^2 + 0 + 1) = -5$$

### Addition in two's complement

Same as before, but ignore overflow

### Floating Point

Similar to scientific notation

mantissa  $\times 10^{\text{exponent}}$

### Two's Complement

$$\text{has range } -2^{N-1} \rightarrow 2^{N-1} - 1$$

$$\begin{array}{cccc} | & | & 0 & | \\ -2^3 & 2^2 & 2^1 & 2^0 \\ \# & & & \end{array}$$

$$-8 + 4 + 0 + 1 = -3$$

### Fractional Numbers

#### Fixed Point

$$0.11 = \frac{1}{2^1} + \frac{0}{2^0} + \frac{1}{2^{-1}} + \frac{1}{2^{-2}} \\ 2 + 0 + \frac{1}{2} + \frac{1}{4} = \frac{3}{4}$$

### IEEE Floating Point

more range, less precision.  
32 bit representation:

1 1111111111111111111111111111111

Sign Exponent Mantissa.

# Digital Logic - CS 132

## Moore's Law

Predicted the number of transistors on a chip would double every 12-18 months.

### Consequences

- Cost of computer logic and memory circuitry has fallen at a dramatic rate
- As logic and memory is more tightly packed, the operating speed is increased.
- Computers become smaller
- reduction in power and cooling requirements.
- interconnections on ICs are more reliable than solder connections, so having fewer off-chip connections increases reliability

### Combinatorial logic circuit

A logic circuit whose output is a logical function of its input

POSITION IS IMPORTANT

|   | AB | 00 | 01 | 11 | 10 |
|---|----|----|----|----|----|
| C | 00 | 0  | 1  | 0  |    |
| 0 | 1  | 0  | 1  | 0  |    |
| 1 | 1  | 1  | 1  | 0  |    |

$$\bar{A} \cdot \bar{B} + C \cdot \bar{A} + A \cdot B$$

Groupings:

Only allow grouping of adjacent squares



Can also loop around the K-map

|   | AB | 00 | 01 | 11 | 10 |
|---|----|----|----|----|----|
| C | 00 | 1  | 0  | 0  | 1  |
| 0 | 1  | 1  | 1  | 1  | 1  |
| 1 | 1  | 1  | 1  | 1  | 1  |

~~Boolean Expressions~~

Simplifying boolean expressions with boolean algebra.  
distributive law (AND)

$$A+B \cdot C = (A+B) \cdot (A+C)$$

Karnaugh maps

A B C f Sum of products

$$0 \ 0 \ 0 \ 1 = \bar{A} \cdot \bar{B} \cdot \bar{C} + \bar{A} \cdot B \cdot C +$$

$$0 \ 0 \ 1 \ 0 = A \cdot \bar{B} \cdot \bar{C} + \bar{A} \cdot \bar{B} \cdot C +$$

$$0 \ 1 \ 0 \ 0 = A \cdot B \cdot C$$

we can Simplify this algebraically,

but karnaugh maps are easier.

0 1 1 1

1 0 0 0

1 0 1 0

1 1 0 1

1 1 1 0

Don't Care Conditions

Allows us to Simplify expressions, we can use it in Kmap if it Simplifies it.

# Digital Logic Circuits - CS 132

## Half-Adder

| A | B | Sum | Carry |
|---|---|-----|-------|
| 0 | 0 | 0   | 0     |
| 0 | 1 | 1   | 0     |
| 1 | 0 | 1   | 0     |
| 1 | 1 | 0   | 1     |

## 1-bit Full-Adder

| Cin | A | B | Out | S | Cin | A | B | Out | S |
|-----|---|---|-----|---|-----|---|---|-----|---|
| 0   | 0 | 0 | 0   | 0 | 1   | 0 | 0 | 0   | 1 |
| 0   | 0 | 1 | 0   | 1 | 1   | 0 | 1 | 1   | 0 |
| 0   | 1 | 0 | 0   | 1 | 1   | 1 | 0 | 1   | 0 |
| 0   | 1 | 1 | 1   | 0 | 1   | 1 | 1 | 1   | 1 |

## N-Bit Full-Adder



## Subtractor:

We add in a control line to determine between adder and subtractor.

Can use the control line

to perform both steps to

convert B to a negative number.

1. flip bits

2. +1



## Multiplexer

| $x_0$ | $x_1$ | $x_2$ | $x_3$ | $S_0$ | $S_1$ | $Y$   |
|-------|-------|-------|-------|-------|-------|-------|
| 0     | 0     | -MUX  | -     | 0     | 0     | $x_0$ |
| 0     | 1     | -     | -     | 0     | 1     | $x_1$ |
| 1     | 0     | -     | -     | 1     | 0     | $x_2$ |
| 1     | 1     | -     | -     | 1     | 1     | $x_3$ |

## Active-High Decoder

| $x_0$ | $x_1$ | $y_0$ | $y_1$ | $y_2$ | $y_3$ |
|-------|-------|-------|-------|-------|-------|
| 0     | 0     | 1     | 0     | 0     | 0     |
| 0     | 1     | 0     | 1     | 0     | 0     |
| 1     | 0     | 0     | 0     | 1     | 0     |
| 1     | 1     | 0     | 0     | 0     | 1     |

## Active-High encoder

| $x_0$ | $x_1$ | $x_2$ | $x_3$ | $y_0$ | $y_1$ |
|-------|-------|-------|-------|-------|-------|
| 1     | 0     | 0     | 0     | 0     | 0     |
| 0     | 1     | 0     | 0     | 0     | 1     |
| 0     | 0     | 1     | 0     | 1     | 0     |
| 0     | 0     | 0     | 1     | 1     | 1     |

## De-Multiplexer (Active-High)

| $A$ | $S_0$ | $S_1$ | $y_0$ | $y_1$ | $y_2$ | $y_3$ |
|-----|-------|-------|-------|-------|-------|-------|
| 0   | 0     | 0     | 1     | 0     | 0     | 0     |
| 0   | 0     | 1     | 0     | 1     | 0     | 0     |
| 0   | 1     | 0     | 0     | 0     | 1     | 0     |
| 0   | 1     | 1     | 0     | 0     | 0     | 1     |

# Digital Logic Contd - CS132

## Common Applications

### Multiplexor

- Source Selection Control

- Home stereo control e.g. send iPod, CD or radio to speakers - Analogue, not digital

- Share one communication line between multiple senders

- Requires both Mux and De-Mux.

- Parallel to serial conversion

### De-Multiplexor

- Share ~~multiple~~ one communication line between multiple senders

- Serial to parallel conversion

- A control for multiple lights

### Sequential Logic

A logic circuit whose outputs are logical functions of its inputs and current state

### Flip-Flops



### D-Type latch

essentially 1-bit of memory



### Clacked Flip-Flops

Changes on the rising edge of a clock input

### N-bit Register



### N-bit Shift register

A register that stores and shifts 1 bit at a time



### Multi-bit memory

### N-bit Counter



# Digital Logic (Contd.) - CS132

## Three-State logic

Instead of just 1 and 0, we now have another state **UNCONNECTED**.



Enable (active high)

high = connected

low = unconnected

## Propagation delay

Each logic circuit takes a bit of time to run.

Can be reduced by placing components closer together

## PLA organisation



## Three-State Buses

Can use three-state logic to allow multiple sources of data onto 1 bus.

## Properties of Logic Gates

$V_{max}$  — logic 1

$V_{min}$  — forbidden region

$V_{max}$  — logic 0

0V

Do not connect outputs together except three-state buffers

## ICs

Logic gates and functions can be obtained in small ICs

There are many types of programmable logic device!

Programmable Array Logic (PAL)

one-time programmable

programmable logic array (PLA)

Contains an And array, which feeds to an OR array.

Field Programmable gate array (FPGA)

- Modern

- Can contain millions of gates

- Enough for an entire processor

# Microprocessor Fundamentals

## Assembler - CS132

### CPU

Controls and performs instructions.

Commonly found on a silicon chip called a microprocessor

### Fetch-Decode-Execute Cycle

#### Fetch:

- instruction fetched from memory location in PC
- Store in IR
- increment PC

#### Decode:

- Retrieved instructions / opcode decoded.
- Read effective address to establish opcode type.

#### Execute

- CU Signals CPU components
- May result in changes to data registers, PC, ALU, I/O, etc.

### Status Register

Consists of various status bits that are set or reset upon certain conditions arising from the ALU

### Registers

### Key Components of CPU

**ALU** - Arithmetic Logic Unit

**CU** - Control Unit

### Fetch-Decode-Execute Components

**ALU/MUX** - Performs Mathematical and Bit-level operations

**CU** - Decodes programs instructions and handles logistics for executing them.

**IR** - Contains most recently fetched Instruction.

**PC** - Tracks the memory address of next instruction.

**MAR** - Contains memory address location to R/W from/to.

**MDR** - Contains data fetched from memory, or ready to write to memory

### Data Registers

Stores frequently used values and intermediate results

### Address Registers

- Used as **PoINTER REGISTERS** in the calculation of operand addresses

- One of these registers can be used to hold subroutine return addresses.

# Register Transfer Language

## Assembler - CS132

What is RTL?

describes the operations of a microprocessor as it is executing instructions.

e.g.

$[MAR] \leftarrow [PC]$

RTL - Instruction Fetching

This part of the cycle is always the same

1.  $[MAR] \leftarrow [PC]$

2.  $[PC] \leftarrow [PC] + 1$

3.  $[MBR] \leftarrow [MS([MAR])]$

(R/W set to read)

4.  $[IR] \leftarrow [MBR]$

5.  $[CU] \leftarrow [IR(\text{opcode})]$

Fetch and Execute - Adding a constant byte to DO

1.  $[MAR] \leftarrow [PC]$

2.  $[PC] \leftarrow [PC] + 1$  Instruction fetching (1-5)

3.  $[MBR] \leftarrow [ ]$

6.  $[MAR] \leftarrow [PC]$

7.  $[PC] \leftarrow [PC] + 1$

8.  $[MBR] \leftarrow [MS([MAR])]$

9.  $[ALU] \leftarrow [MBR] + DO$

10.  $[DO] \leftarrow ALU$

# Assembly Language

## Assembler - CS132

### Assembly language

uses easily remembered mnemonics for each instruction

also allows memory locations and constants to be given

### Symbolic names

### Assembler format:

(label) <opcode> <oprand(s)> | comment

e.g.

START: move.b #5, 00 - | load 00  
| with 5

### Why use Assembly language?

Computers only understand 1's and 0's.

But this is hard for humans to understand.

This is why we use assembly languages

### Instruction Set

Comments have 2 aspects:

- Instructions

- Addressing Modes

## Subroutines and Stacks

### Subroutines

Frequent used sections of code.

Reduces program size

Improves readability

JSR (label) Jump to Subroutine

RTS Return from Subroutine

Use of Stack for Subroutine calls.

### 1. Subroutine call

- pushes contents of PC onto stack

- puts start address of subroutine to PC

### 2. Return from Subroutine

- pops return address from stack and puts it in PC

So the stack stores where to return from a subroutine.

A Subroutine can call another Subroutine, so we have multiple addresses on the stack.

## Addressing Modes

### Assembler - CS132

#### Some addressing modes

- Direct Addressing
- Immediate Addressing
- Absolute Addressing
- Address Register - indirect
- Relative Addressing

#### Immediate Addressing

The operand forms part of the instruction and remains a constant.

e.g. move #42, D5

#### Address Register - Indirect

Take the contents of an address register and use that as the address to set data from with offset

Similar, but add to the fetched contents a constant

#### Post-incrementing

Normal indirect addressing, but increment the address register afterwards

#### Pre-decrementing

Normal indirect addressing, but decrement the address register before.

#### Indexed Addressing

Add together 2 address registers and a constant to get the address.

#### Direct Addressing

The address of an operand is specified by either a data or address register.

e.g. move D3, D2

#### Absolute addressing

The operand specifies the location in memory explicitly.

e.g. move D2, \$7FFF0.

does not allow position independent code.

#### Relative addressing

Take the contents of the PC and can add or subtract values to it.

Relative addressing is useful for position independent code.

# Memory Systems - CS132

## Choice of Memory Tech

MUST allow for memory to be used to solve Designer's dilemma  
read and written

Many factors influence choice of memory technology:

- Frequency of access
- Access time
- Capacity required
- Cost

## Temporal Locality:

If a memory location is referenced, it is likely to be referenced again soon

## Reasonable Assumption

90% of memory accesses are within 2kb of previous PC position

## Moore's Law

"Number of transistors on a memory chip doubles every 18 months"

## Consequences:

- Cost of computer logic and memory circuitry has fallen
- Speed increases
- Size decreases
- Power and cooling requirements decreases.
- Increases reliability

## Cache Miss

$$n = \frac{\text{Number of times words are in cache}}{\text{Total number of memory references}}$$

## Memory hierarchy

Sequential Access Random Access



## Spatial Locality

If a memory location is referenced, it is likely nearby memory will be referenced soon

## Cache

Small bit of memory located near the CPU for faster lookup time.

Cache is small to limit cost.

Why have Cache?

Memory Size Increasing.

Memory Access Speed not increasing quick enough.

## Cache concepts

### Cache read

No change will occur, so copies in cache will stay the same  
∴ no change needed.

### Cache write

1. Write through - update item in Cache and writes through to update lower levels.

2. Only updates copy in cache ensuring blocks are copied to memory when replaced - write back

## Memory Systems 2 - CS 132

### Types of cache miss

- **Compulsory** - misses that would occur regardless of cache size.
- **Capacity** - misses occurring as a result of cache not being large enough.
- **Conflict** - misses occurring as a result of the placement strategy for blocks not being fully associative, meaning a block may be discarded and retrieved.
- **Coherency** - misses occurring due to cache flushes in multiprocessor systems

### Semiconductor Memory

Most common form of Main Store RAM

2 main technologies

- **SRAM**
- **DRAM**

### Dynamic Ram (DRAM)

Stores data as charge on a capacitor

Capacitors discharge, hence DRAM cells require periodic charge to maintain data storage.

### 1 bit DRAM cell



### Measuring Cache performance

hit/miss rate isn't enough. We need to know how much slower a miss is.

Average memory access time

$$= \text{Hit time} + (\text{miss rate} \times \text{Miss penalty})$$

This can still be inaccurate  
~~as it's just a formula~~

Execution time can often be a better measure

### Multi-level cache



### Static RAM (SRAM)

Stores data using configurations of flip-flops and logic gates. Memory cells hold data as long as power is supplied.

Made up of 6 transistors

### DRAM operations

#### Write:

Voltage applied to bit line. A signal is applied to the address line to allow the bit line value through to the capacitor.

#### Read:

Transistor powered by address line, allowing the capacitor to discharge into the bit line.

Capacitor needs recharging after.

## Memory Systems 3 - CS132

### Comparing SRAM & DRAM

Both volatile

Dynamic are simpler and more compact.

Refresh Circuits of DRAM incurs one-off cost.

SRAM typically faster

Memory Cell

Select

R/W → Cell → Data in/out

### Memory Chip organisation

Can organise memory how we want e.g.  $128 \times 8$  which would require 7 address lines.

### Poor choices

try and make memory organised in a square.  
don't do something like  $1024 \times 1$ .

This results in a long narrow cell array with a large address decoder.

### Inefficient use of space

#### System Memory Organisation

Can use similar idea to organise multiple memory ICs.

See

OS - Memory Systems  
Slide 10.

for diagram.

### Advanced DRAM Organisations

data transfer between MS and CPU is large bottleneck.

Other DRAM organisations can help with this

- SDRAM - Synchronous DRAM

- RDRAM - Rambus DRAM

- DDR SDRAM

- CDRAM - Cache DRAM

### Memory Chip



$16 \times 8$  memory IC.

16 words, each of 8 bits.

### Minimising Address Decoding Space

Split address input into 2 parts:

row address and column address



## Memory Systems 4 - CS132

### Noise

Noise is unwanted info.  
Noise arises from the physical properties of devices

- Thermal noise
- Noise from electronic components
- ⇒ Noise of transmission circuits

### Majority voting.

Send the message 3 times and take a vote from the responses

### EXPENSIVE

works well, as if the prob 1 error occurs is low, then the prob 2 errors occur at the same place is even lower

### Error Detection Using Parity

#### At transmitter

Compute parity bit ( $P$ ) add to message

#### At receiver

recompute parity bit ( $Q$ )  
if  $P \neq Q$ , then an error has occurred.

### Burst Errors

In practice, many errors occur at once in a burst.  
This makes parity check useless.

Solution is to calculate a checksum.

### Noise and Digital Logic Devices



Noise Immunity is high in digital devices

Immunity breaks once the noise reaches a certain magnitude

### Parity

Adds 1 bit to the message to make the number of 1's in the msg even or odd.

#### in Software:



#### in Hardware



dummy input (in place where parity bit should be)

= 0 for even parity

= 1 for odd parity

### Error Correcting Codes (ECCs)

Calculate both character and column parity.

|   |   |   |   |   |   |
|---|---|---|---|---|---|
| M | 1 | 0 | 1 | 1 | 0 |
| E | 0 | 1 | 0 | 0 | 0 |
| S | 1 | 1 | 0 | 0 | 1 |
| S | 1 | 1 | 0 | 0 | 1 |
| A | 1 | 1 | 0 | 0 | 0 |
| G | 1 | 1 | 0 | 0 | 1 |
| E | 0 | 1 | 0 | 0 | 1 |
| R | 0 | 1 | 0 | 1 | 1 |

# NON-ASSESSED

## Memory Systems 5 - CS132

### Hard Disk

Records data by magnetising a thin film of ferromagnetic material on a disk.

### Data Organisation

One track contains many sectors, each separated by an inter-connected gap.

Sector contains preamble to synchronise the head before read/write, data and ECC.

### Performance

Seek times - time to move arm to correct track.

### Rotational latencies:

Motors spin at a constant speed.

Average delay = 3 to 6 ms

Formatting reduces capacity by 15%.

### Increasing capacity with zones

The outer edge of the hard disk has a faster velocity.

This means the bit density is lower at the outer edge.

To change this there are more zones placed in the outer edges.

### Optical Disk



Strong reflection      Weak reflection

○

1

Change of State = 1  
otherwise = 0

### Error correction

Cross Interleaved Reed-Solomon (CIRC) code is used to spread data out over the disk.

Up to 2mm of track can be corrected.

Data only accounts for 28% of bits stored, thus we pay a heavy price for great reliability.

### Performance

Seek time averages are typically 80ms, which is at least an order of magnitude worse than hard disks.

# I/O Mechanisms - CS132

## Memory-Mapped I/O

- Same address bus used for memory and I/O devices.
- Memory and registers on I/O devices are mapped to address values.

## Disadvantages of Memory-mapped I/O

portions of memory address space must be reserved.

This problem is only relevant for 16/32 bit machines

## Polling advantages

Simple Software and hardware.

- looping construct and checks
- Support for action "reads"

## HandShaking

responds to I/O device being ready by placing data on the bus and signalling that the data on the bus is valid.



## Advantages of Memory-mapped I/O.

Simple

CPU requires less internal logic

Allows use of general purpose memory in structures.

Addressing modes are available for I/O

## Synchronising with polling

Busy-wait

Polling



## Polling disadvantages

wastes CPU time and power.

Polling with interleaved tasks

can lead to delayed responses.

GS22 VIA performing handshaking.

PCR 1000xxxx

CB2 handshake output mode

CB1 interrupt flag set by negative transition on CB1.

**PRINTER READY Asserted.**

VIA sets IFR4

VIA sets DATA\_VALID to invalid.

When CPU writes a value to CRB

VIA sets DATA\_VALID to valid

CB1: printer-ready

CB2: DATA\_VALID



# I/O Mechanisms 2 - CS132

## Interrupts

CPU normally executes instructions sequentially unless there is a jump. An interrupt forces the CPU to jump to a service routine.

## Nested Interrupts

Interrupts can be called while processing other interrupts.

This is why we have stacks

## Interrupt Advantages

Fast response

No wasted CPU time/power

## Interrupt Disadvantages

All data transfers still controlled by CPU

More complex hardware and software.

## DMA

dedicated device that controls the 3 system buffers during data transfer.

DMA is optimised for data transfer.

## DMA operation

1. DMA transfer requested by I/O
2. DMAC passes request to CPU
3. CPU initialises DMAC

4. DMAC requests use of system buses

5. CPU responds with DMA Ack when it's ready to surrender buses

## Interrupt Sequence

1. External device signals interrupt  
INTERRUPT RESPONSE

2. CPU completes current instruction

3. Push PC onto Stack

4. Push Status register(s) onto Stack

4. Load PC with address of interrupt Handler

RETURN FROM INTERRUPT

1. Pop PC from Stack

2. Pop Status register(s) from Stack

3. Load PC with popped return address.

## Direct Memory Access (DMA)

CPU is a bottleneck for I/O

DMA avoids the CPU bottleneck.

Why use DMA?

Used when large amounts of data must be transferred at high speeds.

Control of buses surrendered to a DMA Controller (DMAC)

DMA data transfer can be more than 10x faster than CPU-driven I/O.

i. Input or output

ii. Start address → DMAC address res.

iii. Number of words to transfer → Count res.

iv. CPU enables DMAC

## I/O Mechanisms 3 - CS132

DMA modes of operation

Cycle Stealing

DMAC uses the system buses when they are not being used by the CPU.

Burst Mode

DMAC receives system buses for extended transfer of large amounts of data at a high speed and "locks" the CPU from using them.

The CPU is locked until the transfer is complete or a device with higher priority requests an interrupt.

DMA organisation - Single bus detached DMA

All modules share the same system buses

The DMA module mimics the processor, uses programmed I/O to exchange data between memory and an I/O module via the DMA module.

Straightforward to implement, but inefficient.

each word transfer takes 2 bus cycles

DMA organisation - I/O Bus

Reduces number of I/O interfaces in the DMA module to one.

System bus the DMA module shares with the processor and memory is used only to transfer data to and from memory.

## Processor Architecture - CS 132

### Organisation Vs. Architecture

Architecture concerns the structure and properties of a computer system, from a software viewpoint.

Organisation is the same but from a hardware viewpoint.

### Control Unit

Takes several inputs:

- Opcode from IR
- Values from CCR
- A stream of clock pulses

The control unit generates signals to control all CPU components

### Control Signals Fetch

1. Epc, CMAR
2. R, EMS, CIR
3. Epc
3.  $F_1 = 0$   $F_2 = 0$
3. CALUR
4. EALUR,  $C_{Epc}$

### Analysing Control Signals

Analysis allows us to determine

the number of control steps a particular instruction takes.

Number of steps taken for an instruction is important as more steps means slower execution.

### Representation of Instructions

| Address | Mnemonic | Opcode | Operand |
|---------|----------|--------|---------|
|---------|----------|--------|---------|

|   |        |     |       |
|---|--------|-----|-------|
| 0 | CLEAR  | 000 | 00000 |
| 1 | ADD #9 | 010 | 01001 |
| 2 | DECI   | 011 | 00000 |

### Fetch Macro Instructions

This operation is performed before every instruction

$$[IR] \leftarrow [MS(PC)]$$

$$[PC] \leftarrow [PC] + 1$$

### Controls Signals example - INC1

use PATP model.

|                               |                        |
|-------------------------------|------------------------|
| 1. $[ALU(Q)] \in [PO]$        | 1. $E_{D0}$            |
| 2. $[ALU(F)] \in Q_1 ("Q+1")$ | 1. $F_1 = 0$ $F_2 = 1$ |
| 3. $[ALU_{Reg}] \in [ALU]$    | 1. CALUR               |
| 4. $[PO] \in [ALU_{Reg}]$     | 2. EALUR, $C_{Epc}$    |

### Control Signal Assumptions

Enables are level triggered

Clock Signals are falling edge triggered

An output can be enabled ON to the main bus and then clocked elsewhere in one step.

ALU timings assume that, if values are enabled at P and Q at cycle start, CALUR can be clocked at cycle end

MS timings assume that, if MAR is loaded during 1 cycle, then R, W and EMS can be used in the next cycle

# Processor Architecture 2 - CS132

## Control Unit Task

FETCH, DECODE, EXECUTE

## Hardwired CU

We use a sequencer

We build a digital logic circuit that produces the outputs if we connect  $C_{CU}$  to a clock input.

## Hardwired CU overview



## Microprogrammed CU overview



## Micro PC

What can next MPC value be

- Current Address +1
- Address given in MIR Next Address Field.
- Output from DTDAs circuit

Decision depends on design choices

## Control Unit Design

2 Approaches

- Hardwired / "Random Logic"
- Microprogrammed



## Hardwired CU Advantages

FAST

## Hardwired CU Disadvantages

Complex

Difficult to design and test.

Inflexible

Long design time.

## Microprogrammed CU

USES microprogram memory

Store the required control action in memory

## Microprogrammed CU - Terminology

microprogram routine

Describes how to generate the CU outputs for a macro instruction.

Microaddress

A location within microprogram memory

Micro PC

CUs internal program counter

Micro IR

CUs internal Instruction Register

Microinstruction

Holds CU output values and other fields to control microprogram flow

## Processor Architecture 3 - CS132

### Micro Program Execution - Normal operation

- When powered MPC initialises to address 0.
- Execution of the M program then starts.
- Each Minstruction sets the CU outputs to the values recorded within the Minstruction. This generates PATP control signals corresponding to the PATP fetch operation.
- After each Minstruction, MPC is incremented

### MicroProgram Execution - End of fetch

- Last Minstruction is unusual.
- IR is filled with next opcode and this is fed to the CU opcode inputs
- Last fetch Minstruction has the next MPC calc type field set so we don't just increment PC. Instead we set the output of the OTOA circuit.
- MPC jumps to the address of the M program for the opcode fetched from memory.
- CU starts to execute that M program

### Microprogram Execution - End of Microprograms

- is unusual
- Next MPC calc type field set s.t. MPC is set from the MIR next address field.
- Opcode M programs use this facility to set the MPC back to the fetch Mprogram address

### Microprogram Execution - Conditional Branch

- PATP use BZC
- Next MPC value is determined by the MIR's next address if Z is set. Otherwise +1 circuit used.

### Microprogram Advantages

- Ease of design and implementation
- Flexible
- Simple hardware
- Mprogram memory can be reprogrammed

### Microprogram Disadvantages

- Slower than hardware implementations.