

# Pipelining

- Parallel processing .
- multiple functional unit
- Flynn's classification ✓
- Pipelining
  - floating +, - in pipelining
  - concept ✓
  - Speed up equation

## Instruction Level pipeling

- instruction cycle
- 3 and 4 instruction pipeline
- Pipeline conflict and soln ;

## Vector processing

- Application
- Vector operation
- Matrix Multiplication

## Parallel processing

- It is the technique which provide simultaneous data processing task for increasing computational speed of computer.

## Pipelining

- It is the process of breaking down the sequential process into sub operation, where each sub process is executed in dedicated segment that operate concurrently with all other program.
- Collection of processing segment through which binary information flows.
- Each segment performs partial processing
- The result obtained from each segment is transferred to next segment in pipeline.
- The final result is obtained after the data have passed through all segment.

→ Several computation can be done at a same time.

→ The overlapping of computation is made possible with the help of associative register.

for example:-

We wanna perform operation

$$A_i * B_i + C_i \quad \text{for } i = 1, 2, \dots, 7$$

Here,  
- is each suboperation is to be implement within a pipeline.  
- each segment have 1 or 2 register and combinational circuit.

Here

$$\begin{aligned} R_1 &\leftarrow A_i && \text{Input A} \\ R_2 &\leftarrow B_i && \text{Input B} \\ R_3 &\leftarrow R_1 * R_2 && \text{Multiply} \\ R_4 &\leftarrow C_i && \text{Input C} \\ R_5 &\leftarrow R_3 + R_4 && \text{Addition} \end{aligned}$$



## K-segment pipeline



fig: K-segment pipeline

→ Any operation can be broken into a sequence of suboperation of same complexity can be implemented by pipeline processor.

→ Here, in the diagram, there consist K-segment and K-register.

→ Each segment consist of combinational circuit and register.

→  $R_i$  holds the result between the stages

Suppose,

we have K-segment pipeline

Let no of clock cycle =  $t_p$

no of task =  $n$

Here,

1st task takes  $K \times t_p$  time

Remaining ( $n-1$ ) task require  $(n-1) \times t_p$

Thus,

$$\text{The required clock cycle} = K \times t_p + (n-1) \times t_p$$

$$= (K + (n-1)) \cdot t_p$$

$$= K + (n-1) \text{ clock cycle}$$

The speed up of pipeline processing over an equivalent non-pipeline processing is defined by the ratio

$$\delta = \frac{n t_p}{(K+n-1) t_p}$$

|     |       |       |       |       |       |       |       |       |           |
|-----|-------|-------|-------|-------|-------|-------|-------|-------|-----------|
| 1   | 1     | 2     | 3     | 4     | .     | .     | .     | ..    | $K+(n-1)$ |
| 2   | $T_1$ | $T_2$ | $T_3$ | $T_4$ | ...   | $T_n$ |       |       |           |
| 3   | $T_1$ | $T_2$ | $T_3$ | .     | .     | .     | ..    | $T_n$ |           |
| 4   |       |       |       | $T_1$ | $T_2$ | $T_3$ | .     | .     | $T_n$     |
| :   |       |       |       |       | $T_1$ | $T_2$ | .     | .     |           |
| $n$ |       |       |       |       |       |       | $T_1$ | ..    | $T_n$     |

fig : Space time diagram

TU

Let consider,  
we have 6 segment pipeline  
no of task = 8

Here

cycle

$$\begin{aligned}\text{no of clock require} &= k + (n-1) \\ &= 6 + (8-1) \\ &= 13 \text{ tp}\end{aligned}$$

|   | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | 10    | 11    | 12    | 13 |
|---|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----|
| 1 | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ | $T_8$ |       |       |       |       |    |
| 2 | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ | $T_8$ |       |       |       |       |    |
| 3 |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ | $T_8$ |       |       |       |    |
| 4 |       |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ | $T_8$ |       |       |    |
| 5 |       |       |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ | $T_8$ |       |    |
| 6 |       |       |       |       | $T_1$ | $T_2$ | $T_3$ | $T_4$ | $T_5$ | $T_6$ | $T_7$ | $T_8$ |    |

The time require = 13 tp

## Instruction Pipeline

- An instruction pipeline reads consecutive instruction from memory while previous instruction are being executed in other segment.
- This causes the instruction fetch and execute phase to overlap and perform simultaneous operation.
- The instruction fetch segment can be implemented by means of FIFO buffer.
- The instruction stream can be placed in queue, while it is waiting to be decoded and processed by execution segment.
- This queuing mechanism provide efficient way for reducing average access time to memory for next reading instruction.

## Instruction Cycle

- The process involved in execution of a single instruction is known as instruction cycle.
- The computer needs to process each instruction with following sequence of steps
  - fetch instruction from memory
  - decode the instruction
  - calculate effective address
  - fetch operand from memory
  - execute the instruction
  - store the result.



## 4 segment Instruction

- The below fig. show have instruction cycle in CPU can be processed with a four, segment pipeline.
- While instruction is being executed in segment 4, the next instruction in sequence is busy fetching an operand from memory segment.
- whenever the memory is available, all subsequent instruction are fetch and placed in instruction FIFO.
- Thus, four different instruction can be processed at same time.

Four segment are represent as

- 1 FI → Fetch instruction
- 2 DA → Decode instruction and calculate effective address
- 3 FO → Fetch operand
- 4 EX → Execute instruction

|   |    |    |    |    |    |    |    |
|---|----|----|----|----|----|----|----|
| 1 | 2  | 3  | 4  | 5  | 6  | 7  | 8  |
| 1 | FI | DA | FO | EX | -  | -  | -  |
| 2 | -  | FI | DA | FO | EX | -  | -  |
| 3 | -  | -  | FI | DA | FO | EX | -  |
| 4 | -  | -  | -  | FI | DA | FO | EX |
| 5 | -  | -  | -  | -  | FI | DA | FO |

### Three-Segment Instruction pipeline

The instruction cycle can be divided into 3-suboperation and implemented in three segment

- 1) I: Instruction fetch
- 2) A: ALU operation
- 3) E: Execute instruction

Example

LOAD :  $P_1 \leftarrow M[\text{address } 1]$

LOAD :  $P_2 \leftarrow M[\text{address } 2]$

ADD :  $P_3 \leftarrow P_1 + P_2$

STORE :  $M[\text{address } 3] \leftarrow P_3$

| Clockcycle | 1 | 2 | 3 | 4 | 5 | 6 |
|------------|---|---|---|---|---|---|
| LOAD P1    | I | A | E | - | - | - |
| LOAD P2    | - | I | A | E | - | - |
| ADD P1+P2  | - | - | I | A | E | - |
| STORE P3   | - | - | - | I | A | E |

fis : 3 segment instruction timing  
with ~~no~~ data conflict.

### Arithmetic Pipeline (floating point addition and subtraction)

- high speed
- used for implementing floating point operation.
- used in array multiplier.

say, we have input,

$$X = A \times 2^a$$

$$Y = B \times 2^b$$

A and B are fraction

a and b are exponent

so, we can perform addition and subtraction  
in four segments, and they are

- compare the exponents
- align the mantissa
- add or subtract
- normalize the result.

## Difficulties in Pipeline / Pipeline conflict

Pipeline conflict can be defined as difficulty in instruction pipeline to deviate from its normal operation.

## Pipeline Conflict / Difficulties in Pipeline

→ Pipeline conflict can be defined as difficulty in instruction pipeline to perform its normal operation.

The three major causes are

### 1) Resource conflict

→ It is the condition where, two segments are trying to access the memory at same time.

### 2) Data dependency

→ It is the condition where, an instruction is depended on result of previous instruction, but the result is not available at that moment.

### 3) Branch difficulties

→ It is the condition that arises when there is change in value of PC (Program Counter) in branch or other instruction.

### Pipeline Hazard

→ Pipeline Hazard is the condition that occurs in pipelined machine that causes delay in execution of subsequent instruction in a particular cycle.

There are 3 types of pipeline Hazard:

#### 1) Data Hazard

→ Condition where source or destination operand of instruction are not available at expected time.

- Read before write

- write before Read

- Write before write

#### 2) Instruction Hazard:

→ caused due to delay in availability of instruction

#### 3) Structural hazard

→ condition where two instruction require same hardware resource at same time.

### Data Conflict and its solution

example

$$I_1 : R_1 \leftarrow R_2 + R_3$$

$$I_2 : R_2 \leftarrow R_1 + R_3$$

$$I_3 : R_5 \leftarrow R_6 + R_7$$

Let consider we have 3 segment instruction pipeline

i.e. I : instruction fetch

A : ALU operation

E : execute

|   |                |                |                |                |                |
|---|----------------|----------------|----------------|----------------|----------------|
| I | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> |                |                |
| A |                | I <sub>2</sub> | I <sub>3</sub> |                |                |
| E |                |                | I <sub>3</sub> | I <sub>2</sub> | I <sub>1</sub> |

Here, R<sub>1</sub> is only available after execution of I<sub>1</sub> i.e. after 4th clock cycle.

But, In 8th clock cycle I<sub>2</sub> requires R<sub>4</sub>  $\leftarrow R_1 + R_3$

Here, but R<sub>1</sub> is not available, which cause the data conflict.

Solving using No-operation / Delayed load

$$I_1 : R_1 \leftarrow R_2 + R_3$$

No-operation

$$I_2 : R_4 \leftarrow R_1 + R_3$$

$$I_3 : R_5 \leftarrow R_4 + R_5$$

| Stage | clock | 1     | 2     | 3     | 4     | 5     | 6     |
|-------|-------|-------|-------|-------|-------|-------|-------|
| I     |       | $I_1$ | N     | $I_2$ | $I_3$ |       |       |
| A     |       |       | $I_1$ | N     | $I_2$ | $I_3$ |       |
| E     |       |       |       | $I_4$ | N     | $I_2$ | $I_3$ |

Here,

1 addition clock cycle is required, which we used 1 No operation.

Here,  $I_2$  can perform ALU operation using  $R_1$ , where in previous case we were unable to utilize  $R_1$ .

Solving using Rearranging

$$I_1 : R_1 \leftarrow R_2 + R_3$$

$$I_3 : R_5 \leftarrow R_4 + R_6$$

$$I_2 : R_4 \leftarrow R_1 +$$

| stage | clock | 1 | 2     | 3     | 4     | 5     | 6     |
|-------|-------|---|-------|-------|-------|-------|-------|
| T     |       |   | $I_1$ | $I_2$ | $I_2$ |       |       |
| A     |       |   |       | $I_1$ | $I_3$ | $I_2$ |       |
| E     |       |   |       |       | $I_1$ | $I_3$ | $I_3$ |

Here,

we have simply arranged the instruction, thus we didn't require additional clock cycle.

which is more advantage using delayed load.

## Vector Operation and Processing

### Vector processing

- Vector processing is the central processing unit that perform the complete vector input in individual instruction.

### Application

- Long range weather forecasting
- Seismic data analysis
- Medical diagnosis
- Space flight simulation

## Matrix Multiplication



$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \times \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}$$

Let A and B be two matrix.

The result will be C which is also 2x2 matrix.

## Flynn's Classification

# Unit 7: Computer Arithmetic

## # Addition and Subtraction using signed-Magnitude Data

### • Hardware Implementation

for addition :-

$$A + B = \text{addition}$$

$$\begin{aligned} A - B &= A + 2's \text{ complement of } B \\ &= A + \bar{B} + 1 \end{aligned}$$

Thus,

- We require adder, completer.
- As, Bs for the sign of A and B.
- AVF for the overflow.
- E for output carry.
- M for changing mode (addition or subtraction).

when  $M=1$ , complement of B is done  
when,  $M=0$ , B is transferred to adder



flowchart



Algorithm

- ① Input A and B in Signed Magnitude representation
- ② As and Bs store the sign.
- ③ perform ... As ⊕ Bs
  - for addition → if As ⊕ Bs = 1 in minuend and subtrahend  
As ⊕ Bs = 0 in augend and addent  
go to step ④
  - for subtraction → if As ⊕ Bs = 0 in minuend and subtrahend  
As ⊕ Bs = 1 in augend and addent  
go to step ⑤
- ④ we simply add A and B and store to EA  
if overflow we perform AVF ← E  
go to step ⑥
- ⑤ we perform addition of A and B complement and 1  
since we are subtracting there is no overflow.  
so, AVF ← 0  
if E = 0, A < B, so complement of A is done and  
As is set to 1.  
else A ≥ B  
if (A ≠ 0) end  
else As ← 0
- ⑥ END

## Booth Algorithm

→ Booth Algorithm gives the procedure for multiplying binary integer in signed 2's complement representation.

Multiply

AC  $\leftarrow 0$   
 $g_0 \leftarrow 0$   
BR  $\leftarrow$  multiplicand  
 $g \leftarrow$  multiplier  
 $s_n \leftarrow n$



Multiplication is performed using Booth Algorithm.

Here, say  $B \times Q$

We initialize value of AC  $\leftarrow 0$ ,

n represent no bits to be taken

B is multiplicand

g is multiplier

$s_n$  is for sequence counter

set  $g_{n-1}$  be 0 initially

Now,

check the  $g_0$  and  $g_{n-1}$  bits

If  $g_0, g_{n-1} = 11$  or 00

perform right shift

else if  $g_0, g_{n-1} = 10$

perform 2's complement of B and add to AC

else  $g_0, g_{n-1} = 01$

perform addition of AC and B

decrease the SC.

If  $(SC \neq 0)$

then go to the check  $g_0, g_{n-1}$

end

## Hardware for booth algorithm



Qn1 20 X -13

S01N

Here,

Initially,

$$AC = 000000$$

$$M = 20 = 010100$$

$$q = -13 = 10011$$

$$g_{n-1} = 0$$

$$\bar{M} + 1 = 101100$$

$$S_1 = n = 5$$

(no of bits in q)

Now,

| AC     | q     | g_{n-1} | S <sub>n</sub> | Remark                           |
|--------|-------|---------|----------------|----------------------------------|
| 000000 | 10011 | 0       | 5              | Initial                          |
| 101100 | 10011 | 0       | 5              | $AC \leftarrow AC + \bar{M} + 1$ |
| 110110 | 01001 | 1       | 4              | qshft and S <sub>n-1</sub>       |
| 111011 | 00100 | 1       | 3              | ashr                             |
| 001110 | 00100 | 1       | 3              | $AC \leftarrow AC + M$           |
| 000111 | 10010 | 0       | 2              | ashr                             |
| 000011 | 11001 | 0       | 1              | ashr                             |
| 101111 | 11001 | 0       | 1              | $AC \leftarrow AC + \bar{M} + 1$ |
| 110111 | 11100 | 1       | 0              | ashr                             |

Here,

$$AC = 1101111100$$

$$\begin{array}{r} \text{If complement} \\ 0010000011 \\ + 1 \\ \hline 0010000100 \end{array} = -(260)$$

Qn2

$$-4x-3$$

so 10

Here

Initially

Let,

$$AC = 0000$$

$$M = -4 = \frac{0100 + 1}{+1} = 1011$$

1100

$$M+1 = 0100$$

$$Q = -3 = 1101 \Rightarrow Q_{n-1} = 0$$

$$S_n = \text{no of bits in } Q = 1101 = 4$$

Now,

| AC   | g    | $Q_{n-1}$ | $S_n$ | Remark                     |
|------|------|-----------|-------|----------------------------|
| 0000 | 1101 | 0         | 4     | initial                    |
| 0100 | 1101 | 0         | 4     | $AC \leftarrow AC + M + 1$ |
| 0010 | 0110 | 1         | 3     | ashr                       |
| 1110 | 0110 | 1         | 3     | $AC \leftarrow AC + M$     |
| 1111 | 0011 | 0         | 2     | ashr                       |
| 0011 | 0011 | 0         | 2     | $AC \leftarrow AC + M + 1$ |
| 0001 | 1001 | 1         | 1     | ashr                       |
| 0000 | 1100 | 1         | 0     | ashr                       |

Here

$$LSB = (0000\ 1100)$$

↑ is 0 so, no need to take 2's complement

So,

1100

8 4 2 1

$$= 8 + 4 = 12 \text{ is answer}$$

Example  $15/4$  using restoring division algorithm  
Soln

$$\text{Here, } M = 4 = 0\ 0100$$

$$q = 15 = 1111.$$

$$\bar{M}+1 = 11100$$

| A      | q     | Sn | Remark                                 |
|--------|-------|----|----------------------------------------|
| 00000  | 1111  | 4  | initial                                |
| 00001  | 111*  | 4  | sh L                                   |
| 111001 | 111*  | 4  | $A \leftarrow A + \bar{M} + 1$         |
| 00001  | 1110  | 3  | $A \leftarrow A + M, S_{n-1}, q_{n-1}$ |
| 00011  | 110*  | 3  | sh L                                   |
| 1111   | 110*  | 3  | $A \leftarrow A + \bar{M} + 1$         |
| 00011  | 1100  | 2  | $A \leftarrow A + M, S_{n-1}, q_{n-1}$ |
| 00111  | 100*  | 2  | sh L                                   |
| 00011  | 100*  | 2  | $A \leftarrow A + \bar{M} + 1$         |
| 00011  | 1001  | 1  | $q_0 \leftarrow 1, S_{n-1}$            |
| 00111  | 001*  | 1  | sh L                                   |
| 000101 | 001*  | 1  | $A \leftarrow A + \bar{M} + 1$         |
| 00011  | 00101 | 0  | $S_{n-1}, q_n$                         |

Here,  
Dividend =  $\Phi = 0011 = 3$   
Reminder =  $A = 0011 = 3$

Ex. 21. 15/3 using restoring division

$$M = 00011$$

$$q = 1011$$

$$q + \bar{M} = 11100 + 1 = 11101$$

$$A = 00000$$

| A                   | q    | Sn | Remark                         |
|---------------------|------|----|--------------------------------|
| 00000               | 1011 | 4  | initial                        |
| 00001               | 011  | 4  | sh L                           |
| 11100               | 011  | 4  | $A \leftarrow A + \bar{M} + 1$ |
| 00001               | 0110 | 3  | $q_0 \leftarrow 0$             |
| 00010               | 110  | 3  | sh L                           |
| 11111               | 110  | 3  | $A \leftarrow A + \bar{M} + 1$ |
| 00010               | 1100 | 2  | $q_0 \leftarrow 0$             |
| 00101               | 100  | 2  | sh L                           |
| 00010               | 100  | 2  | $A \leftarrow A + \bar{M} + 1$ |
| 00010               | 1001 | 1  | $q_0 \leftarrow 1$             |
| 00101               | 001  | 1  | sh L                           |
| 00010               | 001  | 1  | $A \leftarrow A + \bar{M} + 1$ |
| 00010               | 0011 | 0  | $q_0 \leftarrow 1$             |
| Reminder $q = 1011$ |      |    |                                |

Ques

5/8

Here,

$$M = 0.0011$$

$$Q = 0.101$$

$$S_q = 4$$

$$A = 0.0000$$

A      Q      S<sub>q</sub>

0.0000    0.101    4

Remark

## Non restoring division



Q1-

5/3 using no-restoring

Soln

$$M = 3 = 00011$$

$$Q = 5 = 0101$$

$$A = 00000$$

$$\overline{M+1} = 11100 + 1 = 11101$$

Count = 4

| A     | Q    | Count | Remark               |
|-------|------|-------|----------------------|
| 00000 | 0101 | 4     | initial              |
| 00000 | 101* | 4     | sh L                 |
| 11101 | 101* | 4     | $A \leftarrow A + M$ |
| 11101 | 1010 | 3     | $Q_0 \leftarrow 0$   |
| 11011 | 010* | 3     | sh L                 |
| 11110 | 010* | 3     | $A \leftarrow A + M$ |
| 11110 | 0100 | 2     | $Q_0 \leftarrow 0$   |
| 11100 | 100* | 2     | sh L                 |
| 11111 | 100A | 2     | $A \leftarrow A + M$ |
| 11111 | 1000 | 1     | $Q_0 \leftarrow 0$   |
| 11111 | 000* | 1     | sh L                 |
| 00010 | 000* | 1     | $A \leftarrow A + M$ |
| 00010 | 0001 | 0     | $Q_0 \leftarrow 1$   |

$$A = 00010 = 2 \quad (\text{remainder})$$

$$Q = 0001 = 1 \quad (\text{quotient})$$

## Unit 8: Input Output Organization

### Input Output Interface

- It is a type of interface which provides a method for transferring information between internal storage and external I/O device. I/O device is called Input Output Interface.

#### Why we need I/O (device) Interface?

- Helps in synchronization mechanism.
- Speed up the data transfer between components.
- Used in conversion of signal values.

#### I/O Bus and Interface Method

- I/O bus consist of data line, address line, control line.
- Different kinds of device can be employed in any general purpose computer.
- I/O bus can be defined as a pathway used for input and output device to communicate.



### Types of I/O commands

- Status command (check various status conditions)
- Data input command (comes interface to take the data from peripheral)
- Data output command (read the data from bus and save in interface buffer)

### I/O vs Memory Bus

I/O bus

↳ used in transferring information between CPU and I/O device

→

Memory Bus

→ used in transferring information between main memory and CPU

3 ways bus can communicate between memory and I/O interface.

- Uses two separate buses, one to communicate with memory and other to communicate with I/O devices.
- Uses one common bus for memory and I/O but separate control lines.
- Uses one common bus for memory and I/O with common control line for both.

### Isolated I/O

→ Address, bus and data bus are common but separate control line for I/O and Memory

→ Different address space are used for computer memory and I/O device.

→ Special command are used in Isolated I/O such as MOV, IN, OUT instruction.

→ Separate read/write control line for I/O and memory transfer.

### Memory Mapped I/O

control line  
→ Address, data and common bus are common for I/O device and memory

→ Same address space are used by I/O device and computer memory

→ There are no such special command in memory mapped I/O. only MOV instruction.

→ Single read/write control line for I/O and memory transfer.

## Asynchronous Data transfer

- The transmission of data sent in form of byte or character is called asynchronous data transfer.
- Discontinuous and inconsistent timed transfer of data ~~is~~ by byte is called asynchronous data transfer.

## Strobe

- The method in asynchronous data transfer used for / that uses single control line for timing each transfer.



(a) block diagram



(b) Timing diagram

Strobe may be activated by source unit or destination unit.

- The data bus carries binary information from source to data unit.
- The strobe is a single line that validate the transferred binary information.

## Handshaking

- In strobe, there was no any information about, the data transfer was success or not. So, to overcome this problem, we use handshake method.
- Here, the problem is solved by introducing 2 control signal which provides / relay information on transfer of data.



## Mode of Transfer

- 1. Programmed I/O
- 2. Interrupt-Initiated I/O
  - Priority Interrupt
  - Daisy Chain Priority
  - Parallel Interrupt Priority
- 3. DMA

## Programmed I/O

- Programmed I/O operations are the result of I/O instruction written in computer program.
- Each data transfer is initiated by I/O instruction.
- CPU constantly monitors flags to let interface inform the CPU when it is ready to transfer data.
- CPU runs program periodically, checking the status of I/O device one by one.
- If any device has its status set, the CPU performs data transfer.
- Works on principle of polling.



If flag is set, then data transfer takes place through CPU.

## Interrupt-Initiated I/O

- While CPU is running a program, it doesn't check the flag constantly.
- When the flag is set, the CPU deviates from current process and takes care of input or output transfer immediately.



- CPU responds to the interrupt signal by storing return address into memory stack and then control branches to service routine that processes I/O transfer.
- Then, it resumes the previous process with the help of information stored in stack.

## Direct Memory Access (DMA)

- The process of transfer of data and instructions directly between the I/O device and memory is called direct memory access.
- The transfer between of data, between storage divide and main memory are often limit by the CPU.
- So removing CPU from the path, increases the overspeed
- It helps in efficient use of interrupt
- Expensive



- In DMA transfer, CPU is ideal and has no control of memory bus.

## DMA controller



## Input-Output processor (IOP)



fig :- Input-Output processor of computer

- I/O Processor have capacity to access memory directly.
- It control and manages input/output task
- It fetches and execute the instruction from I/O device.
- I/O processor provides a path for transfer of data between peripheral device and memory.

## IOP vs DMA

| I/O Processor                     | Direct Memory Access                           |
|-----------------------------------|------------------------------------------------|
| → Data transfer speed is slower   | → Data speed rate is higher                    |
| → less expensive                  | → more expensive                               |
| → Extra hardware are not required | → Extra hardware (DMA controller) are required |
| → used in smaller data transfer   | → used in large data transfer                  |

## Unit : 9 Memory Organization

- Memory Hierarchy
- Main Memory ✓
- RAM
- ROM
- Memory address Map ✓
- Memory Connection to CPU
- Auxiliary memory
- Hardware organization
- Match logic
- Read operation
- Write operation
- Locality of Reference
- Hit & Miss ratio
- Mapping
- Write Policies

## Memory Hierarchy

- It can be defined as the method of arranging different kind of storage present on the computing device based on speed of access.



Main memory → RAM, ROM

## key characteristics of memory system

- location
- organization
- capacity
- physical characteristics
- unit of transfer
- access method
- performance

## RAM and ROM chip



fig :- RAM Chip



fig : Typical ROM chip

## Auxiliary Memory

- Secondary memory
- Magnetic Disk
- Magnetic tape

## Memory Connection to CPU

- RAM and ROM are connected to CPU through data and address buss.
- We have used 512 bytes of RAM and 512 bytes of ROM.
- RAM receives low-order bits of address bus (AD0-AD7) (AD0 - AD9) order bits.
- ROM receives high-order of bits of address bus (AD8 - AD9)
- RD and WR output of CPU is taken as input for RAM and ROM.
- AD 10 is used for selecting between RAM and ROM.



## Associative memory / CAM

- A memory unit accessed by content is called associative memory or content addressable memory (CAM)
- Used / help to do parallel searches by data allocation
- Expensive than RAM, since each cell must have storage capacity as well as logic circuit.
- So, Associative memory is only used, where search time is very critical.



## Cache Memory

- It can be defined as the chip that makes retrieving data from computer's memory more efficient.
- It acts as temporary storage area.
- It is integrated directly into CPU chip
- It increases efficiency / speed, since it is physically close to processor.
- Less storage
- Smaller size

### • Hit and miss ratio

Cache hit refer to the situation where the cache is able to successfully retrieve data and content that was saved to it.

Cache miss refer to the situation where cache is unsuccessful to retrieve data.

Hit ratio → The ratio of the number of hits divided by total CPU references to memory is hit ratio.

## Mappings

- The transformation of data from main memory to cache memory is known as mapping.



## Types of mapping

## Associative Mappings

|        |                      | Block 0  | $w_0, w_1, w_2, w_3$ |
|--------|----------------------|----------|----------------------|
|        |                      | Block 1  |                      |
| line 0 | $B_0, \dots, B_{31}$ | :        |                      |
| line 1 | $B_0, \dots, B_{31}$ | :        |                      |
| line 2 | $B_0, \dots, B_{31}$ | :        |                      |
| line 3 | $B_0, \dots, B_{31}$ | :        |                      |
|        |                      | Block 15 | $\dots, w_6, w_7$    |
| cache  |                      |          |                      |

- Associative mapping is a technique or procedure used to assign each memory block in main memory to a freely available cache line.

- Main memory have 64 words ( consider )
  - Let assume Block size is 4 word

Thus,

no of blocks in main memory =  $\frac{64}{4} = 16$

say, we have cache of 16 word  
 line size = block size = 4 word  
 Thus,  
 no of lines in cache =  $16/4 = 4$

- The physical address required to address the 64 word is given by

$$\begin{aligned}
 &= \log_2 64 \\
 &= \log_2 2^6 \\
 &= 6 \text{ bit}
 \end{aligned}$$

- Here, 2 least significant bit is used for addressing block offset of main memory and)
  - 4 most significant bit is used for addressing block of main memory.

- and) 4 most significant bit is used for addressing block of main memory.

- Here, any block can reside on any cache line without any restriction.

## Direct Mapping

→ Direct Mapping is a technique or procedure used to assign each memory block in main memory to particular cache line.



→ Consider, Main memory has 128 word.  
say, block size is 4 word.  
Then,

$$\text{No of block in Main memory} = 128/4 = 32 (0, 1, 2, \dots, 31)$$

→ Consider, Cache memory has 16 word  
Since, block size = line size = 4 word

Then,  
 $\text{No of line in cache} = 16/4 = 4 \text{ lines}$

→ Here, to address 128 word, we need physical address of  
 $= \log_2 128$   
 $= \log_2 2^7$   
 $= 7 \text{ bit}$

→ Here, the block of main memory are assigned to particular lines in cache.  
for eg:-

B<sub>0</sub>, B<sub>4</sub>, B<sub>8</sub>, B<sub>12</sub>, ..., B<sub>28</sub> are assigned to line 0 in cache, this is done using K mod n.  
K = block no, n = no of lines i.e. 4.

→ Due to which there are more chance / possibility of miss.

### Set-Associative Mapping

→ Set-Associative mapping is the combination of direct and associative mapping.

Let say, we have  $k$ -ways set. Then

total no of set is given by =  $\frac{\text{no of line}}{k}$ .

for 2 way set associative,

$$\text{no of set} = \frac{4}{2} = 2 \text{ say } (S_0, S_1)$$

|          |                                                                                               |
|----------|-----------------------------------------------------------------------------------------------|
| Blocks   | w <sub>00</sub> w <sub>10</sub> w <sub>20</sub> w <sub>30</sub>                               |
| Block 2  |                                                                                               |
| :        |                                                                                               |
| Line 0   | B <sub>00</sub> , B <sub>01</sub> , B <sub>02</sub> , ..., B <sub>03</sub>   1 S <sub>0</sub> |
| Line 1   | B <sub>10</sub> , B <sub>11</sub> , B <sub>12</sub> , ..., B <sub>13</sub>   1                |
| Line 2   | B <sub>20</sub> , B <sub>21</sub> , B <sub>22</sub> , ..., B <sub>23</sub>   2 S <sub>1</sub> |
| Line 3   | B <sub>30</sub> , B <sub>31</sub> , B <sub>32</sub> , ..., B <sub>33</sub>   1                |
| Block 15 | ... w <sub>63</sub>                                                                           |

say, Memory word = 64

Block size = 4

$$\text{No of block} = 64/4 = 16 (0, 1, 2, \dots, 15)$$

$$\text{Cache word} = 16$$

$$\text{Line size} = 4$$

$$\text{No of line} = 16/4 = 4$$

→ Here, we have mapped the cache using P.A.

$$= \log_2 64$$

$$= \log_2 16$$

$$= 4 \text{ bit}$$



Here, mapping is done using formula  $k \bmod \text{set no}$ .

In this case,

$$k = \text{block no}$$

$$s = \text{set no}$$

for example

for block no 4,

$$= 4 \bmod 2$$

= 0 (so, it lies in set S<sub>0</sub>)

for block no 5

$$= 5 \bmod 2$$

= 1 (so, it lies in set S<sub>1</sub>)

## Address space and Memory space

- The set of virtual address is called address space.
- The set of location, where address are stored in main memory is called memory space.
- Address space is larger than memory space.
- Address space is generally generated by programs.
- Memory space is the actual main memory location.

Example :-

$$\begin{aligned}\text{Main-memory size} &= 32 \text{ k word } (K = 1024) \\ &\Rightarrow (32 \times 1024) \text{ bits} \\ &= 32768 \text{ bits}\end{aligned}$$

$$\begin{aligned}\text{Physical address to address 32768 bit} &= \log_2 32768 \\ &= \log_2 2^{15} \\ &= 15 \text{ bits}\end{aligned}$$

$$\begin{aligned}\text{Secondary memory space} &= 1024 \text{ k words} \\ &= 1024 \times 1024 \text{ bits}\end{aligned}$$

$$\begin{aligned}P.A &= \log_2 1024 \times 1024 \\ &= \log_2 2^{20} \\ &= 20 \text{ bits} \quad (\text{Virtual address})\end{aligned}$$

Here, Secondary memory have capacity of storing 32 main memory.

$$\begin{aligned}ie &= \frac{1024}{32} \quad (\text{Size of secondary memory}) \\ &\quad (\text{Size of main memory}) \\ &= 32\end{aligned}$$

Here,  $N = 1024 \text{ k denoting address space}$

$M = 32 \text{ k denoting memory space}$



## Address Mapping using Pages

Address mapping refer to process of determining a logical address using physical address and vice versa.

Page refer to the group of address space of same size.

# Unit 5: Central Processing Unit

- Major Component of CPU
- CPU organization

## • CPU Instruction



↗ RISC and CISC

↗ Overlapping window

### Major Component of CPU



fig :- Major Component of CPU

### Register Organization



### Stack Organization



fig: Register stack

- LIFO principle.
- Stack is storage device.
- Two operation
  - insert (push)
  - delete (pop)

|                        |                          |
|------------------------|--------------------------|
| IF ( FULL=0 )          | ( ie stack is not full ) |
| new item can be pushed |                          |
| if ( empty = 1 )       | ( ie stack is empty )    |
| no item can be popped  |                          |

|                                 |                                |
|---------------------------------|--------------------------------|
| SP ← SP + 1                     | ( increasing stack point )     |
| M[SP] ← DR                      | ( write item on top of stack ) |
| If ( SP = 0 ) then ( full ← 1 ) |                                |
| Empty ← 0                       |                                |

## Instruction format

- It can be defined as a sequence of bits in machine instruction that defines the layout of the instruction.
- Basic computer consist of 3-instruction code format

|      |        |         |
|------|--------|---------|
| I    | Opcode | Address |
| or,  |        |         |
| Mode | Opcode | operand |

- It helps to represent instruction.
- It is classified into 3 fields
  - Address field → It specifies the address of memory or register.
  - Opcode → It specifies the operation, which we want to perform.
  - Mode → It specifies how the operand is determined

## Address field in Computer Organization

### Single Accumulator organization

- No need to specify the accumulator.
- Operation automatically be performed on it without defining accumulator.

for example:- CMA;  
It complement content of accumulator.  
ADD A;

It perform AC → AC + A

### General purpose Register organization

- It is the organization where corresponding instruction contains multiple address(register or operand)

for example

ADD R1, R2, R3  
Here, R ← R1 + R2 + R3 (R1 is source)  
MOV R1, R2 (R1 ← R2)

### 3. Stack Organization

→ According to address field, the stack organization have 5 address instruction

- 3 address instruction
- 2 address instruction
- 1 address instruction
- 0 address instruction
- RISC instruction

$$X = (A+B) * (C+D)$$

#### 3 address instruction

→ The correspond instruction consist of three address that may be register or some memory address.

ADD R<sub>1</sub>, A, B

ADD R<sub>2</sub>, C, D

MUL X, R<sub>1</sub>, R<sub>2</sub>

$$R_1 \leftarrow M[A] + M[B]$$

$$R_2 \leftarrow M[C] + M[D]$$

$$X \leftarrow R_1 * R_2$$

#### 2 address instruction (Mov)

MOV R<sub>1</sub>, A

ADD R<sub>1</sub>, B

MOV R<sub>2</sub>, C

ADD R<sub>2</sub>, D

ADD R<sub>1</sub>, R<sub>2</sub>

MOV X, R<sub>1</sub>

$$R_1 \leftarrow M[A]$$

$$R_1 \leftarrow R_1 + M[B]$$

$$R_2 \leftarrow M[C]$$

$$R_2 \leftarrow R_1 + M[D]$$

$$R_1 \leftarrow R_1 * R_2$$

$$M[X] \leftarrow R_1$$

#### 3. One address Instruction (load, store)

LOAD A

$$AC \leftarrow M[A]$$

ADD B

$$AC \leftarrow AC + B$$

STORE T

$$M[Temp] \leftarrow AC$$

LOAD C

$$AC \leftarrow M[C]$$

ADD D

$$AC \leftarrow AC + M[D]$$

MUL T

$$AC \leftarrow AC * M[T]$$

STORE X

$$M[X] \leftarrow AC$$

#### 4. Zero Address Instruction

→ We use stack operation such as push, pop

PUSH A

PUSH B

ADD

PUSH C

PUSH D

ADD

MUL

POP X

$$C+D \leftarrow Pop$$

$$B \leftarrow Pop$$

$$A+B \leftarrow Pop$$

$$B \leftarrow Pop$$

$$A \leftarrow Pop$$

$$X = A+B * C+D$$

### 5. RISC Instruction

- Reduce Instruction Set Computer instruction where, it is restricted or denied to use load and store when communicating between memory and CPU.
- All the instruction should be executed within register without referring to memory.

|                                                      |                                                  |
|------------------------------------------------------|--------------------------------------------------|
| LOAD R <sub>1</sub> , A                              | R <sub>1</sub> ← M[A]                            |
| LOAD R <sub>2</sub> , B                              | R <sub>2</sub> ← M[B]                            |
| LOAD R <sub>3</sub> , C                              | R <sub>3</sub> ← M[C]                            |
| LOAD R <sub>4</sub> , B                              | R <sub>4</sub> ← M[B]                            |
| ADD R <sub>1</sub> , R <sub>1</sub> , R <sub>2</sub> | R <sub>1</sub> ← R <sub>1</sub> + R <sub>2</sub> |
| ADD R <sub>3</sub> , R <sub>3</sub> , R <sub>4</sub> | R <sub>3</sub> ← R <sub>3</sub> + R <sub>4</sub> |
| MUL R <sub>1</sub> , R <sub>1</sub> , R <sub>3</sub> | R <sub>1</sub> ← R <sub>1</sub> * R <sub>3</sub> |
| STORE X, R <sub>1</sub>                              |                                                  |

### Addressing Mode

- Addressing mode can be defined as the method of specifying the location of operand.
- It is used by microprocessor to deliver the instruction to machine.
- It specifies how to calculate effective memory address of operand.

#### Addressing

#### Types of Implied Mode

(CLA)

1. Implied Mode :- Operand is implicitly available. (CHA)  
eg :- CMA, CLA
2. Immediate Mode :- Operand is specified in instruction itself.  
eg :- JMP 3000, MVI B 05
3. Register Mode :- Operand are register that is present in CPU.  
eg :- ADD B, INRD
4. Register Indirect Addressing Mode :- Selected register contains the address of operand  
eg :- MOV A, M
5. Direct Addressing Mode :-
  - Effective address = Address part of instruction.
6. Indirect Addressing Mode :-
  - Effective address = address part of instruction  
or, address contain offset add.
  - + content of CPU register

7. Relative Address Mode:-

\* Effective address = PC content + Address part of instruction

8. Index Addressing Mode :-

\* Effective address = Index register content + Address part of instruction

9. Base Register Addressing Mode :-

\* Effective address = base register content + Address part of instruction

|          | 200  | Load to AC       | Mode |
|----------|------|------------------|------|
|          | 204  | Address = 500    |      |
| PC = 200 | 202  | NEXT instruction |      |
| <br>     | <br> | <br>             | <br> |
| R1 = 400 | 399  | 450              |      |
|          | 400  | 700              |      |
| <br>     | <br> | <br>             | <br> |
| XR = 100 | 500  | 800              |      |
| (AC)     | 600  | 900              |      |
| <br>     | <br> | <br>             | <br> |
|          | 702  | 325              |      |
|          | 800  | 300              |      |

| Addressing Mode   | Effective Address     | AC content |
|-------------------|-----------------------|------------|
| *                 | 500 (500 → 800)       | 800        |
| *                 | 800 (500 → 100 → 200) | 300        |
| *                 | 204                   | 500        |
| Register Address  | -                     | 400        |
| Register Indirect | 400                   | 700        |
| Auto Increment    | 400                   | 700        |
| auto decrement    | 399                   | 450        |
| *                 | 702                   | 325        |
| *                 | 600                   | 900        |
| *                 | 600                   | 900        |

## Data Transfer and Manipulation

### Data transfer Instruction

- load
- store
- MOV
- IN
- OUT
- PUSH
- POP

### Data Manipulation Instruction

- Arithmetic Instruction
  - Inc
  - Dec
  - ADD
  - MUL
  - SUB
- Logical
  - clear
  - AND
  - OR
  - EX-OR
- shift Instruction
  - shr
  - shl
  - ROR
  - Rol

### Program Control

- JMP
- BR
- CALL
- RETURN

## Subroutine Call and Return

- A subroutine can be defined as the self contained sequence of instruction that perform a given computational task.
- We use different instruction to transfer program control to subroutine like:-

CALL SUBROUTINE  
JMP SUBROUTINE  
BR(Branch) SUBROUTINE

### Main Routine



## Program Interrupt

- Program Interrupt can be defined as the transfer of program control from current programme to another service programme as result of internal or external request.
- The main difference between subroutine and program interrupt is that :-
  - program interrupt is initiated by external or internal signal
  - subroutine call is initiated by execution of an instruction.

### Type of Interrupt

- ⇒ External Interrupt → Due to I/O device.
  - eg: I/O device requesting transfer of data.
- ⇒ Internal Interrupt → cause due to illegal or erroneous use of instruction.
  - eg:- overflow, attempt to divide by zero, stack overflow, invalid opcode
- ⇒ Software Interrupt → initiated by executing an instruction.
  - eg: Subroutine call

## Complex Instruction Set Computer (CISC)

- Computer consisting of large number of instruction and addressing mode is known as CISC.
- Complex architecture
- High cost
- More powerful
- There are variation in length instruction format.
- Consist of several Instruction Cycle.
- Manipulation of data takes place directly in memory.
- It is Microprogrammed Control Unit based.
- Intel 8080, Motorola 6800 etc

example

MULT 2 2 3 3

17

ADD 2 2 1 3 3

i.e.  $x = x + y$

## Reduced Instruction Set Computer (RISC)

- Computer consisting of reduced size of instruction and few addressing mode is called RISC
- simple architecture
- Lower cost
- most powerful
- The is fixed length Instruction format.
- consist of single Instruction cycle.
- uses only in Register.
- It is hardwired Control unit based.
- MIPS, Fugaku, SPARC etc

example

LOAD R1, A

LOAD R2, B

ADD R1, R2

STORE R1

$R_1 \leftarrow R_1 + R_2$

## Register Overlapped Window



Here,

We use register Overlapped window in order to increase the performance of the computer.

There are 3 classes of register.

- Global register (G)
  - ↳ available to all function
- Window local register (L)
  - ↳ Variable local to function
- Window shared register (C)
  - ↳ permits data to be share

Here,

$$\text{Window size } W = L + 2C + G$$

$$\text{Total no of register} = (L+G) \cdot W + G$$

new

- When function is called by a function, the higher numbered register of calling function window are shared with called function which consist of lower numbered register. This way the caller's high and called function's low register overlap.

## Unit 4: Microprogrammed Control

\* Hardwire Control Unit

\* Up Control Unit

- Control word
- Micro program
- Control memory
- Address Register
- Sequencer

} Concept

\* Addressing Sequencing

- Conditional Branching
- Mapping of instruction
- subroutine
- Microinstruction format
- Symbolic Microinstruction

\* Design of Control unit

- Decoding
- Microprogram Sequence

| Hardwired Control Unit                                                                                                           | Microprogram Control Unit                                                                              |
|----------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------|
| → The control unit where the control signal are generated by hardware using conventional logic is called Hardwired control unit. | → The control unit which stores or save the binary control values is called microprogram control unit. |
| → Fast                                                                                                                           | → slow                                                                                                 |
| → Cost more                                                                                                                      | → cost less                                                                                            |
| → It uses sequential circuit composed of logic gate                                                                              | → It uses micro programming having micro instruction                                                   |
| → Difficult to alter hardware for new instruction                                                                                | → Easy to alter instruction for new operation                                                          |
| → instruction, which are complex, they are implemented using Hardware CBA.                                                       | → It can process complex instruction                                                                   |
|                                                                                                                                  | → Design are systematic<br>→ Decoding is easy.                                                         |
| → Design are complicated<br>→ Decoding is complex                                                                                |                                                                                                        |

### Terms

#### ① Control Word

→ The word whose individual bits represent the various control signal is called control word.

→ Control variable is represented by control word string of 1's and 0's.

#### ② Microprogram

→ A micro program can be defined as a sequence of microinstruction.

→ easy to alter

→ flexible

→ can solve complex instruction

#### ③ Control Address Register

→ Control address register are those register which contains the address of next microinstruction.

#### ④ Sequencer

→ It determine the next address from within control memory (where the micro instruction are stored).

## Address Sequencing / Microprogram Sequence

used in Microprogrammed Control unit



fig: Address Sequencing Selection

- The Addressing sequencing provides a mapping from the instruction bit to control memory address.

The process of finding address of next micro-instruction to be executed is called address sequencing.

→ An Address Sequencer must have capability of finding address next micro instruction under following situation such as

- loop
- unconditional branch
- subroutine call
- in-line sequencing

→ Incrementer → increment the content of Control Address Register by one to select next instruction in sequence.

→ condition Branching

↳ check zero, overflow, Negative, carry  
if condition is true, we perform branching  
else  
we increment Control address Memory

Mapping :- The process of generating the address from opcode is called mapping.

## Mapping of MicroInstruction

|                          |           |         |
|--------------------------|-----------|---------|
| Computer instruction     | 10 15     | Address |
| Mapping Bits             | 0 XXXX 00 |         |
| Microinstruction address | 0101100   |         |

fig: Mapping from Instruction Code to microinstruction Address

- It has an operation code of 4 bit which can specify up to 16 distinct instruction.
- The mapping procedure converts the 4 bit operation code to 7 bit address for control memory.
- Mapping consist of 0 at MSB of all the address.
- The mapping function is sometimes implemented through integrated circuit called PLD.

## Microinstruction format

| 15 14 13 12.. 0 | I opcode | Address | 7 6 5 4 3 2 1 0                   |
|-----------------|----------|---------|-----------------------------------|
|                 | I        |         | F1 F2 F3 CB BR AD                 |
| ADD             | 0000     |         |                                   |
| BRANCH          | 0001     |         | F1, F2, F3 : Microoperation field |
| STORE           | 0010     |         | CD : Condition for branching      |
| EXCHANGE        | 0011     |         | BR : Branch field                 |
|                 |          |         | AD : Address field                |

## Symbolic Microinstruction

→ used to represent each field in microinstruction.

It is divided into - five fields

- Iable → may be empty or may specify specific symbol
- Micro operation : consist of 1, 2, 3 symbol separated by commas
- CD : { U, I, S, Z } and
- BR { JMP, CALL, RET } and
- AD { NEXT, EMPTY } and
- symbols are used in microinstruction as in assembly language

## Microprogram Sequencer

- Used in Microprogram Control Unit.
- It generates the next microinstruction address and provide to Control Address Register.
- Microprogram sequencer can be defined as the subunit of Microprogram Control Unit with generate the next microinstruction address and provide to Control Memory so that execution to that instruction can be performed.



## Unit-3 (Basic Computer Organization)

### \* Concepts

- Instruction code
- operation code
- Concept of instruction format ✓
- store program ✓

### \* Computer Register and Memory

- List of register
- Memory of Basic Computer
- Common bus for basic computer

### \* Computer Instruction

- Fetch, Decode
- Type of Instruction
- Memory Reference Instruction
- I/O Instruction
- I/O interrupt
- Program Interrupt
- Interrupt Cycle

### \* Flowchart of basic Computer

## Instruction code

→ It is a group of bits that instruct computer to perform specific operation.

## \* Operation Code

→ Operation code can be defined as those code  
are to define operation such as add, subtract,  
multiply, shift etc.

## \* Instruction format

4

|    |        |         |    |   |
|----|--------|---------|----|---|
| 15 | 14     | 12      | 11 | 0 |
| I  | Opcode | Address |    |   |

- Defines layout of the instruction.
  - consist of 3 format code
  - represent instruction

Classified in 3 files

Address field :- It specifies the address of memory location.

Opcode: specifies the operation  
memory or registers

I = Mode, specifies how operand is determined.

If  $opcode = 111$  and  $I = 0$

|   |     |                    |
|---|-----|--------------------|
| 0 | 111 | Register operation |
|   |     | Register reference |

If  $\text{opcode} = 111$  and  $T = 1$

I/O | I/O operation  
I/O instruction

## \* Stored Program Organization

Memory 4096x16

| opcode             | Address | Instruction<br>(Program) |
|--------------------|---------|--------------------------|
| Instruction format |         |                          |
| 15                 | 6       |                          |
| Binary Operand     |         | Operands<br>(AC)         |

Process Register  
(AC)

→ One processor register and instruction format of 2 part are required for organize a computer in simplest way.

- Here, one part of Instruction Format is opcode, which specifies the operation to be performed with operands.
- Next part is address part which specifies the address of operand.
- This operand is read from the memory and operated together with the data stored in processor register.
- Instruction are stored in one section of memory and data in another.
- Instruction code is stored in 16-bit memory word.
- Here, ~~we~~ we have taken memory of 4096 words.

We need 12 bit to specify the address in memory.

$$2^{12} = 4096$$

We have now 4 bit left, which is used to specify one of 16 possible operation.

$$2^4 = 16$$

### Instruction Set Completeness

- A computer should have a certain set of instruction which allows the user to construct machine language program to implement or evaluate function. should so, the user, have access to the complete instruction set.
- The Instruction set is said to be complete if it includes:-
  - Arithmetic and logical instruction
  - Shift instruction
  - Input/Output instruction
  - Instruction for moving information from one memory and register
  - Program Control instruction

## Timing and Control



## Direct and Indirect Addressing of Basic Computer



In Direct Addressing Mode, the address part holds the operand. When  $I=0$ , it indicates direct addressing mode.

In Indirect Addressing Mode, the address part holds the effective address of operand.

When  $I=1$ , it indicates indirect address mode.

## Common bus System



## Computer Register

Register can be defined as the high speed memory storing unit which is used by microprocessor during any operation.

Let's say a memory unit of 4096 words and each word contain 16-bit.

Here,  
12 bit are used to specify address of operand  
3 bit for the operation  
1 bit to specify direct or indirect address

### List of Register

- DR (Data Register) → holds data - 16 bit
- AC (Accumulator) → all operations are performed in AC - 16 bit
- IR (Instruction Register) → instruction read from memory are held - 16 bit
- TR (Temporary Register) → temporary hold of data during processing - 16 bit
- PC (Program Counter) → holds the address of next instruction - 12 bit
- AR (Address Register) → holds address of current instruction - 12 bit
- INPR → input { 8 bit }
- OUTR → output

## Instruction Cycle



## Fetch and Decode

$T_0 : AR \leftarrow PC$   
 $T_1 : IR \leftarrow M[AR], PC \leftarrow PC + 1$   
 $T_2 : D_0 \dots D_7 \leftarrow \text{Decode } IR(12-14)$

- Initially, PC consist the address of 1st instruction
- Then, SC is set to zero, ( $SC \leftarrow 0$ )
- The value of PC is loaded to AR (address Register).
- No AR holds the address of 1st instruction
- The data / memory value of AR is loaded to IR.

After every clock pulse, the value of Sequence Counter (SC) is incremented by 1, so the timing signal also go through sequence  $T_0, T_1, T_2 \dots T_n$ .



D<sub>7</sub>I<sub>3</sub>: I/O-reference Instruction is Executed  
 D<sub>7</sub>I'3: Register reference Instruction is Executed  
 D<sub>7</sub>I'3: AR ← M[AR]  
 D<sub>7</sub>I'3: Nothing

### Register Reference Instruction

|                        |                  |                                                    |
|------------------------|------------------|----------------------------------------------------|
| → INC                  | increase AC      | D <sub>7</sub> I'3 = γ<br>B <sub>i</sub> = IR(0-1) |
| → CMA                  | complement       |                                                    |
| → CLA                  | clear            |                                                    |
| → SPA rB <sub>4</sub>  | skip if positive |                                                    |
| → SNA rB <sub>2</sub>  | skip if -ve      |                                                    |
| → SZ AIB <sub>2</sub>  | skip if 0        |                                                    |
| → SZ E rB <sub>1</sub> | skip if E=0      |                                                    |
| → HLT rB <sub>0</sub>  | Halt computer    |                                                    |

### Memory Reference Instruction

|         |                |                       |
|---------|----------------|-----------------------|
| ✓ → AND | D <sub>0</sub> | AR ← AC & M[AR]       |
| ✓ → ADD | D <sub>1</sub> | AC ← AC + M[AR]       |
| ✓ → LDA | D <sub>2</sub> | AC ← M[AR]            |
| ✓ → STA | D <sub>3</sub> | M[AR] ← AC            |
| → BUN   | D <sub>4</sub> | PC ← AR               |
| → BSA   | D <sub>5</sub> | M[AR] ← PC, PC ← AR+1 |
| → ISR   | D <sub>6</sub> | M[AR] ← M[AR]+1       |

if (M[AR]+1 = 0), then PC ← PC+1

### \* ADD to AC

D<sub>7</sub>I<sub>4</sub>: DR ← M[AR]  
 D<sub>7</sub>I<sub>5</sub>: AC ← AC + PR

### \* LDA

DR ← AC & M[AR]

\* BUN : Branch Unconditionally

→ This instruction transfer the program to the instruction specified by the effective address.

$$D_4 T_4 : PC \leftarrow AR, SC \leftarrow 0$$

and

\* BSA (Branch Save Return Address)

→ Used in subroutine.



at Time  $T_4$

At time  $T_5$

$$D_4 T_4 : M[AR] \leftarrow PC, AR \leftarrow AR+1$$

$$D_5 T_5 : PC \leftarrow AR, SC \leftarrow 0$$

\* ISZ : Increment and Skip if zero

$$D_6 T_4 : DR \leftarrow M[AR]$$

$$D_6 T_5 : DR \leftarrow DR+1$$

$$D_6 T_6 : M[AR] \leftarrow DR, \text{ if } (DR=0) \text{ then } PC \leftarrow PC+1, SC \leftarrow 0$$

## Interrupt Cycle



→ The flowchart gives the information who the interrupt is handled by the computer.

→ It consists of interrupt flip-flop R.

when, R=0,  
computer goes through instruction cycle

when R=1,  
computer goes through interrupt cycle

→ IEN (Interrupt Enable flag) = 0,  
then programmer don't want to use interrupt  
so, control continues with next instruction.

when, IEN=1,  
it check for flag bit, if both (FFI<sub>I</sub> and FFI<sub>O</sub> are zero)  
neither Input or output register are ready.  
If either of flag are set, then R is set to 1. (R=1)

A-B

$$A = \frac{x_1}{z_1} = R_{on}$$

$$B = \frac{x_2}{z_2} = R_{on}$$

## Input - Output and Interrupt



## Input output - Instruction

INP

OUT

SKI

SKD

IDN

IOf

$FGI \leftarrow 0$  (initial)

$FGI \leftarrow 1$  (when new info is available at the input device)

When  $FGI = 0$ ,

Info from INPR to OUTR

then AC

$FG0 \leftarrow 1$  (initial)

then  $OUTR \leftarrow AC$   
then  $FG0 \leftarrow 0$ .

## flow chart of Basic Computer

