

# Parallel processing



It is difficult for a single CPU to load big programs.



## Types of Organisations:

SMP (Symmetric Multiprocessors).



solution: NUMA

NUMA - Non-Uniform Memory Access



Fast

Asymmetric  
(Non uniform).

③ Multi-threading



I

2 control units

2 instructions performed  
at same time

CPU = 1

(C.U) executes  
instruction

④ Clustering

group of computers



works well

PC

## Applications

- Astrophysics
- Atmosphere & Ocean modelling
- CAD
- Military application
- VLSI design.
- Weather forecast.

## Interprocess Communication

Interprocess communication means that processes communicate with each other while they are running.  
To send a message we need a sender and a receiver.

## Message passing system

- Ⓐ send (destination name, message).
- Ⓑ receive (source-name, message).

## Design Characteristics

- ① Synchronization
- ② Addressing
- ③ format of msg
- ④ Queuing discipline

## Synchronization

The communication of a message between two processes implies some level of synchronization between the two processes.

The sender and the receiver can each be blocking or non-blocking, e.g. blocking send with blocking receive.

## Addressing

- Direct communication
- Indirect communication

## Queuing / Buffering

- a) zero capacity
- b) Bounded Capacity
- c) unbounded capacity
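The send/receive primitives and the bounded-capacity queue described above can be sketched as follows. This is only an illustration using Python's standard `queue` and `threading` modules; the names `mailbox`, `send`, `receive` and `consumer` are assumptions made for the example, not part of the notes.

```python
import queue
import threading

# Bounded capacity: at most 2 messages can wait in the mailbox.
mailbox = queue.Queue(maxsize=2)

def send(destination, message):
    """Blocking send: waits while the destination queue is full."""
    destination.put(message)

def receive(source):
    """Blocking receive: waits until a message is available."""
    return source.get()

received = []

def consumer():
    # The receiver takes three messages, blocking whenever the mailbox is empty.
    for _ in range(3):
        received.append(receive(mailbox))

receiver_thread = threading.Thread(target=consumer)
receiver_thread.start()
for msg in ("m1", "m2", "m3"):
    send(mailbox, msg)
receiver_thread.join()
```

A non-blocking variant would use `put_nowait` / `get_nowait`, which raise an exception instead of waiting.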

## ④ Message format

used to settle  
a dispute



## Interprocessor Arbitration

- \* A computer system needs buses to transfer information between its various components.
- \* The system bus connects the CPU, IOP and memory.
- \* Only one of the CPU, IOP and memory can be granted the use of the bus at a time.
- \* Hence an appropriate priority-resolving mechanism is required to decide which one gets it.
- \* An arbitration mechanism is needed to handle multiple requests for the bus.

## Arbitration Techniques

- ① Static Techniques
  - Serial Arbitration
  - Parallel Arbitration
- ② Dynamic Techniques (S/W)

### ① Static Arbitration Techniques

- \* The priorities assigned are fixed.
- \* Serial arbitration is also called daisy-chain arbitration.
- \* It is obtained by a daisy-chain connection of bus arbitration units.
- \* This scheme works because the grant line flows from the highest-priority unit to the lowest-priority unit.
- \* The highest-priority device passes the grant on to a lower-priority device only if it does not want the bus.
- \* If bus busy line is free, the bus arbiter ignores bus grant signal.

Advantages:-

- ① Simple design.
- ② Less no. of control lines.
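As a rough illustration of the daisy-chain idea, the sketch below walks the grant signal down a chain of devices. The function name `daisy_chain_grant` and the list representation of the request lines are assumptions made for this example.

```python
def daisy_chain_grant(requests):
    """Return the index of the device that keeps the bus grant, or None.

    requests[0] is the highest-priority device; each entry is True when
    that device is requesting the bus.
    """
    for device, wants_bus in enumerate(requests):
        if wants_bus:
            # This device keeps the grant and does not pass it further down.
            return device
        # Otherwise the grant is passed on to the next (lower-priority) device.
    return None
```

For example, `daisy_chain_grant([False, True, True])` gives the bus to device 1, because device 0 does not want it and device 2 never sees the grant.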

## Interconnection Structure

- \* A multiprocessor consists of a set of components such as CPUs, memory and I/O processors that communicate with each other.
- \* The collection of paths connecting the various modules is called the interconnection structure.
- \* The physical forms available for establishing an interconnection network in a multiprocessor system are:
  - 1) Bus-oriented system
  - 2) Multi-port memory
  - 3) Cross-bar connection system
  - 4) Multi stage switching Network
  - 5) hypercube system

## Interprocessor communication using shared variables

- \* In a shared memory system, interprocessor communication is achieved using shared variables. In such systems, the shared variables are stored in a common memory which is accessible to all processors.
- \* While sharing common resources or shared variables, conflicts may arise.
- \* It is necessary to prevent conflicting use of shared resources by several processors.
- \* This task is done by the operating system.
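The conflict problem can be illustrated with threads standing in for processors. In this minimal sketch a lock protects the shared variable so that only one worker updates it at a time; the names `shared`, `lock` and `worker` are hypothetical and chosen only for the example.

```python
import threading

shared = {"counter": 0}      # shared variable in common memory
lock = threading.Lock()      # guards every access to the shared variable

def worker(increments):
    for _ in range(increments):
        with lock:           # only one "processor" may update at a time
            shared["counter"] += 1

threads = [threading.Thread(target=worker, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Without the lock, two workers could read the same old value and one update would be lost.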

## Characteristics of Multi processors

- \* A multiprocessor system is an interconnection of 2 or more CPUs with memory and input-output equipment.
- \* The term "processor" in multiprocessor can mean either a CPU or an IOP.
- \* Multiprocessors are classified as MIMD systems.
- \* The similarity between a multiprocessor and a multicomputer is that both support concurrent operations.
- \* Multiprocessing improves the reliability of the system.
- \* The benefit derived from a multiprocessor organisation is improved system performance.
- \* Multiprocessing can improve performance by decomposing a program into parallel executable tasks.
- \* Multiprocessors are classified by the way their memory is organised.


# Array Processing

It performs computations on large arrays of data.

## Array Processors

- ① Attached array processor
  - \* To improve the performance of the host computer in specific numerical computation tasks, an auxiliary processor is attached to it.
- ② SIMD array processor
  - \* This is a computer with multiple processing units operating in parallel.

- \* An attached array processor has 2 interfaces:
  - 1) An I/O interface to a common bus.
  - 2) An interface with local memory.
- \* The local memory interconnects with main memory.
- \* The host computer is general purpose.
- \* The attached processor is a back-end machine driven by the host computer.
- \* The array processor is connected through an I/O controller.
- \* Both types of array processors manipulate vectors, but their internal organization is different.

## SIMD - Array processors



- \* SIMD is a computer with multiple processing units operating in parallel.
- \* The processing units are synchronised to perform the same operation under the control of a common control unit; this provides a single instruction stream, multiple data stream organisation.
- \* Each processing element (PE) includes
  - an ALU
  - a floating-point arithmetic unit
  - working registers
- \* The master CPU controls the operations of the PEs.
- \* Main memory is used for storage of the program, and each PE uses operands stored in its local memory.

## Main Memory & Auxiliary Memory

| Main Memory                              | Auxiliary Memory                        |
|------------------------------------------|-----------------------------------------|
| Also called primary memory               | Also called secondary memory            |
| It is the temporary memory of a computer | It is the permanent memory of a computer |
| It has a fast access time                | It has a slow access time               |
| It is smaller in size                    | It is bigger in size                    |
| It is directly connected to the CPU      | It is not directly connected to the CPU |
| It is more expensive                     | It is less expensive                    |
| Volatile in nature, except ROM           | Non-volatile                            |
| e.g. RAM, ROM                            | e.g. HDD, BD, FDD, pen drive etc.       |

## Programmed I/O Interrupt

There are 3 techniques of I/O, all used to exchange information between the user and the CPU:

- programmed I/O
- interrupt-driven I/O
- DMA


## Programmed I/O

suspension of process



- \* There is no provision through which an I/O device can inform the CPU about a data transfer.
- \* The I/O device sets its own status and waits.
- \* The CPU runs a program periodically and checks the status of each device one by one.
- \* If any device has a status $\neq 0$, the CPU performs the data transfer for it.
- \* It works on the principle of polling.
- \* Time required in programmed I/O = time to check status + data transfer time.
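The polling loop described above can be sketched in a few lines. The function `poll_devices`, the `status` list and the `transfer` callback are assumptions made for this illustration, not a real device API.

```python
def poll_devices(status, transfer):
    """Check every device in turn and service those whose status is non-zero.

    status   -- list of status flags, one per device (0 means "nothing to do")
    transfer -- callable invoked with the device number to move its data
    Returns the list of devices that were serviced in this polling pass.
    """
    serviced = []
    for device, flag in enumerate(status):
        if flag != 0:
            transfer(device)      # CPU performs the data transfer itself
            status[device] = 0    # device is ready for the next request
            serviced.append(device)
    return serviced
```

Note that the CPU spends time checking every device, even those with nothing to transfer, which is the status-checking term in the time formula above.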



## ② Interrupt Initiated I/O

- \* The I/O device has a provision to inform the CPU that it needs to communicate.
- \* When the CPU receives an interrupt, it:
  - completes execution of the current instruction,
  - saves the status of the current process,
  - branches to service the interrupt,
  - resumes the previous process by taking its status back out of the stack.

## Cache Memory

- \* The term cache means a safe place for hiding or storing things.
- \* Cache memory (CM) is a small, fast memory which holds copies of recently accessed instructions and data.
- \* When the processor makes a request for a memory reference, the request is first sought in the cache.
- \* If the requested memory reference is found in the cache, we call it a 'CACHE HIT'; otherwise it is a 'CACHE MISS'.
- \* A block of elements is transferred from main memory to cache memory in the expectation that the next requested element will be residing in the - - -

## Associative Memory

- \* The time required to find an item in memory can be reduced considerably if stored data can be identified for access by the content of the data itself rather than by address.
- \* The memory unit accessed by content is called associative memory or CAM.
- \* This type of memory is accessed simultaneously and in parallel on the basis of data content rather than specific address or location.

## Memory Hierarchy

- \* All computers need different kinds of memory.
- \* Volatile Memory & Non Volatile memory
- \* The system of combined memories working together is called the memory hierarchy.
- \* It can be represented as a pyramid.



### Priority Interrupt DMA

- \* It is a system that establishes a priority over the various sources to determine which condition is to be serviced first when 2 or more requests arrive simultaneously.
- \* Higher priority levels are assigned to requests which, if delayed or interrupted, could have serious consequences.
- \* Devices with a high-speed transfer rate are given high priority and slow devices are given low priority.

### Establishment of priority of simultaneous Interrupts

Software (polling procedure):

- \* One common branch address for all interrupts.
- \* The interrupt program begins at the branch address and polls the interrupt sources in sequence.
- \* The order in which they are tested depends upon the priority of each interrupt.
- \* The highest-priority source is tested first.

Hardware (H/W):

- \* It functions as an overall interrupt manager in case of an interrupt.
- \* Each interrupt source has its own interrupt vector to access its own service routine.
- \* No polling is required.
- \* Can be established by either a serial or a parallel connection.

## Input Output Interface

- \* It is a hardware unit used by the CPU and I/O devices to supervise and synchronize all input and output transfers.
- \* I/O devices are electromechanical and electromagnetic, while the CPU and memory are electronic devices. They therefore use different signals, so signal conversion is required.
- \* The data transfer rate of the CPU is faster than that of the I/O devices, so synchronisation is needed.
- \* The data formats of I/O devices differ from the word format of the CPU and memory.
- \* The operating modes of I/O devices differ from each other, so each I/O device must be controlled so that it does not disturb the operation of other I/O devices connected to the CPU.

# Hardwired Control Unit & Microprogrammed Control Unit

| Attribute                                | Hardwired Control Unit                    | Microprogrammed Control Unit               |
|------------------------------------------|-------------------------------------------|--------------------------------------------|
| ① Speed                                  | Fast                                      | Slow                                       |
| ② Cost of implementation                 | Expensive                                 | Cheaper                                    |
| ③ Control functions                      | Implemented in hardware                   | Implemented in software                    |
| ④ Flexibility                            | Not flexible to accommodate a new system  | More flexible to accommodate a new system  |
| ⑤ Ability to handle complex instructions | Difficult                                 | Easier                                     |
| ⑥ Ability to support operating system    | Very difficult                            | Easy                                       |
| ⑦ Design process                         | Complicated                               | Orderly & systematic                       |
| ⑧ Applications                           | RISC microprocessor                       | CISC microprocessor                        |

### Microprogram Examples

| Field | F<sub>1</sub> | F<sub>2</sub> | F<sub>3</sub> | CD | BR | AD |
|-------|---------------|---------------|---------------|----|----|----|
| Bits  | 3             | 3             | 3             | 2  | 2  | 7  |

- \* The microinstruction is divided into 4 functional parts: the micro-operation fields (F<sub>1</sub>, F<sub>2</sub>, F<sub>3</sub>), CD, BR and AD.
- \* F<sub>1</sub>, F<sub>2</sub>, F<sub>3</sub>: These are micro-operation fields. Each field is 3 bits wide. They specify micro-operations for the computer.
- \* CD: This 2-bit field selects the status bit condition for a branching operation. The conditions include a zero value in AC, the sign bit of AC equal to 1 or 0, etc.
- \* BR: This 2-bit field specifies the type of branch to be used. Branch types include unconditional branch, branch if zero, branch if negative and so on.
- \* AD: This is an address field which contains a branch address. This field is 7 bits wide since the control memory has 128 words.
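To make the field layout concrete, the sketch below packs and unpacks a microinstruction using the widths listed above (3 + 3 + 3 + 2 + 2 + 7 = 20 bits). The helper names `encode` and `decode` and the bit ordering (F<sub>1</sub> in the most significant bits) are assumptions for illustration only.

```python
# (field name, width in bits), from most significant to least significant
FIELDS = [("F1", 3), ("F2", 3), ("F3", 3), ("CD", 2), ("BR", 2), ("AD", 7)]

def encode(values):
    """Pack a dict of field values into one 20-bit microinstruction word."""
    word = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode(word):
    """Split a 20-bit microinstruction word back into its fields."""
    values = {}
    for name, width in reversed(FIELDS):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

The 7-bit AD field can hold any value from 0 to 127, which matches a control memory of 128 words.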

## Instruction Formats

\* Instructions are categorized into different formats with respect to the operand fields in the instruction.

- 1) Three-address instructions
- 2) Two-address instructions
- 3) One-address instructions
- 4) Zero-address instructions
- 5) RISC instructions

## Three Address Instruction

- \* Computers with 3-address instruction formats can use each address field to specify either a processor register or a memory operand.
- \* The program is written in assembly language.
- \* The advantage of the 3-address format is that it results in short programs when evaluating arithmetic expressions.
- \* The disadvantage is that the binary-coded instructions require too many bits to specify three addresses.

## Data Transfer & Manipulation

- \* Computer instructions are divided into data transfer instructions, data manipulation instructions and program control instructions.
- \* Data transfer instructions transfer data from one location to another without changing the binary information content.
- \* Data manipulation instructions perform arithmetic, logic and shift operations.
- \* Program control instructions provide decision-making capabilities and change the path taken by the program when it is executed in the computer.
- \* The most common data transfers are:
  - a) between memory and processor registers
  - b) between processor registers and I/O
  - c) between the processor registers themselves
- \* A few of the typical data transfer instructions are Load, Store, Move, Exchange, Input, Output, Push and Pop.

### Program Control Instructions

- \* A program control type of instruction, when executed may change the address value in the program counter and cause the flow of control to be altered
- \* The change in value of the program counter as a result of execution of a program control instruction causes a break in the sequence of instruction execution.
- \* Typical program control instructions are Branch, Jump, Skip, Call, Return, Compare and Test.

## Two Address Instruction

- \* Two-address instructions are the most common in commercial computers.  
Here again each address field can specify either a processor register or a memory word.

## One Address Instruction

- \* One-address instructions use an implied accumulator (AC) register for all data manipulation.
- \* For multiplication and division there is a need for a second register.
- \* However, here we will neglect the second register and assume that the AC contains the result of all operations.

## Zero Address Instruction

- \* A stack-organised computer does not use an address field for the instructions ADD and MUL.
- \* The PUSH and POP instructions, however, need an address field to specify the operand that communicates with the stack.
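A zero-address program can be sketched with a small stack interpreter. The example evaluates X = (A + B) × (C + D); the function `run_stack_program`, the tuple encoding of instructions and the sample operand values are assumptions made for illustration.

```python
def run_stack_program(program, memory):
    """Execute a list of stack instructions against a dict used as memory."""
    stack = []
    for op, *operand in program:
        if op == "PUSH":                 # PUSH needs an address field
            stack.append(memory[operand[0]])
        elif op == "POP":                # POP needs an address field
            memory[operand[0]] = stack.pop()
        elif op == "ADD":                # ADD has no address field
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "MUL":                # MUL has no address field
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory

program = [
    ("PUSH", "A"), ("PUSH", "B"), ("ADD",),
    ("PUSH", "C"), ("PUSH", "D"), ("ADD",),
    ("MUL",), ("POP", "X"),
]
memory = run_stack_program(program, {"A": 2, "B": 3, "C": 4, "D": 5})
```

Only PUSH and POP carry an operand here; ADD and MUL always work on the top two stack entries.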

## RISC Instruction

- \* The instruction set of a typical RISC processor is restricted to the use of load and store instructions when communicating between memory and the CPU.
- \* All other instructions are executed within the registers of the CPU without referring to memory.
- \* A program for a RISC-type CPU consists of LOAD and STORE instructions that have 1 memory address and 1 register address, and computational-type instructions that have 3 addresses, with all 3 specifying processor registers.

# Asynchronous Data Transfer

"At a regular interval"

- \* The transmitter can transmit data bytes at any instant of time.
- \* Only 1 byte is sent at a time. There is idle time between 2 data bytes.
- \* The transmitter and receiver operate at different clock frequencies.
- \* To help the receiver, start and stop bits are used, with the data in the middle.
- \* Turning off signals is not unique.



## Methods used in Asynchronous Data Transfer

- 1) Strobe control
  - source initiated
  - destination initiated
- 2) Handshaking
  - source initiated
  - destination initiated

### ① Strobe Control (Asynchronous Data Transfer)

- \* It uses a single control line for each transfer.
- \* Source-initiated strobe: the source unit first places the data on the bus, then activates the strobe line pulse.
- \* Destination-initiated strobe: the destination unit first activates the strobe pulse, informing the source to provide the data.

Figure: timing diagram.

### ② Handshaking (Asynchronous Data Transfer)

- \* Handshaking signals are used to synchronize the bus activities.
- \* It adds an acknowledgment: one control signal goes from source to destination and the other from destination to source.

## RISC & CISC

RISC = Reduced Instruction Set Computer, CISC = Complex Instruction Set Computer.

| CISC | RISC |
|------|------|
| Load from memory, arithmetic operations, store in memory | Load from memory, arithmetic operations, store in memory |
| $\rightarrow 10010001000$<br>$\rightarrow 100111001101$<br>$011011001011$<br>$11010111111$<br>$\rightarrow 010110101$ | $\rightarrow 1011101001001010$<br>$\rightarrow 0101100101001100$<br>$\rightarrow 01010110001000$ |
| Code size is small; number of cycles is more | Code size is large; number of cycles is less |

| CISC | RISC |
|------|------|
| Complex Instruction Set Computer | Reduced Instruction Set Computer |
| Needs better hardware and powerful processing | Less costly hardware would also work |
| Uses less RAM (smaller code) | Uses more RAM (larger code) |
| Uses fewer registers | Uses more registers |
| Large number of instructions | Small number of instructions |
| Large number of addressing modes | Few addressing modes |

$$EPIC = RISC + CISC$$

### Pipeline

- \* Arithmetic pipeline
- \* Instruction pipeline

## Adding & Subtracting

Figure: pipeline for floating-point addition and subtraction. The inputs are the exponents (a, b) and the mantissas (A, B), with registers (R) between segments. Segment 1: compare the exponents (difference). Segment 2: choose exponent and align mantissas. Segment 3: add or subtract mantissas. Segment 4: normalise result and adjust exponent.

- ① Compare the exponents
- ② Align the mantissas
- ③ Add or subtract
- ④ Normalise result

$$x = 0.9504 \times 10^3$$

$$y = 0.8200 \times 10^2$$

$$x = 0.9504 \times 10^3$$

$$y = 0.0820 \times 10^3$$

$$z = 1.0324 \times 10^3$$

$$z = 0.10324 \times 10^4$$
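The four steps can be traced in code on the same numbers. This is a minimal sketch using decimal mantissas from Python's `decimal` module; the function name `fp_add` and the (mantissa, exponent) representation are assumptions made for the example.

```python
from decimal import Decimal

def fp_add(ma, ea, mb, eb):
    """Add ma * 10**ea and mb * 10**eb; return a normalised (mantissa, exponent)."""
    # Step 1: compare the exponents.
    diff = ea - eb
    # Step 2: align the mantissa that belongs to the smaller exponent.
    if diff >= 0:
        mb = mb.scaleb(-diff)
        exponent = ea
    else:
        ma = ma.scaleb(diff)
        exponent = eb
    # Step 3: add the mantissas (pass a negative mantissa to subtract).
    mantissa = ma + mb
    # Step 4: normalise so that 0.1 <= |mantissa| < 1.
    while abs(mantissa) >= 1:
        mantissa = mantissa.scaleb(-1)
        exponent += 1
    while mantissa != 0 and abs(mantissa) < Decimal("0.1"):
        mantissa = mantissa.scaleb(1)
        exponent -= 1
    return mantissa, exponent

z_mantissa, z_exponent = fp_add(Decimal("0.9504"), 3, Decimal("0.8200"), 2)
```

With the values above, the aligned mantissas are 0.9504 and 0.0820, their sum is 1.0324, and normalisation shifts it to 0.10324 with exponent 4.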

## Data Hazard

- \* A data hazard occurs when either the source or the destination operands of an instruction are not available at the time expected in a pipeline.
- \* As a result, the pipeline is stalled; we call such a situation a data hazard.
- \* Consider a program with 2 instructions $I_1$ and $I_2$.
- \* When this program is executed in a pipeline, the execution of these instructions can be performed concurrently.
- \* In such a case the result of $I_1$ may not be available for the execution of $I_2$.
- \* If the result of $I_2$ is dependent on the result of $I_1$, we get incorrect results if both are executed concurrently.
- \* Assume $A = 10$.

$$I_1 : A \leftarrow A + 5$$

$$I_2 : A \leftarrow A \times 2$$

- \* When these 2 operations are performed one after the other, we get the result 30.
- \* But if they are performed concurrently, this leads to an incorrect result, because the value of $I_2$ depends on the result of $I_1$.
- \* The hazard due to such a situation is called a data hazard or data-dependent hazard.
- \* To avoid such incorrect results we have to execute dependent instructions one after the other.
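The effect can be reproduced with a tiny sketch in which $I_2$ either waits for $I_1$ or reads its operand too early. The two function names are hypothetical; the "overlapped" case simply models $I_2$ fetching A before $I_1$ has written its result back.

```python
def run_in_order(a):
    """I2 starts only after I1 has written its result."""
    a = a + 5        # I1: A <- A + 5
    a = a * 2        # I2: A <- A * 2
    return a

def run_overlapped(a):
    """I2 fetches A before I1 writes back, so it sees the stale value."""
    operand_seen_by_i2 = a    # I2 reads A too early
    a = a + 5                 # I1 completes, but I2 never sees this value
    a = operand_seen_by_i2 * 2
    return a

correct = run_in_order(10)
wrong = run_overlapped(10)
```

Here `correct` is 30 as in the notes, while the overlapped version produces 20 because $I_2$ used A = 10 instead of 15.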


## Cache Mapping Techniques

- \* It tells us which word of the main memory will be placed in which location of the cache memory.
- \* Basic idea: a mapping between the cache addresses and the main memory addresses that refer to the same unit of information.
- \* Types of mapping
  - 1) Direct mapping technique
  - 2) Associative mapping technique
    - Fully associative
    - Set associative

## Direct Mapping

- \* It is the simplest mapping technique.
- \* In this technique each block from the main memory has only one possible location in the cache organisation.

### Associative Mapping (Fully Associative Mapping)

- \* In this technique, a main memory block can be placed into any cache block position.
- \* As there is no fixed block position, the memory address has only 2 fields: tag and word.

### Set Associative Mapping

- \* Set associative mapping is a combination of both direct and associative mapping.
- \* It contains several groups of direct-mapped blocks that operate as several direct-mapped caches in parallel.
- \* A block of data from any page in the main memory can go into a particular block location of any of the direct-mapped caches.
- \* The number of address comparisons depends on the number of direct-mapped caches in the cache system.
- \* These comparisons are always fewer than the comparisons used in fully associative mapping.

Word bits: each block contains 64 words, so to identify each word we must reserve six bits ($2^6 = 64$).

$$\text{Total no. of blocks} = \frac{\text{Total words in cache}}{\text{words per block}} = \frac{4096}{64} = 64$$

$$\text{Total no. of sets} = \frac{\text{no. of blocks}}{\text{blocks per set}} = \frac{64}{4} = 16$$

To identify each set, four bits are used ($2^4 = 16$).

Tag bits: total no. of words in main memory

$$= 65536 \times 64 = 4194304 = 2^{22}$$

so the main memory address is 22 bits wide.

$$\text{tag bits} = 22 - \text{set bits} - \text{word bits} = 22 - 4 - 6 = 12$$

Main memory address:

| Tag | Set | Word |
|-----|-----|------|
| 12  | 4   | 6    |
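The 22-bit address layout worked out above (12 tag bits, 4 set bits, 6 word bits) can be checked with a short sketch that splits an address into its three fields. The function name `split_address` is an assumption for this example.

```python
TAG_BITS, SET_BITS, WORD_BITS = 12, 4, 6   # 12 + 4 + 6 = 22 address bits

def split_address(address):
    """Split a 22-bit main memory address into (tag, set index, word offset)."""
    if not 0 <= address < (1 << (TAG_BITS + SET_BITS + WORD_BITS)):
        raise ValueError("address does not fit in 22 bits")
    word = address & ((1 << WORD_BITS) - 1)               # lowest 6 bits
    set_index = (address >> WORD_BITS) & ((1 << SET_BITS) - 1)  # next 4 bits
    tag = address >> (WORD_BITS + SET_BITS)               # remaining 12 bits
    return tag, set_index, word
```

The set field selects one of the 16 sets, and the tag is then compared against the tags of the blocks stored in that set.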

## Typical DMA Controller

- \* DMA - Direct Memory Access.
- \* It is a hardware-controlled data transfer.
- \* A DMA controller is used to carry out the data transfer.
- \* During the data transfer, data is not routed through the processor.
- \* The basic blocks required in a DMA channel/controller to perform a DMA operation are shown in the diagram.
- \* The DMA controller communicates with the CPU via the data bus and control lines.
- \* The registers in the DMA controller are selected by the CPU through the address bus.
- \* The RD and WR lines are bidirectional.
- \* When the BG input is 0, the CPU can communicate with the DMA registers through the data bus to read and write them; RD and WR are then input signals for the DMA.
- \* When BG = 1, the CPU has relinquished the buses and the DMA can communicate directly with the memory.
- \* The DMA controller consists of a data count register, a data register, an address register and control logic.
- \* The data count register stores the number of data transfers to be done in one DMA cycle. It is automatically decremented after each word transfer.
- \* The data register acts as a buffer, whereas the address register initially holds the starting address for the transfer.
- \* After each transfer, the data counter is tested for 0.
- \* When the data count reaches 0, the DMA transfer halts.
- \* The DMA controller is normally provided with an interrupt capability.
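The role of the address register and the data count register can be illustrated with a toy model. The class `DMAController` and its method names are hypothetical; real controllers are programmed through hardware registers, not Python objects.

```python
class DMAController:
    """Toy model of a DMA transfer from a device into memory."""

    def __init__(self, memory):
        self.memory = memory      # list standing in for main memory
        self.address = 0          # address register
        self.count = 0            # data count register
        self.interrupt = False    # raised when the transfer is finished

    def program(self, start_address, word_count):
        """The CPU initialises the registers before the transfer starts."""
        self.address = start_address
        self.count = word_count
        self.interrupt = False

    def transfer(self, device_words):
        """Move words to memory without routing them through the CPU."""
        words = iter(device_words)
        while self.count > 0:                          # counter tested for 0
            self.memory[self.address] = next(words)    # one word per transfer
            self.address += 1
            self.count -= 1                            # decremented per word
        self.interrupt = True                          # inform the CPU

memory = [0] * 8
dma = DMAController(memory)
dma.program(start_address=2, word_count=3)
dma.transfer([7, 8, 9])
```

After the transfer, the three words sit at addresses 2 to 4, the count is 0 and the interrupt flag is set.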

## Various Addressing Modes

The different ways in which the source operand is denoted in an instruction are:

- 1) Implied Mode
- 2) Immediate Mode
- 3) Register Mode
- 4) Register Indirect Mode
- 5) Auto Increment / Auto Decrement Mode
- 6) Direct Address Mode
- 7) Indirect Address Mode
- 8) Relative Addressing Mode
- 9) Indexed Addressing Mode
- 10) Base Register Addressing Mode

### 1) Register Addressing Mode:

\* The operand is the contents of a processor register.

\* The name of the register is specified in the instruction.

\* Ex: MOV R1, R2:

- The instruction copies the contents of R2 to R1.

### 2) Direct Addressing Mode:

\* The address of the location of the operand is given explicitly as a part of the instruction.

\* Ex: MOV A, 2000:

- This instruction copies the contents of memory location 2000 into the A register.

### 3) Immediate Addressing Mode:

\* The operand is given explicitly in the instruction.

\* Ex: MOV A, #20:

- The instruction copies the operand 20 into register A.
- The sign # in front of the value of an operand is used to indicate that this value is an immediate operand.

### 4) Indirect Addressing Mode:

\* The instruction contains the address of a memory location which in turn holds the address of the operand.

5) Register

## Immediate Addressing

- Used to initialize a variable.
- No additional memory reference is needed.
- The operand is mentioned explicitly.

e.g. MOV R<sub>0</sub>, 300 (R<sub>0</sub> is a CPU register, 300 is the initialized value).

## Direct Addressing

- The address of the memory location is given explicitly.
- The contents of that memory location are provided as the operand.

e.g. MOV R<sub>1</sub>, x (R<sub>1</sub> is a CPU register, x is a memory location whose content is 10).
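The difference between the modes described above lies only in how the operand is located. The sketch below resolves an operand for four of them; the function `fetch_operand`, the mode names as strings, and the sample register and memory contents are assumptions made for illustration.

```python
def fetch_operand(mode, field, registers, memory):
    """Return the operand value selected by an addressing mode.

    field is the operand field of the instruction: a constant, a register
    name or a memory address, depending on the mode.
    """
    if mode == "immediate":     # operand is in the instruction itself
        return field
    if mode == "register":      # operand is the content of a register
        return registers[field]
    if mode == "direct":        # field is the address of the operand
        return memory[field]
    if mode == "indirect":      # field is the address of the operand's address
        return memory[memory[field]]
    raise ValueError(f"unsupported mode {mode}")

registers = {"R2": 7}
memory = {2000: 3000, 3000: 42}
```

For instance, with these sample contents `fetch_operand("direct", 2000, registers, memory)` reads location 2000, while the indirect mode follows the address stored there to location 3000.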

$$P\left(\bigcap_{i=1}^n A_i\right) \geq \sum_{i=1}^n P(A_i) - (n-1)$$

Mathematical induction method is used to prove this.

For any event,  $0 \leq P(E) \leq 1$ .

Consider two events ( $n=2$ ) then

$$P(A_1 \cup A_2) \leq 1$$

$\because A_1 \cup A_2$  is an event

By using addition Theorem.

$$P(A_1) + P(A_2) - P(A_1 \cap A_2) \leq 1.$$

$$P(A_1) + P(A_2) \leq 1 + P(A_1 \cap A_2).$$

We can write the above equation as

$$P(A_1 \cap A_2) \geq P(A_1) + P(A_2) - 1$$

The statement is true for  $n=2$ .

Let us assume that the statement is true for $n=k$, i.e.

$$P\left(\bigcap_{i=1}^k A_i\right) \geq \sum_{i=1}^k P(A_i) - (k-1)$$

UNIT-1COS

Q. List four basic functions of the CPU?

Ans: The digital computer consists of five functionally independent units: input, memory, arithmetic and logic, output and control unit.



Register transfer language :-

A register is a collection of flip-flops, where every flip-flop holds 1 bit of information; 3 flip-flops hold 3 bits.

Register Transfer Language (RTL) is a symbolic representation used to specify register transfer micro-operations.

In a block diagram, registers are shown in four ways:

- (a) Register
- (b) Showing individual bits
- (c) Numbering of bits (15 ... 0)
- (d) Divided into 2 parts

A register transfer copies the content of register R<sub>1</sub> to register R<sub>2</sub>:

$$R_2 \leftarrow R_1$$

where ← is the replacement operator.

A conditional transfer, if (P = 1) then (R<sub>2</sub> ← R<sub>1</sub>), is written with a control function:

$$P : R_2 \leftarrow R_1$$

that is, transfer from  $R_1$  to  $R_2$  when  $P=1$.

\* Common bus:-

In multiple-register configurations a common bus system is used to transfer information between two registers. A common bus consists of a set of common lines, one for each bit of a register, through which binary information is transferred one register at a time.

The common bus scheme can be implemented in two ways:

- using multiplexers
- using tri-state bus buffers.

(i) Using multiplexers

The number of multiplexers depends on the number of bits in each register, and the size of each multiplexer depends on the number of registers.

$$n \text{ selection lines} \rightarrow 2^n \text{ inputs} \rightarrow 1 \text{ output}$$

A multiplexer selects one input line and passes it to the single output.

$[S_1, S_0] \rightarrow$  are two selection lines

$$2 \rightarrow 2^2 = 4 \text{ inputs} \rightarrow 1 \text{ output}$$

Bus transfer for n registers
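A multiplexer-based bus can be modelled with one multiplexer per bit position, each choosing among the registers. The sketch below uses hypothetical names (`mux`, `bus_value`) and represents each register as a list of bits.

```python
def mux(inputs, select):
    """A k-to-1 multiplexer: pass the selected input to the output."""
    return inputs[select]

def bus_value(registers, select):
    """Place the selected register on the common bus.

    registers -- list of k registers, each a list of n bits
    select    -- value on the selection lines (0 .. k-1)
    Uses n multiplexers, one per bit position, each with k inputs.
    """
    n_bits = len(registers[0])
    return [mux([reg[bit] for reg in registers], select) for bit in range(n_bits)]

registers = [
    [0, 0, 0, 1],   # register 0
    [0, 1, 1, 0],   # register 1
    [1, 0, 1, 1],   # register 2
    [1, 1, 1, 1],   # register 3
]
on_bus = bus_value(registers, select=2)
```

With four 4-bit registers this needs four 4-to-1 multiplexers and two selection lines, and selecting 2 puts the content of register 2 on the bus.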



Read: transferring the data from memory to any register.

$$R \leftarrow M[AR]$$

Write: transferring the data from any register to memory.

$$M[AR] \leftarrow R$$

Q. Design a circuit for a parallel load operation into one of the 4-bit registers from a bus. Mention clearly the control/selection bits and the selection logic (JK flip-flop).




(ii) Using three-state bus buffers:

Microoperation

## ① Arithmetic Micro operation:

There are mainly seven type of <sup>arithmetic</sup> microoperations.

| Operation                   | Symbolic notation                    | Description                                                                                                              |
|-----------------------------|--------------------------------------|--------------------------------------------------------------------------------------------------------------------------|
| ① Add                       | $R_3 \leftarrow R_1 + R_2$           | Contents of R <sub>1</sub> and R <sub>2</sub> are added and the result is transferred to R <sub>3</sub>                  |
| ② Subtract                  | $R_3 \leftarrow R_1 - R_2$           | Contents of R <sub>2</sub> are subtracted from contents of R <sub>1</sub> and the result is transferred to R <sub>3</sub> |
| ③ 1's complement            | $R_1 \leftarrow \bar{R}_1$           | Complement the contents of R <sub>1</sub>                                                                                |
| ④ 2's complement            | $R_1 \leftarrow \bar{R}_1 + 1$       | Complement the contents of R <sub>1</sub> and add 1 to it                                                                |
| ⑤ 2's complement subtractor | $R_3 \leftarrow R_1 + \bar{R}_2 + 1$ | Add the 2's complement of R <sub>2</sub> to R <sub>1</sub> (equivalent to R <sub>1</sub> − R <sub>2</sub>)               |
| ⑥ Increment                 | $R_1 \leftarrow R_1 + 1$             | Increment the contents of R <sub>1</sub> by 1                                                                            |
| ⑦ Decrement                 | $R_1 \leftarrow R_1 - 1$             | Decrement the contents of R <sub>1</sub> by 1                                                                            |
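The seven arithmetic microoperations can be tried out with a small Python sketch. It assumes 4-bit registers (the constant `WIDTH` is an assumption for this example), so every result is masked to 4 bits the way a hardware register would discard the carry out.

```python
WIDTH = 4                  # assumed register width for this sketch
MASK = (1 << WIDTH) - 1    # 0b1111 keeps results inside the register


def add(r1, r2):
    return (r1 + r2) & MASK            # R3 <- R1 + R2


def ones_complement(r):
    return ~r & MASK                   # R1 <- R1'


def twos_complement(r):
    return (ones_complement(r) + 1) & MASK   # R1 <- R1' + 1


def subtract(r1, r2):
    # R3 <- R1 + R2' + 1, i.e. subtraction by adding the 2's complement
    return (r1 + ones_complement(r2) + 1) & MASK


def increment(r):
    return (r + 1) & MASK              # R1 <- R1 + 1


def decrement(r):
    return (r - 1) & MASK              # R1 <- R1 - 1
```

Masking with `MASK` is a design choice of the sketch: it models the wrap-around of a fixed-width register, so incrementing 1111 gives 0000.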

### 4 bit binary Adder



Inputs: $A_3 A_2 A_1 A_0$ and $B_3 B_2 B_1 B_0$. Outputs: $S_3 S_2 S_1 S_0$.

$$\begin{array}{rcccc}
 & 0 & 1 & 1 & 0 \\
+ & 1 & 0 & 1 & 1 \\
\hline
 & 0 & 0 & 0 & 1
\end{array}$$

The sum bits $S_3 S_2 S_1 S_0$ are 0001 and the output carry is 1.

$$D = A + \overline{B} + 1$$

2's complement

### 4 bit binary Adder / subtractor
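A 4-bit adder/subtractor is commonly built from four full adders with an XOR gate on each B input and a mode bit M. The Python sketch below assumes that arrangement; the names `full_adder`, `add_sub_4bit`, `to_bits` and `from_bits` are hypothetical helpers for this example, and bit lists are written least significant bit first.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def add_sub_4bit(a_bits, b_bits, m):
    """Ripple adder/subtractor on bit lists (LSB first).

    m = 0 : A + B
    m = 1 : A + B' + 1 = A - B (each B bit is XORed with m, and m is the
            carry into the first stage)
    """
    carry = m
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b ^ m, carry)
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, width=4):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


# 0110 + 1011: sum bits 0001 with output carry 1
sum_bits, carry_out = add_sub_4bit(to_bits(0b0110), to_bits(0b1011), 0)
```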



Design of Control unit

(1a)

Microoperations fields:-



→ Many times the control unit is required to check the status of the condition codes or external inputs to choose between alternative courses of action. In such situations, microprogrammed control uses conditional branch microinstructions. In addition to the branch address, these microinstructions specify which of the external inputs, condition codes, or possibly bits of the instruction register should be checked as a condition for branching to take place.
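A conditional branch microinstruction can be pictured with the following Python sketch. The field names `condition` and `branch_address`, and the status flag name `Z`, are assumptions invented for the example; real microinstruction formats differ from machine to machine.

```python
def next_address(upc, microinstruction, status):
    """Choose the next control memory address.

    upc              : address of the current microinstruction
    microinstruction : dict with hypothetical fields 'condition' (name of
                       the status bit to test, or None) and 'branch_address'
    status           : dict of condition codes / external inputs (0 or 1)
    """
    condition = microinstruction.get("condition")
    if condition is None:
        return upc + 1                          # no branch: next in sequence
    if status.get(condition, 0):
        return microinstruction["branch_address"]   # condition true: branch
    return upc + 1                              # condition false: fall through


branch_if_zero = {"condition": "Z", "branch_address": 40}
taken = next_address(10, branch_if_zero, {"Z": 1})
not_taken = next_address(10, branch_if_zero, {"Z": 0})
```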



**Advantages :-**

- It simplifies the design of the control unit, so it is both cheaper and less error-prone to implement.
- Control functions are implemented in software rather than hardware.

Unit-2

Control Memory

① CPU ② I/O ③ Memory

→ CPU, I/O, Control Unit

## Microprogrammed control unit

It consists of control memory, control address register, microinstruction register and microprogram sequencer.

The components of control unit work together as follows:

- The control address register (uPC) holds the address of the next microinstruction to be read. Every time a new instruction is loaded into the IR, the output of the block labelled "starting address generator" is loaded into the uPC.
- When the address is available in the control address register, the sequencer issues a READ command to the control memory.
- After the READ command is issued, the word at the addressed location is read into the microinstruction register.
- The uPC is then automatically incremented by the clock, causing successive microinstructions to be read from the control memory.
- The content of the microinstruction register generates the control signals, which are delivered to the various parts of the processor in the correct sequence.
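The steps above can be sketched as a simple loop in Python. The contents of `CONTROL_MEMORY` and the control signal names (`PC_out`, `AR_in`, and so on) are made up for illustration; only the sequencing (load uPC, read, increment, issue signals) follows the description.

```python
# Hypothetical control memory: address -> list of control signals
CONTROL_MEMORY = {
    0: ["PC_out", "AR_in"],
    1: ["MEM_read", "PC_inc"],
    2: ["DR_out", "IR_in"],
}


def run_microprogram(control_memory, start_address, steps):
    """Read `steps` successive microinstructions starting at start_address."""
    upc = start_address            # starting address is loaded into the uPC
    issued = []
    for _ in range(steps):
        mir = control_memory[upc]  # READ: word goes to microinstruction register
        issued.append(list(mir))   # its contents act as the control signals
        upc += 1                   # uPC is incremented for the next read
    return issued, upc


signals, final_upc = run_microprogram(CONTROL_MEMORY, 0, 3)
```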

Address Sequencing :-

To generate the next micro instruction address

Selection of address for control memory



Working :-

$$\text{Control memory address} = \log_2 4096 = 12 \text{ bits}$$

00 x xxxx xxxx 000

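Mapping from instruction opcode to microinstruction address:

| Opcode | Address field |
|--------|---------------|
| 1011   | Address       |

Mapping bits: 0 xxxx 00

Microinstruction address: 0 1011 00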
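The mapping 0 xxxx 00 amounts to placing the 4-bit opcode between a leading 0 and two trailing 0s. A minimal Python sketch, assuming a 4-bit opcode and a 7-bit mapped address as in the pattern above (the function name is hypothetical):

```python
def map_opcode_to_address(opcode):
    """Map a 4-bit opcode xxxx to the 7-bit address 0 xxxx 00."""
    if not 0 <= opcode <= 0b1111:
        raise ValueError("opcode must fit in 4 bits")
    return opcode << 2   # two low-order 0s appended; the top bit stays 0


address = map_opcode_to_address(0b1011)
address_text = format(address, "07b")   # 7-bit binary string
```

Because the two low-order bits are 0, each opcode gets a block of four consecutive control memory words for its routine.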


Truth table for 16 functions of 2 variable

| $x$ | $y$ | $F_0$ | $F_1$ | $F_2$ | $F_3$ | $F_4$ | $F_5$ | $F_6$ | $F_7$ | $F_8$ | $F_9$ | $F_{10}$ | $F_{11}$ | $F_{12}$ | $F_{13}$ | $F_{14}$ | $F_{15}$ |
|-----|-----|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----------|----------|----------|----------|----------|----------|
| 0   | 0   | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 0     | 1     | 1     | 1        | 1        | 1        | 1        | 1        | 1        |
| 0   | 1   | 0     | 0     | 0     | 0     | 1     | 1     | 1     | 1     | 0     | 0     | 0        | 0        | 1        | 1        | 1        | 1        |
| 1   | 0   | 0     | 0     | 1     | 1     | 0     | 0     | 1     | 1     | 0     | 0     | 1        | 1        | 0        | 0        | 1        | 1        |
| 1   | 1   | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0        | 1        | 0        | 1        | 0        | 1        |

| Boolean function                 | Microoperation                           | Name of operation                    |
|----------------------------------|------------------------------------------|--------------------------------------|
| $F_0 = 0$                        | $F \leftarrow 0$                         | Clear                                |
| $F_1 = AB$                       | $F \leftarrow A \wedge B$                | AND                                  |
| $F_2 = A\bar{B}$                 | $F \leftarrow A \wedge \bar{B}$          | AND with second operand complemented |
| $F_3 = A$                        | $F \leftarrow A$                         | Transfer A                           |
| $F_4 = \bar{A}B$                 | $F \leftarrow \bar{A} \wedge B$          | AND with first operand complemented  |
| $F_5 = B$                        | $F \leftarrow B$                         | Transfer B                           |
| $F_6 = A \oplus B$               | $F \leftarrow A \oplus B$                | Exclusive-OR                         |
| $F_7 = A + B$                    | $F \leftarrow A \vee B$                  | OR                                   |
| $F_8 = \overline{A + B}$         | $F \leftarrow \overline{A \vee B}$       | NOR                                  |
| $F_9 = \overline{A \oplus B}$    | $F \leftarrow \overline{A \oplus B}$     | Exclusive-NOR                        |
| $F_{10} = \bar{B}$               | $F \leftarrow \bar{B}$                   | Complement B                         |
| $F_{11} = A + \bar{B}$           | $F \leftarrow A \vee \bar{B}$            | OR with second operand complemented  |
| $F_{12} = \bar{A}$               | $F \leftarrow \bar{A}$                   | Complement A                         |
| $F_{13} = \bar{A} + B$           | $F \leftarrow \bar{A} \vee B$            | OR with first operand complemented   |
| $F_{14} = \overline{AB}$         | $F \leftarrow \overline{A \wedge B}$     | NAND                                 |
| $F_{15} = 1$                     | $F \leftarrow \text{all } 1's$           | Set to all 1's                       |
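The sixteen functions of two variables can be checked mechanically. In the Python sketch below, the index k of $F_k$ is read as a 4-bit number whose bits are the outputs for the input rows 00, 01, 10, 11 (most significant bit first); this numbering convention is an assumption that matches the usual textbook truth table. `EXPRESSIONS` lists each function by its operation name (for example $F_8$ as NOR and $F_{14}$ as NAND).

```python
def logic_function(k, a, b):
    """Value of F_k for inputs a, b, taken from the bits of k."""
    row = 2 * a + b              # truth-table row: 00 -> 0, 01 -> 1, 10 -> 2, 11 -> 3
    return (k >> (3 - row)) & 1  # row 00 is the most significant bit of k


# The same sixteen functions written as explicit expressions on single bits.
EXPRESSIONS = [
    lambda a, b: 0,              # F0  clear
    lambda a, b: a & b,          # F1  AND
    lambda a, b: a & (1 - b),    # F2  A AND (NOT B)
    lambda a, b: a,              # F3  transfer A
    lambda a, b: (1 - a) & b,    # F4  (NOT A) AND B
    lambda a, b: b,              # F5  transfer B
    lambda a, b: a ^ b,          # F6  exclusive-OR
    lambda a, b: a | b,          # F7  OR
    lambda a, b: 1 - (a | b),    # F8  NOR
    lambda a, b: 1 - (a ^ b),    # F9  exclusive-NOR
    lambda a, b: 1 - b,          # F10 complement B
    lambda a, b: a | (1 - b),    # F11 A OR (NOT B)
    lambda a, b: 1 - a,          # F12 complement A
    lambda a, b: (1 - a) | b,    # F13 (NOT A) OR B
    lambda a, b: 1 - (a & b),    # F14 NAND
    lambda a, b: 1,              # F15 set to all 1's
]

# True when every expression agrees with the truth-table numbering.
all_match = all(
    logic_function(k, a, b) == EXPRESSIONS[k](a, b)
    for k in range(16) for a in (0, 1) for b in (0, 1)
)
```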

Incrementor / Decrementor

If the number to be incremented is 1111, the output carry $C_4$ is generated and we get 0000 at the outputs $S_0$ through $S_3$.



$$A - 1 = A + (2\text{'s complement of } 1)$$

$$\begin{aligned} 2\text{'s complement of } 0001 &= (1\text{'s complement of } 0001) + 1 = 1110 + 1 \\ &= 1111 \end{aligned}$$
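Both operations can be sketched in Python on 4-bit values. The incrementer below is modelled as a chain of half adders with an initial carry of 1, and the decrementer adds 1111 (the 2's complement of 1) to A; the function names and the LSB-first bit lists are assumptions for this example.

```python
def half_adder(a, c):
    """Half adder: returns (sum, carry)."""
    return a ^ c, a & c


def increment_4bit(bits):
    """A + 1 using a chain of half adders; bits are LSB first."""
    carry = 1                       # the '1' fed into the first half adder
    out = []
    for bit in bits:
        s, carry = half_adder(bit, carry)
        out.append(s)
    return out, carry               # carry is the output carry C4


def decrement_4bit(bits):
    """A - 1 computed as A + 1111, the 2's complement of 1; bits LSB first."""
    carry = 0
    out = []
    for bit in bits:
        total = bit + 1 + carry     # every bit of 1111 is 1
        out.append(total & 1)
        carry = total >> 1
    return out, carry


# Incrementing 1111 gives 0000 with output carry 1.
wrapped, c4 = increment_4bit([1, 1, 1, 1])
```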

- The design process is orderly and systematic.
- It is more flexible: the design can be changed to accommodate new system specifications or to correct design errors quickly and cheaply.
- Complex functions, such as floating-point arithmetic, can be realised efficiently.
- A new or modified instruction set of the CPU can be implemented simply by rewriting or modifying the contents of the control memory.
- Faults can be easily diagnosed in the microprogrammed control unit using diagnostic tools, by maintaining the contents of flags, registers and counters.

### Disadvantages

- A microprogrammed control unit is somewhat slower than the hardwired control unit, because time is required to access the microinstruction from CM.
- The flexibility is achieved at some extra hardware cost due to the control memory and its access circuitry.

- For a smaller CPU, the design time of a microprogrammed control unit is longer than that of a hardwired control unit.

## Instruction Code



The address of an operand is called the effective address.

## Computer Registers




Regis

① Data R  
Memory

② Address  
Memory  
③ Accu  
Reg

④ Insti  
⑤ Regis  
⑥ Cycles

⑦ In  
⑧ Out

⑨