

DB



(st 2)

The part of computer that performs the bulk of data-processing operations is called the central processing unit.

CPU is made up of 3 major parts : Register set, ALU & control unit.

The register set stores intermediate data used during the execution of the instructions.

The ALU performs the required microoperations for executing the instructions.

The control unit supervises the transfer of information among the registers and instructs the ALU as to which operation to perform.



## → Types of CPU organisation

- All operations are performed with implied accumulators register.
- The instruction format in this type of computer uses 1 address field.

### ① Single accumulator Organisation

ADD B

one

→ for this type of organization, value should be in accumulator

ADD B

$A \leftarrow A + M[B]$

$A$  is the accumulator register &  $M[B]$  is memory word located at  $B$ .

### ② General Register Organisation

- when we are using multiple general-purpose registers, instead of single accumulator register, in the CPU organization then it's called general-register based CPU organization.
- by this type of organization, computer uses (2 or 3) address fields in their instruction format.

\* Three address field :

| m | opcode | destination<br>Add. | source<br>Add. | source <sup>2</sup><br>Address |
|---|--------|---------------------|----------------|--------------------------------|
|   |        |                     | A              | B                              |

$$X = (A+B) * (C+D)$$

ADD R<sub>1</sub> A, B

$R_1 \leftarrow A + B$

ADD R<sub>2</sub> C, D

$R_2 \leftarrow C + D$

MUL X R<sub>1</sub>, R<sub>2</sub>

$X \leftarrow R_1 \times R_2$

Note

→ No. of instructions are less so speed is high.

\* Two Address field :

$$x = (A+B) * (C+D)$$

MOV R<sub>1</sub> A

$$R_1 \leftarrow R_1 + B$$

MOV R<sub>2</sub> C

ADD R<sub>2</sub> D

$$R_2 \leftarrow R_2 + D$$

MUL R<sub>1</sub> R<sub>2</sub>

$$R_1 \leftarrow R_1 * R_2$$

→ here no. of instructions are more so, the speed is less

### ③ Register Stack Organisation

A useful feature that included in the CPU of most computer is stack or last-in, first-out LIFO. A stack storage is a storage device that stores information in such a manner that the item stored in last is first retrieved.

It contains a register that holds the address for the stack called stack pointer (SP), because its value always points at the top item in the stack.

Stack has 2 operations :

PUSH : The operation of insertion is called push, which is used for pushing a new item in stack.

POP : The operation of deletion is called POP, it's used for removing the topmost item from stack.

It's a G4 wood register



PUSH

$SP \leftarrow SP + 1$

$M[SP] \leftarrow DR$

if ( $SP=0$ ) then ( $FULL \leftarrow 1$ )

EMPTY ← 0

104

$DR \leftarrow m[SP]$

$$SP \subseteq SP - 1$$

If ( $SP = 0$ ) then (EMPTY?)  
     $\leftarrow \text{NULL}$

FULL ← 0

- Initially SP will point to 0.
  - Let in data register we have a & we want to push a in stack, for this we will perform
 
$$\begin{aligned} SP &\leftarrow SP + 1 \\ M[SP] &\leftarrow DR \end{aligned}$$
  - Now SP will point to 1, a will be inserted there.
  - In next step it will check if ( $SP = 0$ ), No then Value of FULL = 1 & Empty = 0

\* Zero Address Instruction

$$(a+b) * (c+d)$$

Push a  
Push b  
Add



Push c  
Push d  
Add  
Mul



### INSTRUCTION FORMAT

Instruction :  $X = (A+B) \times (C+D)$

1) Three Address Instruction

ADD R<sub>1</sub> A, B       $R_1 \leftarrow A+B$

ADD R<sub>2</sub> C, D       $R_2 \leftarrow C+D$

MUL X R<sub>1</sub>, R<sub>2</sub>       $X \leftarrow R_1 \times R_2$

It results in short program, when evaluating arithmetic expressions.

The binary-coded instruction requires too many bits to specify three addresses.

2)

3)

LOA  
STO

4)

## Two Address Instruction

|     |                |                |                                                  |
|-----|----------------|----------------|--------------------------------------------------|
| MOV | R <sub>1</sub> | A              | R <sub>1</sub> ← M[A]                            |
| ADD | R <sub>1</sub> | B              | R <sub>1</sub> ← R <sub>1</sub> + M[B]           |
| MOV | R <sub>2</sub> | C              | R <sub>2</sub> ← M[C]                            |
| ADD | R <sub>2</sub> | D              | R <sub>2</sub> ← R <sub>2</sub> + M[D]           |
| MUL | R <sub>1</sub> | R <sub>2</sub> | R <sub>1</sub> ← R <sub>1</sub> * R <sub>2</sub> |
| MOV | X              | R <sub>1</sub> | M[X] ← R <sub>1</sub>                            |

→ MOV is used for moving the data operand to and from memory and processor registers.

## One Address Instruction

LOAD A → content of A will be stored in accumulator

STORE A → content of Accumulator will be stored in A

|       |   |                |                          |
|-------|---|----------------|--------------------------|
| LOAD  | A | AC ← M[A]      |                          |
| ADD   | B | AC ← AC + M[B] | { Accumulator<br>↓ A+B } |
| STORE | T | M[T] ← AC      |                          |
| LOAD  | C | AC ← M[C]      |                          |
| ADD   | D | AC ← AC + M[D] | ( Accumulator<br>CFA )   |
| STORE | T | AC ← AC * M[T] |                          |
| STORE | X | M[X] ← AC      |                          |

## Zero Address Instruction

( back )

## ADDRESSING MODES

- The operation field of an instruction specifies the operation to be performed. This operation must be executed on some data stored in computer register or memory words. The way the operands are chosen during program execution is dependent on the addressing mode.
- It specifies a rule for interpreting or modifying the address field of the instruction before the operand is actually executed.
- Computer uses addressing modes for 2 purposes:
  - To reduce the no. of bits in the addressing field of instruction.
  - To give programming versatility to the user by providing such facilities as pointers to memory, counters for loop control, indexing of data etc.
- To understand various addressing modes it's necessary to go through instruction cycle, that's divided into 3 parts:
  - \* Fetch the instruction from memory
  - \* Decode the instruction
  - \* Execute the instruction

→ Various Addressing Modes

### ① Implied Mode

- In this mode the operand are specified implicitly in the definition of the instruction.
- For example instruction CM A (complement accumulator) is an implied-mode instruction. In this the operand in the accumulator register is implied in the definition of the instruction. It is used for 0 or 1 address of the instruction registers.

ex INC A (Increment Accumulator)

ACC ACC + 1

ADD

Instruction

Data

### ② Immediate Addressing Mode

- In this mode operand is specified in the instruction itself.
- Operand is directly provided as a constant.
- No computation is required to calculate effective Address.
- Size of the operand/constant depends on the size of Address field.

Opcode | Address → Data is directly stored here

### ③ Register Mode

- Here the operands are in registers that resides within the CPU.
- Register No. is written in the instruction.

Opcode | operand

000000 (Register no.)



\* Register Indirect Mode : In this mode the selected register contains the address of the operand rather than the operand itself.



#### ④ Autoincrement or Autodecrement Mode :

This is similar to the register indirect mode except that the register is incremented or decremented after/before its value is used to access memory. When the address stored in the register refers to a table of data in memory, it's necessary to increment or decrement the register after every access to the table. This can be achieved by using increment/decrement instruction.



LD (R1) +

#### Direct Addressing Mode (Absolute Addressing mode)

In this mode the effective Address is equal to the address part of the instruction. It's used to access variables.



• limitation : size of address must be within the size of addressing part of instruction



size  $\rightarrow 0 - 2^{12}$

### (6) Indirect Addressing Mode

- In this mode address field of the instruction gives the address where the effective Address is stored in memory.
- Control fetches the instruction from memory & uses its address part to access memory again to read the effective address.



### (7) Relative Address Mode

$$\text{Effective Address} = \text{PC} + \text{offset}$$



Note

$\text{PC} \rightarrow$  shift by default be by 1 so, offset  $\rightarrow 4^9$



## COMPUTER INSTRUCTIONS

Computer instructions are classified in 3 categories :

### Data Transfer Instructions :

These instructions are used to move data from one place in the computer to another without changing the data content.

LOAD : It's used to transfer data from memory to accumulator.

STORE : It's used " " " " accumulator to memory.

MOV : It's has been used in computers with multiple CPU registers to designate a transfer from one register to another.

XCH : (exchange) It swaps information b/w two registers or a register & memory.

INPUT, OUTPUT, PUSH, POP

### Program control Instructions

### ③ Data Manipulation Instructions

- These instructions perform operations on data & provide the computational capabilities for the computer.
- They are divided into - parts
- \* Arithmetic
- \* Logical & bit manipulation
- \* Shift

$\Rightarrow$  INR, DEC, ADD, SUB, MUL, DIV, ADC, SUBB, CLR etc

### STATUS BITS

- The status register comprises of status bits.
- The bits of the status register are modified according to the operations performed by ALU.
- Block diagram of an 8-bit ALU with a 4-bit status register?
- Four status bits are symbolized by C, S, Z, V



- Bit C (carry) is set to 1 if the end carry (c8 is 1), if carry is 0, it's cleared to 0.
- Bit S (sign) is set to 1 if the highest order bit (F7 is 1, otherwise 0).
- Bit Z is 1 if the output is 0, but 0 when the output is not zero.
- Bit V (overflow) is 1 if the XOR of last two carries is equal to 1, otherwise 0.
- Result of 8-bit ALU operation is either 127 or -127



## INTERRUPTS AND ITS TYPES

- The interrupt is a signal emitted by hardware or software when a process or event needs immediate attention.
- It alerts the processor to a high-priority process requiring interruption of the current working process.
- In I/O devices one of bus control line is dedicated for this purpose & it's called (ISR) Interrupt Service Routine.
- When the device raises an interrupt, at let's say process i, the processor first completes the execution of i. Then it loads the program counter with the address of the first instruction of ISR.

### # Software Interrupt

- A sort of interrupt called a software interrupt is one that is produced by software or a system as opposed to hardware.
- Traps and exceptions are other names for them.
- They serve as a signal for the operating system or a system service to carry out a certain function or respond to an error conditions.

### # Hardware Interrupt :

- In hardware interrupt, all the devices are connected to the Interrupt Request line. A single request line is used for all the n<sup>th</sup> devices.
- To request an interrupt, a device closes its associated switch.
- When more than one device raises an interrupt request signal, the additional information is needed to decide which device to consider first.



- \* **Interrupt Nesting** : Here I/O devices are organised in priority structure. Hence interrupt request from higher priority is recognized.
- \* **Polling** : devices encountered with IRQ is served first.

### CHARACTERISTICS OF RISC/CISC

#### RISC

- It stands for Reduced instructions set Architecture
  - simpler instruction, hence simple instruction decoding
  - instruction come undersized of one-word
  - Instructions take a single clock cycle to get executed.
  - More general purpose registers
  - Simple Addressing modes
  - Fewer data types
- Focus on software
- 1) simple instructions
  - 2) Faster executions
  - 3) Lower power consumption
- 1) more instruction required
  - 2) ↑ memory usage
  - 3) high cost

#### CISC

- It stands for complex instruction set Architecture.
  - complex instruction, hence complex instruction decoding
  - Instructions are larger than one word size.
  - Instruction may take more than single clock cycle to get executed
  - These general purpose register
  - Complex Addressing modes.
  - More data types
  - Focus on hardware.
- Pro
- 1) Reduce code size
  - 2) widely used

(or)

- 1) ↑ power consumption
- 2) More complex

## PIPELINING      PROCESSING

- Pipelining is a process of arrangement of hardware elements of the CPU such that its overall performance is increased.
- Simultaneous execution of more than one instruction of take place in pipelined processor.
- Ex → consider a bottle packaging plant. Let there be 3 stages that bottle should pass through I (insertion), F (filling), S (sealed).

without pipelining

I F S | | | | |  
| | | I F S | | |  
| | | | | | I F S

$9/3 \Rightarrow 3$  minutes

"with" pipelining

I F S | |  
| I F S |  
| | I F S

$5/3 \Rightarrow 1.67$  minutes

- Pipelining is a technique of breaking a sequential process into small fragments or sub operations. The execution of each of these sub-procedure takes place in a certain dedicated segment that functions together with all other segments.



## Parallel Processing

- Parallel processing is a term used to denote a large class of techniques that are used to provide simultaneous data processing task for the purpose of increasing computational speed of a computer system.
- Instead of processing each instruction sequentially, a parallel processing system is able to perform concurrent data processing to achieve faster execution time. For example, while an instruction is being executed in the ALU, the next instruction can be read from memory.
- Systems may have 2 or more ALU & be able to execute 2 or more instruction at the same time - the purpose of parallel processing is to increase



throughput, that is the amount of processing that can be accomplished during a given interval of time

### Flynn's Classification

- M.J. Flynn considers the organization of a computer system by the no. of instructions & data items that are manipulated simultaneously.
- The normal operation of a computer is to fetch instructions from memory & execute them in processor.
- The sequence of instructions read from memory constitutes an instruction stream.
- The operations performed on data in the processor constitutes a data stream.

→ It is divided into 4 parts

instruction

1) single stream, single data stream (SISD)



2) Single instruction stream Multiple data stream (SIMD)



3) Multiple Instruction stream, single data stream (MISB)



4) Multiple Instruction stream, Multiple data stream (MIMD)



Ques

Demonstrate  $A_i * B_i + C_i$  for  $i = 1, \dots, 7$   
using pipeline concept.

$$R_1 \leftarrow A_1, R_2 \leftarrow B \quad \text{Input } A, B$$

$$R_3 \leftarrow R_1 * R_2, R_4 \leftarrow C \quad \text{Multiply } & \text{ Input } C$$

$$R_5 \leftarrow R_3 + R_4 \quad \text{Add } C \text{ to product}$$

figure



table

| Clock phase No. | Segment 1<br>$R_1, R_2$ | Segment 2<br>$R_3, R_4$ | Segment 3<br>$R_5$ |
|-----------------|-------------------------|-------------------------|--------------------|
| 1               | $A_1, B_1$              |                         |                    |
| 2               | $A_2, B_2$              | $A_1 * B_1, C_1$        | $C_1$              |
| 3               | $A_3, B_3$              | $A_2 * B_2, C_2$        | $A_1 * B_1 + C_1$  |
| 4               | $A_4, B_4$              | $A_3 * B_3, C_3$        | $A_2 * B_2 + C_2$  |
| 5               | $A_5, B_5$              | $A_4 * B_4, C_4$        | $A_3 * B_3 + C_3$  |
| 6               | $A_6, B_6$              | $A_5 * B_5, C_5$        | $A_4 * B_4 + C_4$  |
| 7               | $A_7, B_7$              | $A_6 * B_6, C_6$        | $A_5 * B_5 + C_5$  |
| 8               |                         | $A_7 * B_7, C_7$        | $A_6 * B_6 + C_6$  |
| 9               |                         |                         | $A_7 * B_7 + C_7$  |

## PIPELINING

## GENERAL CONSIDERATIONS

- Any operation that can be decomposed into a sequence of sub operations of about the same complexity can be implemented by a pipeline processor. This technique is efficient for those applications that need to repeat the same task many times with different sets of data.
- A 4 stage segment pipeline is given below:



The operand pass through all four segments in a fixed sequence. Each segment consists of a combinational unit, that performs a sub-operation over the data stream flowing through the pipe.

The segments are separated by registers  $R$ , that hold the intermediate results b/w the stages.

Task is defined as the total operation performed going through all the segments in the pipeline.



Space-time diagram for pipeline



| Segment | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9     | Clock |
|---------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| 1       | $T_1$ |       |       |       |       |       |       |       |       |       |
| 2       |       | $T_2$ |       | $T_3$ | $T_1$ | $T_5$ | $T_6$ |       |       |       |
| 3       |       |       | $T_1$ | $T_2$ | $T_3$ | $T_1$ | $T_6$ |       |       |       |
| 4       |       |       |       | $T_1$ | $T_2$ | $T_3$ | $T_1$ | $T_5$ | $T_6$ |       |
|         |       |       |       |       | $T_1$ | $T_2$ | $T_3$ | $T_1$ | $T_5$ | $T_6$ |
|         |       |       |       |       |       |       | $T_1$ | $T_5$ | $T_6$ | $T_c$ |
|         |       |       |       |       |       |       |       | $T_1$ | $T_5$ | $T_c$ |
|         |       |       |       |       |       |       |       |       | $T_5$ | $T_c$ |
|         |       |       |       |       |       |       |       |       |       |       |

Ques

find the no. of clock cycles when segments & task are given.

$$T = 6 \quad S = 4$$

$$\begin{aligned} \text{clock cycles} &= k + (n-1) \\ &= 8 + (7-1) \Rightarrow 4 + 6 - 1 \\ &= 9 + 5 = 9 \text{ Ans} \end{aligned}$$

### ARITHMETIC PIPELINE

$$x = 0.9504 \times 10^3 \rightarrow \text{mantissa}$$

$$y = 0.8200 \times 10^2 \rightarrow$$

Step 1 : compare the power,  $x$  has less power  $(3-2)=1$   
so, we have to shift  $y$  by 1

$$x = .9504 \times 10^3 \quad y = 0.08200 \times 10^3$$

Step 2 : Now add both mantissas

$$z = 1.0324 \times 10^3$$

Step 3 : Normalize the result, so that it can be a fraction of with nonzero first digit

$$z = 0.10324 \times 10^4$$



### INPUT - OUTPUT ORGANISATION

#### Peripheral Devices

All the devices, that are connected to the computer system whether they are used for taking the input or displaying the output are called peripheral devices.

- All the input / output devices attached to the computer system, designed to read information into or out of the memory unit upon command from the CPU are called peripheral devices.
- peripheral that provides auxiliary storage → magnetic disk, tapes

### ① Monitor & keyboard :

- Video monitor are most commonly used peripherals, consist of keyboard as input device & display unit as output devices.
- most popular monitor is CRT (cathode ray tube)

### ② Printer :

- Printers provide a permanent record on paper of computer output data / text.

Type: daisywheel, laser, dot matrix

### ③ Magnetic tape & disk

- Magnetic tapes are used mostly for storing files of data for example a company's payroll record.
- Magnetic disks are used mostly for bulk storage of program & data

I/O INTERFACE



Block diagram

BD  
diagram

chip

Registers

- Input-output interface provides a method for transferring information b/w internal storage & external I/O devices.
- Peripherals connected to a computer need special communication links for interfacing them with CPU. The purpose of the communication link is to resolve the differences that exist b/w the central computer & each peripheral.

→ Major differences are:

- 1) Peripherals are electro-mechanical & electro-magnetic devices & their manner of operation is different from the operation of the CPU & memory, which are electronic devices. The conversion of signal values may be required.
  - 2) The data transfer rate of peripherals is usually slower than the transfer rate of CPU & consequently a synchronization mechanism may be needed.
  - 3) Data codes & formats in peripherals differ from the word format in the CPU & memory.
- The operating modes of peripherals are different from each other & each must be controlled so as not to disturb the operation of other peripherals connected to CPU.

So to resolve these differences, computer system include special hardware components b/w the CPU & peripherals to supervise & synchronize all input & output transfers, these components are interface units bcoz they interface b/w processor bus & peripheral devices.

block diagram



| CS | RS1 | RS0 | Register Selected                                    | impedance |
|----|-----|-----|------------------------------------------------------|-----------|
| 0  | X   | X   | Port A Register<br>(None data bus in high impedance) |           |
| 1  | 0   | 0   | Port B Register                                      |           |
| 1  | 0   | 1   | Port B Register                                      | "         |
| 1  | 1   | 0   | Control Register                                     | "         |
| 1  | 1   | 1   | Status Register                                      |           |

- The I/O data to and from the device can be transferred into either port A or port B. The interface may operate as output device or input device, or device that requires both I/O.
- If interface is connected to printer, it will only output data, & if its connected to keyboard it will input data.
- The control Register receives control information from the CPU.
- The bits in status register are used for status conditions for recording errors that may occur during the data transfer. For example, a status bit may indicate that port A has received a new data item from I/O device. Another bit in status register may indicate that a parity error has occurred during the transfer.

## ASYNCHRONOUS DATA TRANSFER

- Two units such as a CPU & I/O interface are designed independently of each other. If the registers in the interface shares a common clock pulse with the CPU registers, the data transfer b/w two units is called synchronous.  
In most cases the internal timing in each unit is independent from the other in that each uses its own private clock for internal registers.  
In that case the two units are said to be asynchronous to each other.

①

### Strobe control

- The strobe control method of asynchronous data transfer employs a single control line to time each transfer.
- The strobe (may be activated by source or destination) data bus carries the binary information from source unit to the destination unit. Typically, the bus has multiple lines to transfer an entire byte. The strobe is a single line that informs the destination unit when a valid data word is available in bus.
- The source unit first places the data on the data bus. After a brief delay to ensure that the data settle to a steady value, the source activates the strobe pulse.
- The information on the data bus & the strobe signal remain in the active state for a sufficient time period to allow destination unit to receive data.



(Block Diagram) ↑



(Timing Diagram) ↑

## Handshaking :

The disadvantage of the strobe method is that the source unit that initiates the transfer has no way of knowing whether destination unit has actually received the data item that was placed in the bus.

The 2 handshaking lines are data valid, which is generated by source unit, and data accepted generated by the destination unit.



↑ [Block diagram]

Timing Diagram →



## MEMORY ORGANIZATION IN COMPUTERS

### # Hierarchy Design

External or Secondary Memory

External or Primary Memory



### # Main Memory

- It's also called primary memory, which is used to store data, program or instruction during computer operation.

**RAM** : Random Access Memory

- It's volatile in nature (If power supply cuts, data loss).
- It stores data temporarily.
- It's used for booting or start the computer.

**SRAM (Static RAM)** : It uses transistors

- It has no. of flip flops containing 1 bit.

- It's faster.



- \* **D RAM** (Dynamic RAM) :
  - It uses capacitor & transistors
  - It contains thousands of memory cells.
  - It's slow than SRAM.

(2) ROM : Random Read only memory.

- It's non volatile
- here we can read the programs & data stored
- Information stored here is in binary form.

\* **MROM** → Masked ROM → Least cost ROM

\* **PROM** → Programmable ROM → here the user purchase the blank PROM & uses it to put required data in it, data can be erased once written.

\* **EPROM** → Erasable programmable ROM

- U can erase its content by exposing it to UV Rays

\* **EEPROM** → Electrically Erasable Programmable ROM

- written contents can be erased electrically (4-10)ms