



7.12.2024

Computer Organisation - study of the different hardware components & their interconnection to build a computer.

Computer Architecture :- study of the behaviour of the computer as seen by the user, along with system software like Operating Systems, ISA (instruction set architecture), compilers & device driver software.

→ most popular microprocessors → 8085 µP (first gen) (8 bit)

8086 µP (2nd) (16 bit)

dual core → 2 layers → \* 2 ALU

\* supports parallel processing

Registers are the smallest units of storage in a processor.



Registers

\* can be 16 bits, 32 bits or 64 bits wide

- Special Purpose Registers (SPR): for specific storage.
- General Purpose Registers (GPR): hold only intermediate data.

~~imp~~ \* Basic Operational Concept :-



↔ viceversa

| Address | Opcode | Operands   | Notes                                     |
|---------|--------|------------|-------------------------------------------|
| 1000    | ADD    | R0, R1, R2 | * registers (GPR) are denoted by R.       |
| 1004    | SUB    | A, B, C    | * A, B, C means data is stored in memory. |
| 1008    |        |            |                                           |
| 1012    |        |            |                                           |
| 1016    |        |            |                                           |



→ C.U (control unit) → generates the timing & control signals, paced by the clock pulses ; by sending these signals, it controls all the hardware.

→ A.L.U (arithmetic & logic unit) → it performs all the arithmetic & logic operations.

\* only combinational ckt's are present.

Registers - are the smallest storage units of a computer & are present inside the processor

- 2 types :-

(i) General Purpose Registers (GPR)

(ii) Special Purpose Registers (SPR)

General Purpose Registers → are used to hold data for temp. period.  
→ it is a memory.

Special Purpose Registers → e.g. PC, IR, MAR, MDR  
(Program Counter) PC → PC is an SPR which holds the address of the next instruction to be executed by the CPU.

Instruction Register (IR) → IR holds the currently executing instruction.

Memory Address Register (MAR) → MAR is an SPR which holds the address of the memory location to be accessed (read from or written to) by the CPU.

\* Memory Data Register (MDR) → MDR is a SPR which holds the data to be sent from memory to processor or vice-versa. (processor to memory)

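| Address | Opcode | Operand 1      | Operand 2      | Operand 3      |
|---------|--------|----------------|----------------|----------------|
| 2000    | ADD    | R<sub>0</sub>  | R<sub>1</sub>  | R<sub>2</sub>  |
| 2004    | SUB    | A              | B              | C              |
| 2008    |        |                |                |                |
| .       |        |                |                |                |

Memory accesses per instruction:

- ADD R<sub>0</sub>, R<sub>1</sub>, R<sub>2</sub> → 1 (read the instruction)
- SUB A, B, C → $1 + 1 + 1 + 1 = 4$ (read the instruction, read A, read B, write in C)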

\*imp

## Instruction Execution Steps :-

### ① Fetch

The address of the 1st instruction is stored in the PC.

The OS gives the address of the 1st instruction to the PC ; the PC sends this address to the MAR.

Any address the processor wants to send to memory is first sent to the MAR ; the MAR then sends it to primary memory.

Primary memory then looks up that address & the instruction (data) found there is transferred to the MDR,  
then from the MDR to the IR.

While the 1st instruction is being read, the PC is incremented, so it already holds the address of the 2nd instruction.

PC → MAR → PM (Primary mem) → MDR → IR

(PM searches for the instruction at that address ; then the PC is incremented to store the address of the next instruction.)

### ② Decode

When the CPU gets the instruction, it first has to understand how to process it.

The CU does the decoding.

decode → ADD

it'll understand - source R<sub>0</sub>, R<sub>1</sub> (inside CPU)

it'll understand - destination R<sub>2</sub> (inside CPU)

### ③ Execute

The ALU gets the values of R<sub>0</sub> & R<sub>1</sub> and the operation (ADD) & performs the operation.

### ④ Write the result

ALU writes the result in the destination R<sub>2</sub>.
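
The four steps above can be sketched as a toy simulation. This is only an illustration under assumptions: the memory contents, the register values, the tuple encoding of an instruction and the 4-byte step of the PC are made up for the example and are not part of the notes.

```python
# Toy model of the fetch / decode / execute / write steps.
# Memory layout, register values and the tuple encoding are illustrative assumptions.

memory = {
    1000: ("ADD", "R0", "R1", "R2"),   # R2 = R0 + R1
    1004: ("SUB", "R2", "R0", "R3"),   # R3 = R2 - R0
}
gpr = {"R0": 5, "R1": 7, "R2": 0, "R3": 0}                 # general purpose registers
spr = {"PC": 1000, "MAR": None, "MDR": None, "IR": None}   # special purpose registers

# The ALU is modelled as a table of pure (combinational) functions.
ALU = {"ADD": lambda a, b: a + b, "SUB": lambda a, b: a - b}


def step():
    """Run one instruction through fetch, decode, execute and write."""
    # 1. Fetch: PC -> MAR -> primary memory -> MDR -> IR, then increment the PC.
    spr["MAR"] = spr["PC"]
    spr["MDR"] = memory[spr["MAR"]]
    spr["IR"] = spr["MDR"]
    spr["PC"] += 4                      # assumed instruction size of 4 bytes
    # 2. Decode: split the instruction into opcode, sources and destination.
    opcode, src1, src2, dest = spr["IR"]
    # 3. Execute: the ALU operates on the source register values.
    result = ALU[opcode](gpr[src1], gpr[src2])
    # 4. Write the result into the destination register.
    gpr[dest] = result


step()   # executes ADD R0, R1, R2
step()   # executes SUB R2, R0, R3
print(gpr, spr["PC"])
```

After the two calls, R2 holds 5 + 7 and R3 holds R2 - R0, while the PC already points at the next address (1008), matching the note that the PC is incremented during fetch.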

SUB A, B, C

① fetch

② Decode - subtract

source - A, B (memory)

Destination - C (memory)



The ALU operates: e.g. $A - B = 30$ ; the CPU sends $30$ to the MDR & the address of $C$ to the MAR, so the result is written to memory location $C$.



\* CPU sends a control signal to memory to signify whether read or write operation.

Interrupt :- external signal which suspends the currently executing process.

It is handled by a program called the 'Interrupt Service Routine' (ISR)

↓  
it serves the interrupt

Whenever an interrupt comes to the CPU

"it'll store all intermediate information in a stack."

then allow the interrupt program (ISR) to execute.

→ then after completion of the ISR,  
it loads the saved intermediate  
information back to the desired locations & allows the  
process to continue executing from where it stopped.
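
A minimal sketch of the save / serve / restore sequence, assuming the CPU state can be modelled as a dictionary of register values. The names `handle_interrupt` and `keyboard_isr`, and the register values, are hypothetical and chosen only for the illustration.

```python
# Sketch of interrupt handling with a stack; all names here are hypothetical.

stack = []


def handle_interrupt(cpu_state, isr):
    """Save the CPU state, run the ISR, then restore the saved state."""
    stack.append(dict(cpu_state))     # store intermediate information on the stack
    isr(cpu_state)                    # the ISR serves the interrupt
    cpu_state.clear()
    cpu_state.update(stack.pop())     # reload the saved information
    return cpu_state


def keyboard_isr(state):
    """Example ISR: it freely overwrites registers while it runs."""
    state["PC"] = 9000
    state["R0"] = 0


cpu = {"PC": 1004, "R0": 5, "R1": 7}
handle_interrupt(cpu, keyboard_isr)
print(cpu)   # the interrupted process resumes with its original state
```

Because the state is pushed before the ISR runs and popped afterwards, the interrupted process continues from where it stopped even though the ISR changed the registers.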

A series of wires is used to transfer data  
 16 bit = 16 wires  
 8 " = 8 "

BUS - is a series of wires used to establish the connection + communication between the different hardware components present in the computer.

2 types :- (I) Internal BUS - internal connections (inside the CPU)  
 (II) External BUS - external connections

### CLASSIFICATION :-

Data & Address are transferred through BUS.

So ; types of BUS are :-

- (i) Data BUS - to transfer data
- (ii) Address BUS - to transfer address
- (iii) Control BUS - to transfer control signals.



- to transfer data using same wires.

Disadvantages of Single BUS :-

- \* only one communication path, so only one transfer can take place at a time
- \* other devices can't communicate while a transfer is in progress.

### Buffer

- \* a very small memory, basically attached to low-speed hardware, that facilitates data transfer between two different hardware devices of dissimilar speed.

14.12.2024 Basic Performance Equation :-

$$T = \frac{N \times S}{R}$$

T → Time taken to execute a task .

N → Number of instructions in a task .

S → Average Number of clock cycles per instruction .

R → Clock rate / frequency

( $N \times S$ → total no of clock cycles for 'N' instructions)

(Clock rate means how many clock pulse generated in 1 sec)

$$\text{Clock Time (P)} = \frac{1}{R}$$

or  $T = N \times S \times P$

| R                 | P    |
|-------------------|------|
| 1 kHz = $10^3$ Hz | 1 ms |
| 1 MHz = $10^6$ Hz | 1 µs |
| 1 GHz = $10^9$ Hz | 1 ns |

Q → If R is 2.4 GHz ; then calc P.

$$2.4 \text{ GHz} = 2.4 \times 10^9 \text{ Hz}$$

$$P = \frac{1}{R} = \frac{1}{2.4 \times 10^9} \text{ s} = \frac{1}{2.4} \times 10^{-9} \text{ s} \approx 0.42 \text{ ns}$$

Q - A task contains 200 instructions & each instruction takes on average 8 clk cycles to execute. If the system operates with a clock rate of 400 MHz, calc the time reqd. to execute the task.

$$T = \frac{N \times S}{R}$$

$$N = 200$$

$$S = 8$$

$$R = 400 \text{ MHz} = 400 \times 10^6 \text{ Hz}$$

$$T = \frac{200 \times 8}{400 \times 10^6} \text{ s}$$

$$T = 4 \ \mu\text{s}$$
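
The two calculations above can be checked with a short script. It is a sketch of $T = \frac{N \times S}{R}$ and $P = \frac{1}{R}$ only; the function names `clock_period` and `execution_time` are chosen here for illustration.

```python
def clock_period(rate_hz):
    """P = 1 / R, in seconds."""
    return 1.0 / rate_hz


def execution_time(n_instructions, cycles_per_instruction, rate_hz):
    """T = (N x S) / R, in seconds."""
    return n_instructions * cycles_per_instruction / rate_hz


p = clock_period(2.4e9)              # R = 2.4 GHz
t = execution_time(200, 8, 400e6)    # N = 200, S = 8, R = 400 MHz

print(f"P = {p * 1e9:.3f} ns")       # P = 0.417 ns
print(f"T = {t * 1e6:.1f} us")       # T = 4.0 us
```

Keeping R in Hz inside the functions avoids mixing MHz with µs by hand; the unit conversion happens only when printing.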

Q - Let a processor operate at a freq. of 10 MHz & execute a program having 90 instructions, out of which 50% are register reference inst., 30% are memory trans. inst. & 20% are branch inst.

Register ref., memory & branch inst. take 4, 8 & 6 clk cycles respectively. Find out the total time taken by the processor to execute the program.

Ans. $R = 10 \text{ MHz}$

$$N = 90$$

$$\text{register ref. inst.} = 50\% \text{ of } 90 = 45 = N_1$$

$$\text{memory inst.} = 30\% \text{ of } 90 = 27 = N_2$$

$$\text{branch inst.} = 20\% \text{ of } 90 = 18 = N_3$$

$$S = \frac{45(4) + 27(8) + 18(6)}{90} = \frac{504}{90}$$

$$= 5.6$$

$$T = \frac{N \times S}{R} = \frac{90 \times 5.6}{10 \times 10^6} \text{ s} = 50.4 \ \mu\text{s}$$

or

$$R = 10 \text{ MHz}$$

$$P = \frac{1}{R} = 0.1 \ \mu\text{s}$$

$$T = N \times S \times P$$

$$\begin{aligned}T_1 &= 45 \times 4 \times 0.1 \ \mu\text{s} \\&= 18 \ \mu\text{s}\end{aligned}$$

$$\begin{aligned}T_2 &= 27 \times 8 \times 0.1 \ \mu\text{s} \\&= 21.6 \ \mu\text{s}\end{aligned}$$

$$\begin{aligned}T_3 &= 18 \times 6 \times 0.1 \ \mu\text{s} \\&= 10.8 \ \mu\text{s}\end{aligned}$$

$$\begin{aligned}T &= T_1 + T_2 + T_3 \\&= 50.4 \ \mu\text{s}\end{aligned}$$
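
Both methods for the instruction-mix question can be reproduced in a few lines. This is a sketch under the assumption that the mix is given as (count, cycles) pairs; the name `mixed_execution_time` is hypothetical.

```python
def mixed_execution_time(mix, rate_hz):
    """Return (average cycles per instruction S, total time T in seconds).

    mix is a list of (instruction_count, cycles_per_instruction) pairs.
    """
    total_cycles = sum(count * cycles for count, cycles in mix)
    total_instructions = sum(count for count, _ in mix)
    average_cycles = total_cycles / total_instructions   # S
    return average_cycles, total_cycles / rate_hz        # T = (N x S) / R


# 45 register-reference, 27 memory and 18 branch instructions at 10 MHz.
mix = [(45, 4), (27, 8), (18, 6)]
s, t = mixed_execution_time(mix, 10e6)

print(f"S = {s:.1f} cycles/instruction")   # S = 5.6 cycles/instruction
print(f"T = {t * 1e6:.1f} us")             # T = 50.4 us
```

Summing the cycles per class first (180 + 216 + 108 = 504) and dividing by R gives the same T as averaging S and then using N × S / R, which is why both worked solutions agree.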

$$\boxed{\downarrow T = \downarrow \frac{N \times S}{R}}$$

16.12.2024 Criteria for Performance Enhancement :-

Perf. can be optimised by doing some hardware & software modifications:-

1) Inclusion of cache memory in the processor.



The processor is much faster than PM ; so fetching instructions & data from PM slows down execution.

Cache is a very small memory present inside the processor to reduce the memory access time  
(if PM is 8 GB ; then cache is about 4 MB)

There are 2 reasons for reduction in fetch time.

(i) Cache memory is very fast.

(ii) Cache memory is present inside the processor.

### Locality of Instruction (Property)

About 90% of the instructions of a program execute for only 10% of the time & about 10% of the instructions execute for 90% of the time.

2) Pipeline of Processor :-

\* Overlapping is done



↓ efficient way



18-12-24

Pipeline is a hardware enhancement to achieve parallel processing in a single processor by logically dividing the processor into diff modules & executing multiple instructions simultaneously in an overlapped fashion.

So at any particular time, multiple instructions are being executed in the processor simultaneously.





Pipeline

| Instr. | 1     | 2     | 3     | 4     | 5     | 6     |
|--------|-------|-------|-------|-------|-------|-------|
| $I_1$  | $F_1$ | $D_1$ | $E_1$ | $W_1$ |       |       |
| $I_2$  |       | $F_2$ | $D_2$ | $E_2$ | $W_2$ |       |
| $I_3$  |       |       | $F_3$ | $D_3$ | $E_3$ | $W_3$ |

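For 100 instructions with 4 stages: non-pipelined = $100 \times 4 = 400$ clk ; pipelined = $4 + 99 = 103$ clk

$$\text{Speedup} = \frac{400}{103} \approx 3.88 \approx 4$$

$$\text{Speedup} = \frac{n \times k}{k+(n-1)}$$

$n$ = no of instructions

$k$ = no of stages in the pipelined processor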
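
The cycle counts and the speedup can be checked with a short script. It assumes an ideal pipeline (one clock cycle per stage, no stalls) and takes the speedup as the ratio of non-pipelined to pipelined cycles, i.e. $\frac{n \times k}{k + (n - 1)}$; the function names are chosen for this illustration.

```python
def non_pipelined_cycles(n, k):
    """Each of the n instructions occupies all k stages by itself."""
    return n * k


def pipelined_cycles(n, k):
    """The first instruction needs k cycles; every later one finishes 1 cycle after."""
    return k + (n - 1)


def speedup(n, k):
    """Ratio of non-pipelined to pipelined cycle counts (ideal pipeline, no stalls)."""
    return non_pipelined_cycles(n, k) / pipelined_cycles(n, k)


n, k = 100, 4
print(non_pipelined_cycles(n, k))   # 400
print(pipelined_cycles(n, k))       # 103
print(f"{speedup(n, k):.2f}")       # 3.88
```

As n grows, the speedup approaches k (here 4), which is why the result for 100 instructions is rounded to about 4.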

Superscalar pipeline :-

(Pipeline of pipeline)

Non-Pipeline

| clk    | 1     | 2     | 3     | 4     | 5     | 6     | 7     | 8     | 9 | 10 | 11 |
|--------|-------|-------|-------|-------|-------|-------|-------|-------|---|----|----|
| stage  | $F_1$ | $D_1$ | $E_1$ | $W_1$ | $F_2$ | $D_2$ | $E_2$ | $W_2$ |   |    |    |
| instr. | $I_1$ | $I_1$ | $I_1$ | $I_1$ | $I_2$ | $I_2$ | $I_2$ | $I_2$ |   |    |    |

Pipeline

| Instr. | 1     | 2     | 3     | 4     | 5     | 6     |
|--------|-------|-------|-------|-------|-------|-------|
| $I_1$  | $F_1$ | $D_1$ | $E_1$ | $W_1$ |       |       |
| $I_2$  |       | $F_2$ | $D_2$ | $E_2$ | $W_2$ |       |
| $I_3$  |       |       | $F_3$ | $D_3$ | $E_3$ | $W_3$ |

Superscalar Pipeline (with dual core processor)

| Instr. | 1     | 2     | 3     | 4     | 5     | 6     |
|--------|-------|-------|-------|-------|-------|-------|
| $I_1$  | $F_1$ | $D_1$ | $E_1$ | $W_1$ |       |       |
| $I_2$  | $F_2$ | $D_2$ | $E_2$ | $W_2$ |       |       |
| $I_3$  |       | $F_3$ | $D_3$ | $E_3$ | $W_3$ |       |
| $I_4$  |       | $F_4$ | $D_4$ | $E_4$ | $W_4$ |       |
| $I_5$  |       |       | $F_5$ | $D_5$ | $E_5$ | $W_5$ |
| $I_6$  |       |       | $F_6$ | $D_6$ | $E_6$ | $W_6$ |
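
The three timing diagrams can be compared by counting clock cycles. The sketch below assumes an idealised superscalar pipeline in which `width` instructions enter the pipeline together in every clock cycle and there are no stalls; the formula $k + \lceil n / \text{width} \rceil - 1$ follows from that assumption and is not stated in the notes.

```python
import math


def non_pipelined_cycles(n, k):
    """n instructions, each taking k cycles one after another."""
    return n * k


def superscalar_cycles(n, k, width):
    """Idealised count: `width` instructions start per clock cycle, no stalls.

    With width = 1 this reduces to the ordinary pipeline count k + (n - 1).
    """
    return k + math.ceil(n / width) - 1


n, k = 6, 4   # 6 instructions, 4 stages (F, D, E, W)
print(non_pipelined_cycles(n, k))          # 24
print(superscalar_cycles(n, k, width=1))   # 9  (ordinary pipeline)
print(superscalar_cycles(n, k, width=2))   # 6  (two pipelines side by side)
```

Real processors rarely reach these numbers because dependent instructions cannot always be issued together, so the counts are best read as lower bounds.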

Performance Eqn for Pipeline :-

$$T = \frac{N \times S}{R} \quad S \rightarrow 1 \text{ ( 1 clk pulse)}$$

So eqn becomes

$$T = \frac{N}{R}$$

### ③ Clock rate

- \* hardware enhancement
- \* enhancement in 'T'. (time taken)

### ④ Compiler

- \* software enhancement.
- \* 'N' will be reduced.

↳ by using an advanced & efficient compiler;  
the no of instructions per program (N) is  
reduced.



Types of Instruction Set Architecture / Assembly Language Instructions :-

Every system follows one of 2 different types of ISA :-

| <u>RISC</u>                                                                                            | <u>CISC</u>                                                 |
|--------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|
| * Reduced Instruction Set Computers                                                                    | * Complex Instruction Set Computers                         |
| * $C = A + B$ needs four instructions:<br>LOAD A, R1<br>LOAD B, R2<br>ADD R1, R2, R3<br>STORE R3, C    | * $C = A + B$ needs only one instruction:<br>ADD A, B, C    |
| * also known as Load-Store Architecture because memory interaction is only done by LOAD & STORE        | * no such name                                              |
| * each instruction supports at most one memory interaction.                                            | * an instruction can have any no of memory interactions.    |
| * Size of instruction is less.<br>(it deals with registers)                                            | * Size of instruction is more.<br>(it deals with addresses) |
| * pipelining is easy to implement                                                                      | * pipelining is difficult to implement                      |
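
The $C = A + B$ example from the table can be traced with a toy machine that records how many data-memory accesses each instruction makes. The `Machine` class, its method names and the values of A and B are assumptions made for this sketch; instruction fetches are not counted.

```python
class Machine:
    """Toy machine that logs the data-memory accesses of each instruction."""

    def __init__(self, memory):
        self.memory = dict(memory)
        self.registers = {}
        self.accesses = []          # one entry per executed instruction

    # RISC-style instructions: only LOAD and STORE touch memory.
    def load(self, addr, reg):
        self.registers[reg] = self.memory[addr]
        self.accesses.append(1)

    def store(self, reg, addr):
        self.memory[addr] = self.registers[reg]
        self.accesses.append(1)

    def add_reg(self, src1, src2, dest):
        self.registers[dest] = self.registers[src1] + self.registers[src2]
        self.accesses.append(0)

    # CISC-style instruction: operands and result all live in memory.
    def add_mem(self, src1, src2, dest):
        self.memory[dest] = self.memory[src1] + self.memory[src2]
        self.accesses.append(3)     # read src1, read src2, write dest


data = {"A": 50, "B": 20, "C": 0}

risc = Machine(data)
risc.load("A", "R1")
risc.load("B", "R2")
risc.add_reg("R1", "R2", "R3")
risc.store("R3", "C")

cisc = Machine(data)
cisc.add_mem("A", "B", "C")

print(risc.memory["C"], risc.accesses)   # 70 [1, 1, 0, 1]
print(cisc.memory["C"], cisc.accesses)   # 70 [3]
```

Both machines compute the same result; the RISC version uses four short instructions with at most one memory interaction each, while the CISC version uses one instruction with three.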

H-W 10 diff. b/w RISC & CISC

\* imp

| Feature                | RISC                                    | CISC                                    |
|------------------------|-----------------------------------------|-----------------------------------------|
| 1. Instruction Format  | fixed instruction size                  | variable instruction size               |
| 2. Pipelining          | easier to implement                     | difficult to implement                  |
| 3. Registers           | more registers                          | fewer registers                         |
| 4. Memory Usage        | needs more RAM (programs are longer)    | needs less RAM                          |
| 5. Execution Time      | about one instruction per clock cycle   | multiple cycles per instruction         |
| 6. Power efficiency    | consumes less power                     | consumes more power                     |
| 7. Cost                | cheaper to design                       | expensive                               |
| 8. Compiler usage      | relies heavily on compiler optimization | less dependent on compiler optimization |
| 9. Addressing Modes    | fewer addressing modes                  | many addressing modes                   |
| 10. Example            | ARM, MIPS                               | x86, Intel 8086                         |

ARM → Advanced RISC Machine

MIPS → Microprocessor without Interlocked Pipelined Stages

## Classifications of Computers :-

- 1) Von-Neumann Architecture
- 2) Harvard's Architecture

Von Neumann was the 1st scientist who proposed storing the program (instructions) along with the data in the computer's memory (stored-program concept).

### Von-Neumann Architecture



But the Harvard architecture divides the memory into 2 parts -

(i) Instruction Memory

(ii) Data Memory

