

# Computer Organisation

Prof  
Ram  
Sarkar

01/08/2019

Books :- Computer Organisation &  
Architecture + Wilson &  
Stallings

William-Stalling - Computer.Org.



- IBM - 370 Architecture

- Hierarchy      for eg. C → Procedural programming

↓  
Memory

↓  
Secondary Memory,  
Primary Memory, Cache, etc .

- 1. Separated into blocks.
- 2. Reuse.
- 3. Error detection becomes easy.

- Structure & Function

Consistent data is highest priority.

Peripherals

Communication  
lines

Computer

→ Storage  
→ Processing

Movement

Control

Storage

Processing

Compiler

- Acts as a translator transforms HLL into Machine Oriented Language.
- Helps the programmer ignore machine dependent details.



Compiler → Assembly Level Language

Compiler → Relocatable binary (Relative Address).

Compiler → Absolute binary. (Absolute → Fixed address).

Structure of Compiler

→ Compiler generates a symbol table.  
(Refer to the example).

## Operating System

Main Goals { → Resource Management  
→ Resource Abstraction  
→ Virtualization.

### Execution Mode

- User Mode
- Kernel (or Supervisor) Mode

Process → Instance of a program in execution

- Process image
  - static & state instr.
  - Process switch.
- encompasses static concept of program & dynamic aspect of execution

### Threads

- Division of process.

### Process Execution

### Process State Transition Diagram



other states → New → Exit



## Context switch



## Concurrent Processes

Uniprocessor → Time sharing using Multiprogramming  
 Multiprocessor → True Parallel execution

↓  
 Round Robin Scheduling

## Scheduling

- Process (thread) v/s Preemptive
- FCFS & RRS.

## Goals

- Avoid starvation (be "fair").
- optimize throughput.

## Deadlock

- When two dependent processes are in blocked state.

## Deadlock v/s Starvation

→ Von Neumann Architecture

→ Instruction Cycle

- Basic
- Exception

→ Instruction Architecture

- Software design
- Hardware circuit

→ Digital Design

Eg. ADD

Opcode Operand(Address)

→ Different addressing mode.

opcode Mode op(Address) Mode OP(Address)

Computer system



Control Unit

connect every other component of the computer.

①

① Data & instruction are both stored in main memory.

② Content is addressable by location



Basic Architecture

③ Instructions are executed sequentially under order until explicitly modified.

④

⑤ component

1. CPU
2. Main Memory
3. I/O Device
4. Main Storage
5. Interconnect Network

## Instruction Cycle

### Basic

All instructions are stored in memory



~~All instructions~~

→ All instructions are stored in memory

### Intermediate view

#### Complete instruction

- operation code
- addressing mode
- zero (or) more operands.

#### Exception

Eg. Addressing :- Memory does not exist or inaccurate



Address

Faulty op code

Execution



At the hardware level, they are implemented using XNOR, OR, XOR, NOT, ~~.....~~, etc.

Data → signals

Combination of gates  
→ Integrated Circuits (ICs).

→ LSI, VLSI, ULSI.

Design complexity

→ Control Unit

- ① Manages the data stored in registers & accumulators stages, must decode instructions, communication with RAM addresses.
- ② Each operation requires separate circuitry
- ③ Necessary to have knowledge of various data representations.

ALU

- CISC (Complex instruction set)
- RISC (Reduced instruction set)

## Instruction Set Design

- Instruction type
- No. of address
- Addressing modes.
- Instruction support data-types
  - ① Characters
  - ② Integers
  - ③ Floating Point type
- Instruction Overload
- Separate Instruction eg. MIPS

- Arithmetic & Logic (4)
- Data Movement
- I/O (Memory-mapped, isolated I/O)
- Flow control
  - Branches
  - Procedural calls

## Number of Address

- ~~3~~ 3-address machine
- 2 address machine
- 1 address machine
- 0 address machine.

$$\text{eg. } Y = \frac{A - B}{C + (D \times E)}$$

(eg) 3-address

SUB Y, A, B.

MUL ~~Y~~, T, D, E

ADD T, T, C

DIV Y, Y, T

## Addressing Modes

- Operands placed in
  - Registers
  - Part of instruction
  - Memory

→ Common addressing modes,

→ Implied addressing

→ Immediate addressing

- Direct addressing



opcode operands

Addressing Modes

→ Place where operand is stored.

CLC; ← implied addressing

ADD 5; ← immediate addressing

→ direct addressing, indirect addressing.

→ register addressing

→ Register - direct



→ Register - indirect



→ Displacement addressing

→ Stack addressing

implicit



First In Last Out  
Last In First Out

EA → effective address

William Strattling → Chapter - 2

→ IAS computer. ( Von Neumann Architecture)

→ Structure of IAS computer

→ IAS instruction set

Contains the  
opcode



②

$M(x) \rightarrow$  memory contents at  $x$ .  
Conditional branch  $\rightarrow$  [ Left instruction  
                          ] Right instruction

### IAS Operations

- Problems with clock speed & Login Density.
  - Power
  - RC Delay
  - Memory Latency

24/8/2019

# Computer Organization

①

Post  
Ram  
Sarkar

## Booth's Algorithm

(S) + (P)

M A



②

Example

$$(-7) \times (3)$$

$$M \quad Q$$

2's complement  $\rightarrow 1001$ 

$$M = 0011$$

| <u>n</u>    | A                       | Q                       | $q_0$ | Action                 |
|-------------|-------------------------|-------------------------|-------|------------------------|
| 4           | 0000                    | <u>0011</u>             | 0     | Initialization         |
| 3 (ruleone) | <del>0111</del><br>0011 | 0011                    | 0     | $A = A - N$            |
| 2           | 0001                    | <del>1100</del><br>1100 | 1     | Right shift $A \& q_0$ |
|             | 1010                    | 1100                    | 1     | $A = A + N$            |
| 1001        | 1                       | <del>0101</del><br>1101 | 0     | RS $A \& q_0$ .        |
| 0           | 1110                    | 1011                    | 0     | RS $A \& q_0$          |
|             | check 2's<br>complement |                         |       |                        |
|             | -21                     |                         |       |                        |

## Restoring Division Algorithm.

(9)



11/3

$$= 3(Q) \\ 2(R)$$

11 → Dividend

3 → Divisor

~~M = 1101~~

~~M = 11101~~

3(Q), 2(R).

2

$$\begin{array}{r} 5 \\ -3 \\ \hline 2 \end{array}$$

$$\begin{array}{r} 3 \\ -1 \\ \hline 2 \end{array}$$

3

5

3

$$\begin{array}{r} 2 \\ -1 \\ \hline 1 \end{array}$$

| n | M     | A     | Q                 | Action         |
|---|-------|-------|-------------------|----------------|
| 4 | 00011 | 00000 | 1011              | Initialization |
| 3 | 0     | 00001 | 011-              | SLAQ           |
| 2 | 11110 | 011-  | A = A - M         |                |
| 1 | 00001 | 0110  | Q[0] ← 0, Restore |                |
|   | 00010 | 110-  | SLAQ              |                |
|   | 11111 | 110-  | A = A - M         |                |
|   | 00010 | 1100  | Q[0] ← 0, Restore |                |
|   | 00001 | 00101 | 100-              | SLAQ           |
|   | 00001 | 00101 | 100-              | A = A - M      |
|   | 1     | 00010 | 1001              | Q[0] = 1       |
|   | 00101 | 001-  | SLAQ              |                |
|   | 00010 | 001-  | A = A - M         |                |
|   | 0     | 00010 | 0011              | Q[0] = 1       |

Q = 0011 = 3

R = 00010 = 2



## Non-restoring Division Algorithm



| n | M     | A     | Q    | Action              | Index   | Parity |
|---|-------|-------|------|---------------------|---------|--------|
| 4 | 00011 | 00000 | 1011 | Initialization      |         | 3      |
|   | 00001 | 011-  |      | SL A Q              | (Add 1) | 11     |
|   | 11110 | 011-  |      | A = A - M           |         |        |
| 3 | 11110 | 0110  |      | Q <sub>0</sub> ← 0  |         |        |
|   | 11100 | 110-  |      | SL A Q              | (Add 1) | 31     |
|   | 11111 | 110-  |      | A = A + M           | (Add 1) | 3      |
| 2 | 11111 | 1100  |      | Q <sub>0</sub> ← 0  |         |        |
|   | 10000 | 100-  |      | SL A Q              |         | 34     |
|   | 11111 |       |      |                     |         |        |
|   | 00010 | 100-  |      | A = A + M           |         |        |
| 1 | 00010 | 1001  |      | Q <sub>0</sub> ← 1  |         |        |
|   | 00101 | 001-  |      | SL A Q              |         |        |
|   | 00010 |       |      |                     |         |        |
|   | 00010 | 001-  |      | A = A - M           |         |        |
| 0 | 00010 | 0011  |      |                     |         |        |
|   | 00010 | 0010  |      | Q <sub>0</sub> ← 01 |         |        |

$$R = \textcircled{2}$$

$$\textcircled{2}$$

$$\textcircled{3} = Q.$$



①

Control Design

[JB Hayes]

- Digital Processing Unit (DPU)
- Control Unit (CU)
- DPU is reconfigured by the CU to perform a set of (micro) operations.

CU

- Instruction Sequencing → methods by which instructions are selected for execution.
- Instruction Interpretation → methods used for activating the control signals.

Instruction Sequencing

- if → Memory becomes a problem. The instruction length becomes increased, which in turn increase the cost of memory it is stored.

Soln. PC (Program Counter).

PC contains A of  $\epsilon_1$ .

$PC \leftarrow PC + K$ . ( $K$  is the word length of  $\epsilon_1$ )

- Increment is automatic
- However, the actual address must be stored in PC.

$PC \leftarrow X$ .

Before PC, you need a buffer.

②

Conditional branch test  
the value of condition C

If C is present

PC  $\leftarrow$  X

else

PC is incremented

to point to the next instruction

### Program Control Transfer

- Often it is required to transfer control from P1 to P2 due to subroutine call / interrupt.
- In case of interrupt, call instruction is replaced by interrupt signal.

### Control Stack

- Stack stores return address.
- Also used to store pass variables from P1 to P2, and variables local to P2.
- LIFO  $\rightarrow$  Last In First Out
- Pop out using return.

### Stack Implementation

- Pop & Push
- Smaller stack: Shift Register
- Larger stack: RAM
- Stack pointer contains the pointer acting as top of the stack.



## Instruction Interpretation



## Behaviour of CU

→ 3 states identified by  $C_{ij}$ .

## Implementation methods

→ Hardwired CU (HCU)

## Goal

→ minimize the no. of components

→ maximize the speed of operation.

→ lack of structure makes HCU costly to design & debug.

## Microprogramming

→ A (micro)instruction stored in specially addressable memory called a ~~soft~~ Control Memory (M).

→ flexible & systematic.

## Emulation & Limitation

## Microprogram

→ Sequence of microinstructions needed to execute a particular operation.

(A)

## Hardwired Control

### 3 main methods

- 1) state-table method (Consult sir's slides)
- 2) Delay-element method (Consult sir's slide)
- 3) Sequence-Counter Method

30/8/19

↳ ① Many digital circuits are designed to perform a small no. of actions repeatedly.

$$C_{in} = \{ C_{in}', C_{in}'' \}$$

→ CU based sequence counter is used.



→ Cascade of delay elements can be made to behave like a sequence counter

## Micro-programmed control

- every instruction is implemented using no. of operations
- Information is stored in RAM/ROM, called control memory (CM).
- each microinstruction explicitly/implicitly specifies the next microinstruction to be used.
- set of related micro-instructions → microprogram

## Witke's Design

- Control field
- Address field.

05/5/19

## Computer Organisation

Prof -  
Ran  
Sarkar

### Area of concern

- Word length of micro-instruction
- Way control info. is represented
- ~~Way next micro~~

Class Test.  
upto Control  
unit.  
19th Sept, 2019  
2:30 → 3:30  
(upto 6/9/19).

### Control Memory

- ROM (diode)
- RAM (ferrite core)

ROS → Read Only Store

### Wilkes Idea :- Writable CM (WCM)

1. Dynamically microprogrammable.
2. Computer with WCM :- no instruction set in real sense.

### Parallelism in Microinstructions



Micro-instruction format  
IBM sys. 370

Unencoded format :- Control signals may be derived directly from micro-instruction.

MI is loaded into CM

Data register (called MIR)  
[microinstruction Register]

②

## Horizontal Microinstructions

(i) Long formats

→ Parallelism.

(ii) ...

→ Standard format for stored branch

## Vertical Microinstructions

(i) Short formats

→ No parallelism.

## Micro-instruction Addressing

→ Microprogram counter

(μPC) for branching

→ Analogous to PC at instruction level

μPC → used as CM address register

## Conditional Branching

→ flag generated by DPV.

if several such conditions exist,

$$R \leftarrow f(R_1, R_2)$$

Register

Transfer

Operation

## 4-phases

→ Refer to sir's slide.

Phase  $\phi_1$

## Micro-operation timing

→ Fetch & execute steps can be overlapped.

→ Replace μIR by IM

a pair of registers forming a two segment pipeline.

Phase  $\phi_2$

first μIR<sub>1</sub> & μIR<sub>0</sub>.

Phase  $\phi_3$

Phase  $\phi_4$



### Residual Control

MI is used to allocate a resource e.g. connection b/w 2 units may be established by MI and maintained via Residual control.

### Control Unit Organization

(Please refer to the diagram)

Prot.  
Rak  
Sarkar

04/01/19

### Nano-programmed Computer

- Typical microprogramme are encoded in vertical format.
- They have highly parallel horizontal format.

#### Advantages

- Reduce size of chip.
- Greater design flexibility resulting from the loosening of the bonds b/w instruction & hardware.

#### Disadvantage

- Reduction in speed due to extra memory & more complex CU organisation.

(For the Math, refer to sir's slides)

(4)

$$S_2 = H_M (2 \log_2 H_M + \log_2 r + rN).$$

$$\begin{aligned} H_M &\rightarrow 650 \\ r &\rightarrow 0.4 \\ N &\rightarrow 70 \\ H_N &= 260 \end{aligned}$$

20 12.11

$$S_1 = H_H (N + [\log_2 H_H])$$

$$\Rightarrow S_2 = 650 ([2 \log_2 (650)] + [\log_2 (0.4)] + 0.4 \times 70)$$

$$S_1 = 650 (70 + [\log_2 650])$$

$$\Rightarrow S_1 = \frac{80 \times 650}{80 \times 650} = \frac{52000}{2}$$

$$\underline{S_2 = 29,900}$$

$$20 - 2 + 28$$

$$= \frac{20 + 26}{= 46}$$

$$\underline{S_2 < S_1}$$

### Key characteristics of Memory Systems

[Refer to William Stallings]

#### Method of Accessing

##### Units of Data

Linked List → Sequential Access

Array → Random Access

Accessing a variable → Direct Access.

non structured

Acc Key → Associative

#### Capacity & Performance

16/08/2019

Computer Organisation

(1)

Arithmetic

- Addition/ Subtraction of Signed Numbers.
- Propagation of carry.
- n-bit adder.
- k n-bit adder
- n-bit subtractor

Proof  
Ram  
Sarkar

concept of  
overflow &  
underflow

$$S_i = x_i \oplus y_i \oplus c_i$$

$$c_{i+1} = x_i y_i + y_i c_i + c_i x_i$$

'Control-Line' in adder/subtractor

- Overflow occurs when sign of the result & sign of the operands is different.

$$\text{Overflow} = x_{n-1} y_{n-1} s_{n-1} + \bar{x}_{n-1} \bar{y}_{n-1} s_{n-1}$$

$$\text{Overflow} = c_n \oplus c_{n+1}$$

For - Sum

Time →

1 Gate Delay

For Carry

2 Gate Delays.



For a n-bit  
adder

$s_{n-1}$  available  
after  $2n-1$

$c_n$  is available  
after  $2n$   
Gate Delays

②

$$S'_i = x_i \oplus y_i \oplus c_i$$

$$c_{i+1} = x_i y_i + (x_i + y_i)c_i$$

$$c'_{i+1} = \underline{G_i} + \underline{P'_i c_i}$$

$$G_i = x_i y_i$$

$$P'_i = x'_i + y'_i$$

$G_i \rightarrow$  generate fn.

$P'_i \rightarrow$  propagate fn.

### → Carry Lookahead Adder

$$c'_{i+1} = G_i + P'_i G_{i-1} + P'_i P_{i-1} G_{i-2}$$

All carries  
are obtained by  
3 gate delays

$$+ \dots + P'_i P_{i-1} \dots P_0$$

### → Multiplication of unsigned numbers.



Unsigned Binary Multipl.

12/9/19

# Computer Organisation

William Stallings

Prof.  
Ram  
Sarkar

## Memory

- Volatile memory
- Non-volatile memory
- Magnetic surface memory
- Semi-conductor

Refer to sir's  
slide

## Memory - Hierarchy

- All about

### Inbound Memory



## Cache & Main Memory



## Cache Read Operation

### Typical Cache Organisation

### Elements of Cache Design

Logical cache  
and virtual  
cache

## Cache Address

Main Virtual Memory

## Logical & Physical Cache

## Disadvantage

Mapping th.

Set Associative

## Direct Mapping

## Replacement Algorithm

## 4) most common replacement algorithm

→ LRU

→ FIFO

→ LFU.

## Write Policy.

→ Write Through & Write Back

13/9/19

Computer OrganisationProf.  
Ram  
SarkarMemory Mapping1) Direct-mapping

line no.

 $i = \text{cache line no}$  $j = \text{block in main memory}$  $M = \text{no. of lines in cache}$ Cache memory

Jog line Block no. after

Block size

4 words

$$\text{No. of blocks in main memory} = \frac{64}{4} = 16$$

Physical address = 6 bits.

Words

- (0, 1, 2, 3)
- (52, 53, 54, 55)
- (40, 41, 42, 43)
- (60, 61, 62, 63)

2) Associative Mapping

Main Memory = 32 GB (35 bits)

Cache Memory = 32 KB (15 bits)

Jog = (35 - 15) = 20 bits

Propagation delay for comparator

= 10s

" (20 x 10) = 200 ns

" OR gate = 10s

Hit latency = 210 ns





29/9/19

## Computer Organisation

Foot.  
Ram  
Sarkar

### → Multi-Level Cache

### → Hit Ratio

→ Write Policy, Complicated design issues  
Optimal Ratio b/w L1 & L2.

### Unified cache vs Split cache

Read this from Sir's slides

- one dedicated to instruction
- One dedicated to data
- Both exist at same level, typically L2 cache
- Eliminates cache contention.

### Pipeline Architecture

When the fetch unit is idle, it can be used for another instruction.



### Split cache

## RAM

### → Memory Cell Operation

### → Semi-conductor memory

### Terminal

Select terminal, Control terminal, (terminal)

Both volatile,

### Dynamic RAM

↓  
Refer to the  
circuit diagram  
from Sir's slides.



SRAM v/s DRAM

No ~~refer~~ recharging is required.

Read the mechanism  
Used for cache memory

26/09/2019

## Computer Organisation

Proof.

RAM

Sarkar

### DRAM

→ we need refresh counter  
as capacitors are used

→ Typical 16 Mb DRAM (4M x 4)  
(Refer to sir's slides)

→ 4 bits can be read/written at a time.

→ Data In/Sense

$$2^{11} = 2048 \rightarrow \text{Row Decoder}$$

address lines

→ Buffer is required

Output is passed to the sense amplifier  
and presented to the data lines.

### 22 address lines

↓  
11 column  
address lines      ↓  
11 row  
address lines

### Inter-leaved Memory

Collection is called memory bank.

- Not good for localism
- Not sequential.

Hard Failure

Soft Error

## Error Correcting Code Function

### Advanced DRAM Organisation



### Cache DRAM

with working set



### Row-column burst mode

direct column address is modified

- external not too fast
- is decreased rate

27/9/19

## Computer Organisation

(Courtesy:- Titir Adhikary)

Prof.  
Ram  
Sarkar.

### Parity - Bit

→ only flip the bit in case of inconsistency.

M: 4. No. of comparisons :- 7

Data:- Inner component

Parity :- Outer component.



### Design of Error Correction Function

$\gamma$  = Syndrome word  $\Rightarrow$  Function  
(X-OR)

8 bits (M)

4 bits (K)

12 bits (M+K).

K bit syndrome word ( $2^{K-1}$ ).

$$(2^{K-1}) \geq (M+K).$$

Check bit:- Bit posn whose position members are powers of 2.

- If K has all zeros, there is no error
- If K has only 1, error has occurred in K, no correction is reqd.
- If more than one 1s in K, posn indicate error, bits need to be inverted,

(Please refer to Mita Ma'am's notes).

$$\begin{array}{ll} D_3 \text{ avoided} & D_5 \text{ avoided} \\ \downarrow & \downarrow \\ C_1 = D_1 \oplus D_2 \oplus D_4 \oplus D_5 \oplus D_7 & . \\ C_2 = D_1 \oplus D_3 \oplus D_4 \oplus D_6 \oplus D_7 & \\ & \begin{array}{c} \uparrow \\ D_2 \text{ avoided} \end{array} \quad \begin{array}{c} \uparrow \\ D_5 \text{ avoided} \end{array} \\ C_4 = D_2 \oplus D_3 \oplus D_4 \oplus D_8 & . \\ C_8 = D_5 \oplus D_6 \oplus D_7 \oplus D_8 & \end{array}$$

| <u>Bit posn</u>        | 12             | 11             | 10             | 9              | 8              | 7              | 6              | 5              | 4    | 3              | 2              | 1    |
|------------------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|------|----------------|----------------|------|
| <u>Posn no</u>         | 1100           | 1011           | 1010           | 1001           | 1000           | 0111           | 0110           | 0101           | 0100 | 0011           | 0010           | 0001 |
| <u>Data bit</u>        | D <sub>8</sub> | D <sub>7</sub> | D <sub>6</sub> | D <sub>5</sub> | D <sub>4</sub> | D <sub>3</sub> | D <sub>2</sub> | D <sub>1</sub> |      |                |                |      |
| <u>Check bit</u>       |                |                |                |                | C <sub>8</sub> |                |                | C <sub>4</sub> |      | C <sub>2</sub> | C <sub>1</sub> |      |
| <u>Word stored at</u>  | 0              | 0              | 1              | 1              | 0              | 1              | 0              | 0              | 1    | 1              | 1              | 1    |
| <u>Word fetched at</u> | 0              | 0              | 1              | 1              | 0              | 1              | 1              | 0              | 1    | 1              | 1              | 1    |

$$\begin{array}{r}
 \text{6th posn} \\
 \downarrow \\
 \text{6th bit} \\
 \text{0110} \\
 + 0111 \\
 \hline
 000
 \end{array}$$

(110) binary number = 120  
 $(110)_2 = (12)_10$

Result of addition 00000000000000000000000000000000

Part of result = 00000000000000000000000000000000  
 i.e. 00000000000000000000000000000000  
 i.e. 00000000000000000000000000000000

Result of addition = 00000000000000000000000000000000

(either 1's complement or 2's complement)

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

positive 23 bits = 00000000000000000000000000000000

negative 23 bits = 11111111111111111111111111111111

External MemoryMagnetic Disk

- Aluminium & Aluminium alloy.
- Magnetic film surface is uniform.
- Ability to support lower fly height.
- Better stiffness to withstand shock & damage.

Magnetic Read & Write Mechanisms

→ 2 heads.

Read & Write.

+ve pole, -ve pole

Write Mechanism :- produces a magnetic field. electricity flowing through a coil

Inductive Write / Magnetoresistive Read HeadDisk Data LayoutDisk Layout Methods Diagram

Constant angular velocity layout is not preferred.  
Multiple zones is preferred.

Winchester Disk FormatSeagate ST506Physical Characteristics of Disk SystemsCharacteristics

- Fixed Head Disk
- Movable Head disk
- Non-removable disk.

(2)

## Unit 10: Hard Disk

Page 3/03

### Characteristics

#### Tracks & Cylinders

#### Disk Classification

##### Winchester Heads

##### Typical Hard Disk Parameters

##### Timing of Disk I/O Transfer

#### Disk Performance Parameters

Non movable head

system < delay

Movable head system.

→ Access time

→ Transfer time

No dependency

### RAID

#### Redundant Array of Independent Disk

##### No hierarchy implied

##### RAID 0 (Non-Redundant)

##### RAID 1 (Mirrored)

##### RAID 2 (Redundancy through Hamming code)

:

##### RAID 6

#### Advantages & Disadvantages

Advantages

Fast access speed

Robust error detection

Cost effective

RAID.3 levels

- OS should view the entire thing as a single drive.
- Striping
- ~~extra~~ Concept of redundancy.

Raid Levels ( RAID 2 )

for Redundant via Hamming Code  $N+m$  drives are required.

$$m \propto \log N.$$

RAID 0 →  $N+2$ .

RAID 4, 5, 6 → Independent Access.

RAID 2, 3 → Parallel Access.

Parallel access → ~~Multiple User~~ Single User

Independent access → ~~Independent User~~ Multiple User.

RAID ComparisonRAID Level 0RAID Level 1

- No write penalty.

RAID Level 2

- Hamming code is used

RAID Level 3RAID Level 4

RAID Level 5 → Round Robin Scheme is used.

RAID Level 6 → Prevent single point failure.

→ ~~the~~ Most advantageous.

Morris - Mano

Pipelining → Improves throughput

- Makes the execution of programs fail.
  - Arrange the hardware so that more operation can be performed at the same time.
  - The no. of operations can be increased.

Parallelism is there.

bogged down 4-

## Traditional Pipeline Concept

## Laundry pg. .

- It does not help latency, rather it helps throughput of entire workload.

Refer  
to the  
slide.

再び→右  
ABP→日

## Use the Idea of Pipelining in a Computer

10M-107newsp

F·DEW

- ```

graph LR
    F((F)) --> D((D))
    D --> E((E))
    E --> W((W))
    W --> F

```

The diagram illustrates the sequential stages of a processor cycle:

  - Fetch**: The first stage, represented by a blue box labeled 'F'.
  - Decode**: The second stage, represented by a blue box labeled 'D'.
  - Execution**: The third stage, represented by a blue box labeled 'E'.
  - Write**: The fourth stage, represented by a blue box labeled 'W'.

The stages are connected by arrows pointing from left to right, indicating the flow of the process. The entire sequence loops back to the **Fetch** stage at the end.

affectionate

zurzeit sehr positiv

→ With:  
↳ booster will subtract → will remove all bad effects

## Role of Cache Memory

Jeff E. Shire

- Expected to complete in one complete clock cycle.
  - The clock cycle should be long enough to wait for the slowest pipeline stage to complete.
  - Pipeline becomes useless, when each instruction has to be extracted from main memory.

## Pipeline Performance

- Refer to sir's slides
- Any condition that cause a pipeline to stall is called a Hazard
- Data Hazard or occurs due to shared bus between two instructions
- Instruction (or control) Hazard occurs due to control hazard
- Structural Hazard

[Throughput is measured by the rate at which instruction execution is completed.]

### Data Hazards

e.g.

$$\begin{aligned} A &\leftarrow 3 + A \\ B &\leftarrow 4 \times A \end{aligned}$$

Sequential: MUL R2, R3, R4      Add R5, R4, R6

### Operand Forwarding

#### Handling Data Hazards

→ Let the compiler detect & handle the hazard.

### Side Effects

e.g. Autodecrement

→ Side effects happens due to side effects of subsequent instructions  
→ If forward pass of branch takes path  $A \rightarrow B$   
→ Side effects of update will occur before all instructions along path, unless demand update  
→ Example: NAME map function can't do

W3Q3

WSTF ← 7  
UBSF ← 9  
WITSF ← 3  
SISF ← 6

process who? to who?

(morris Mano)

Instruction QueueDelayed BranchBranchPrediction

Instruction history can be used.

(LT → Likely to be taken,  
INT → likely not to be taken).

Compiler should extra bit when pipelining is used.  
However, it takes more time.

✓ 2-state algorithm✓ 4-state algorithmSuperscalar operation

- Max throughput :- 1 inst./clock cycle.
- Processor capable of achieving an instruction execution throughput of more than one inst./cycle. → superscalar processor.

Superscalar

- Order of execution may change.  
(May cause inconsistency).

Out of order Execution

- Inconsistent state.
- Imprecise Exception
- Precise Exception
- Commitment step

Execution Completion

(Cache miss)

- Commitment unit → solves the problem of out of order execution and wait hazard.
- Reorder Buffer

Refinedwith now → Concept of a deadlock situation

Anticipate deadlock

Waiting for resource

Resource exhausted

Process 1 → Process 2 → Process 1 → Process 2 → Process 1 → Process 2 → ...  
 Hence neither is possible to undergo release of resources → both two won't ever be completed.

Resource deadlock

→ occurs when either of the following conditions are met:

Waited longer for resources

More transactions

higher priority

higher priority

and more resources