





## Floating - Point Representation



# The floating-point representation of a number has two parts:-

(i) Mantissa :-

Signed, fixed point number

(ii) Exponent :-

Designates the position of the decimal (or binary) point.

E.g.: + 6132.789 is represented in floating-point as :-

| Fraction   | Exponent |
|------------|----------|
| +0.6132789 | +04      |

The value of the exponent indicates that the actual position of the decimal point is four positions to right of the indicated decimal point in fraction.

E.g.: The binary number + 1001.11 is represented with an 8-bit fraction and 6-bit exponent as follows:-

| Fraction             | Exponent |
|----------------------|----------|
| 01001110<br>Sign bit | 000100   |

The fraction has a 0 in the leftmost position to denote positive.

The binary point of the fraction follows the sign bit but is not shown in the register.

### Normalized Floating Point Number :-

A floating-point number is said to be **normalized** if the most significant digit of the mantissa is nonzero.

E.g.:- The decimal number 350 is normalized but 000350 is not.

# Normalized numbers provide the maximum possible precision for the floating point number.

# A zero can not be normalized because it does not have a nonzero digit. It is usually represented in floating-point by all 0's in the mantissa and exponent.

### Micro-operation :-

The operations executed on data stored in registers are called micro-operations. A micro-operation is an elementary operation. E.g.: shift, count, clear and load.

### Register Transfer Language :-

The symbolic notation used to describe the microoperation transfers among registers is called a Register Transfer Language (RTL).

[www.ankurgupta.net](http://www.ankurgupta.net)

# The internal hardware organization of a digital computer is best defined by specifying:-

- (i) The set of registers it contains and their function.
- (ii) The sequence of micro-operations performed on the binary information stored in the registers.
- (iii) The control that initiates the sequence of micro-operations.

Registers :-

(i) Memory Address Register (MAR) :-  
Holds an address of the memory unit.

(ii) PC :-  
Program Counter

(iii) IR :-  
Instruction Register

(iv) Rs :-  
Processor Register.

→ Computer registers are designated by capital letters.

Register Transfer :-

$$R_2 \leftarrow R_1$$

Control Function :-

$$P : R_2 \leftarrow R_1$$

It represents the transfer operation be executed by the hardware only if  $P = 1$ .

→ A control function is a boolean variable that is equal to 1 or 0.

Note :- The clock is not included as a variable in the register transfer statements. It is assumed that all transfers occur during a clock edge transition.

Bus and Memory Transfer :-Common Bus :-

It consists of a set of common lines, one for each bit of a register, through which binary information is transferred one at a time.

Control signals determine which register is selected by the bus during each particular register transfer.

# # Bus System for 4 Registers :-

# # → In general, a bus system will multiplex  $k$ -registers of  $n$ -bits each to produce an  $n$ -line common bus.

# # → The number of multiplexers needed to construct the bus is equal to  $n$ , the number of bits in each register.

# # → The size of each multiplexer must be  $4 \times 1$ , since it multiplexes to data lines.

### Bus Transfer :-

$$\begin{aligned} \text{BUS} &\leftarrow R_2, \quad R_1 \leftarrow \text{BUS} \\ \Rightarrow \quad R_1 &\leftarrow R_2 \end{aligned}$$

### Memory Transfer :-

Read :  $DR \leftarrow M[AR]$

Write :  $M[AR] \leftarrow R_1$

where, DR  $\rightarrow$  Data Register  
 AR  $\rightarrow$  Address Register

### Micro-operations :-

The microoperations are classified into four categories :-

- (i) Register Transfer Microoperations
- (ii) Arithmetic Microoperations
- (iii) Logic Microoperations
- (iv) Shift Microoperations

### Arithmetic Microoperations :-

(i) Addition :-  $R_3 \leftarrow R_1 + R_2$

(ii) Subtraction :-  $R_3 \leftarrow R_1 - R_2$   
 $\Rightarrow R_3 \leftarrow R_1 + \bar{R}_2 + 1$

(iii) 1's complement :-  $R_2 \leftarrow \bar{R}_2$

(iv) 2's Complement :-  $R_2 \leftarrow \bar{R}_2 + 1$

(v) Increment :-  $R_1 \leftarrow R_1 + 1$

(vi) Decrement :-  $R_1 \leftarrow R_1 - 1$

(vii) Arithmetic Shift :- Explained later with shift microops

→ The increment and decrement microoperations are implemented with a binary up-down counter.

### Logic Microoperations :-

These operations consider each bit of the register separately and treat them as binary variables.

(i) OR :-  $R_1 \leftarrow R_2 \vee R_3$

(ii) AND :-  $R_1 \leftarrow R_2 \wedge R_3$

(iii) Ex-OR :-  $R_1 \leftarrow R_2 \oplus R_3$

(iv) Complement :-  $R_1 \leftarrow \bar{R}_1$

### The effect between PC and CCR

When the symbol '+' occurs in a micro-operation, it will denote an arithmetic plus.

When it occurs in a control function, it will denote an OR operation.

$$P + Q : R_1 \leftarrow R_2 + R_3, R_4 \leftarrow R_5 \vee R_6$$

OR                  Add                  OR

### List of Logic Microoperations :-

There are 16 different logic operations that can be performed with two binary variables.

| x | y | $F_0$ | $F_1$ | $F_2$ | $F_3$ | $F_4$ | $F_5$ | $F_6$ | $F_7$ | $F_8$ | $F_9$ | $F_{10}$ | $F_{11}$ | $F_{12}$ | $F_{13}$ | $F_{14}$ | $F_{15}$ |
|---|---|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|----------|----------|----------|----------|----------|----------|
| 0 | 0 | 0     | 0     | 0     | 0     | 0     | 0     | 1     | 1     | 1     | 1     | 1        | 1        | 1        | 1        | 1        | 1        |
| 0 | 1 | 0     | 0     | 0     | 1     | 1     | 1     | 1     | 0     | 0     | 0     | 0        | 1        | 1        | 1        | 1        | 1        |
| 1 | 0 | 0     | 0     | 1     | 1     | 0     | 0     | 1     | 1     | 0     | 0     | 1        | 1        | 1        | 1        | 1        | 1        |
| 1 | 1 | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0     | 1     | 0        | 1        | 0        | 1        | 0        | 1        |

8B

### Shift Microoperations :-

There are three types of shifts :-

- (i) Logical Shift
- (ii) Circular Shift
- (iii) Arithmetic Shift

#### # Shift-left operation :-

The serial input transfers

a bit into the rightmost position.

#### # Shift-right operation :-

The serial input transfers

a bit into the leftmost position.

#### (i) Logical Shift :-

A logical shift is one that transfers 0 through the serial input.

sl and shr symbols are used for logical left shift and right shift microoperations.

E.g.-  $R_1 \leftarrow \text{shr } R_1$   
 $R_2 \leftarrow \text{sl } R_2$

#### (ii) Circular Shift :-

Here the serial output of the shift register is connected to its serial input.

cil and cir symbols are used for circular shift left and shift right microoperations.

E.g.-  $R_1 \leftarrow \text{cir } R_1$   
 $R_2 \leftarrow \text{cil } R_2$

### #(iii) Arithmetic Shift:-

→ An arithmetic shift-left (ashl) multiplies a signed binary number by 2.

→ An arithmetic shift-right (ashr) divides the number by 2.

→ Arithmetic shifts must leave the sign bit unchanged because the sign of the number remains the same when it is multiplied or divided by 2.

→ The ashr leaves the sign bit unchanged and shifts the number (including the sign-bit) to the right.

→ The ashl inserts a 0 from right and shifts all other bits to the left.

If the multiplication by 2 causes an overflow, the sign-bit in the register changes from 0 to 1 or 1 to 0.



Checking for overflow:-

$$V_1 = R_{n-1} \oplus R_{n-2}$$

If  $V_1 = 0$  :-

There is no overflow.

If  $V_1 = 1$  :-

There is an overflow.

### Arithmetic Logic Shift Unit (ALU):-

To perform a microoperation in ALU, the contents of specified registers are placed in the inputs of the common ALU. The ALU performs an operation and the result of the operation is then transferred to a destination register.

#The ALU is a combinational circuit so that the entire operation can be performed during one clock pulse period.

## Problems



Q:- If ( $P=1$ ) then ( $R_1 \leftarrow R_2$ )  
else if ( $O=1$ ) then ( $R_1 \leftarrow R_3$ )

Represent above statements using control functions:-

Sol :-

P :  $R_1 \leftarrow R_2$

PQ :  $R_1 \leftarrow R_3$

Q:- A digital computer has a common bus system for 16 registers of 32-bits each. The bus is constructed with multiplexers.

(a) Each multiplexer will have 4 selection lines to select one of 16 registers.

(b) The size of multiplexers required is  $16 \times 1$ .

(c) 32 multiplexers will be required, one for each bit of the registers.

[www.ankuogupta.net](http://www.ankuogupta.net)

Q:-  $A = 10011100$ . Find ashf and ashl.

Sol :-

ashf  $A \Rightarrow 11001110$

ashl  $A \Rightarrow 00111000 \rightarrow$  Overflow, because a negative number changed to positive.

### Instruction Formats :-

A computer will usually have a variety of Instruction Code Formats. It is the function of the control unit within the CPU to interpret each instruction code and provide the necessary control functions needed to process the instruction.

The bits of the instruction code are usually divided into three groups:-

- (i) An operation code field that specifies the operation to be performed.
- (ii) An address field that designates a memory address or a processor register.
- (iii) A mode field that specifies the way the operand or the effective address is determined.

|                |               |      |
|----------------|---------------|------|
| Operation Code | Address Field | Mode |
|----------------|---------------|------|

# A register address is a binary number of  $k$ -bits that define one of  $2^k$  registers in the CPU.

# The instruction code is also called as control word.

# An operation code is sometimes called a macro-operation because it specifies a set of micro-operations.

### Types of CPU Organizations:-

The number of address fields in the instruction format of a computer depends on the internal organization of its registers.

Most of the computers fall into one of the three types of CPU organizations:-

- (i) Single Accumulator organization,
- (ii) General Register Organization,
- (iii) Stack Organization

#### (i) Single Accumulator Organization:-

All operations are performed with an implied accumulator register.

The instruction format in this type of computer uses one address field.

E.g.: - ADD X

$$\Rightarrow AC \leftarrow AC + M[X]$$

AC → Accumulator Register.

#### (ii) General Register Organization:-

The instruction format in this type of computer uses either three or two address fields.

E.g.: (i) ADD R1, R2, R3  $\Rightarrow R_1 \leftarrow R_2 + R_3$

(ii) ADD R1, R2  $\Rightarrow R_1 \leftarrow R_1 + R_2$

(iii) ADD R1, X  $\Rightarrow R_1 \leftarrow R_1 + M[X]$

(iv) MOV R1, R2  $\Rightarrow R_1 \leftarrow R_2$

#### (iii) Stack Organization:-

Computers with stack organization have PUSH and POP instructions which require one address field.

E.g.: - PUSH X

Operation-type instructions do not need an address field. (Zero address instruction).

E.g.: - ADD

# Single Accumulator Organization uses One-Address instructions.

# General Register Organization uses three and two-address instructions.

# Stack Organization uses zero and one-address instructions.

E.g.: In a general register's CPU organization, there are eight general-purpose registers and ALU can perform 32-different operations.

The number of selection lines of each multiplexer for selecting the operand is :- 3 ( $2^3 = 8$ )

The number of bits in operation code = 5 ( $2^5 = 32$ )

The length of the control word =  $5+3+3+3 = 14$   
(for 3-address instruction)

## CISC :-

A computer with a large number of instructions is classified as a complex instruction set computer.

The essential goal of a CISC architecture is to attempt to provide a single machine instruction for each statement that is written in a high-level language.

The major characteristics of CISC architecture are:-

(1) A large number of instructions - typically from 100 to 250 instructions.

(2) Some instructions that perform specialized tasks and are used infrequently.

(3) A large variety of addressing modes - typically from 5 to 20 different modes.

(4) Variable-length instruction formats.

(5) Instructions that manipulate operands in memory.

## RISC :-

The Reduced Instruction Set Computer is a computer containing fewer instruction with simple constructs that can be executed much faster within the CPU without having to use memory as often.

The major characteristics of a RISC processor are :-

(1) Relatively few instructions.

(2) Relatively few addressing modes.

(3) Memory access limited to load and store instructions.

(4) All operations done within the registers of the CPU.

(5) Fixed-length, easily decoded instruction format.

(6) Single-cycle instruction execution.

(7) Hardwired rather than microprogrammed control.

(8) Delayed Load → When there is a data dependency conflict in instruction pipeline.

(9) Delayed Branch → When there is a branch instruction in instruction pipeline.

### Addressing Modes :-

The way the operands are chosen during program execution is dependent on the addressing mode of the instruction.

# The addressing mode may reduce the number of bits in the addressing field of the instruction.

Different types of addressing modes are given below:-

#### (1) Implied Mode :-

In this mode the operands are specified implicitly in the definition of the instruction.

→ All register reference instructions that use only an accumulator are implied-mode instructions.

→ Zero-address instructions in a stack-organized computer are implied-mode instructions.

#### (2) Immediate Mode :-

In this mode the operand is specified in the instruction itself.

In other words, an immediate-mode instruction has an operand field rather than an address field.

E.g.: - LD #NBR  $\Rightarrow$  AC  $\leftarrow$  NBR

#### (3) Register Mode :-

In this mode the operands are in registers that reside within the CPU. The particular register is selected from a register field in the instruction.

A k-bit field can specify any one of  $2^k$ -registers.

E.g.: - E.g.: - LD R1  $\Rightarrow$  AC  $\leftarrow$  R1

#### (4) Register Indirect Mode :-

In this mode, the instruction specifies a register that resides within the CPU whose contents give the address of the operand in memory.

E.g.: - LD (R1)  $\Rightarrow$  AC  $\leftarrow$  M[R1]

#### (5) Autoincrement or Autodecrement Mode :-

This is similar to the register indirect mode except that the register is incremented or decremented after (or before) its value is used to access the memory.

E.g.: - LD (R1) +  
 $\Rightarrow$  AC  $\leftarrow$  M[R1], R1  $\leftarrow$  R1 + 1.

#### # Effective Address :-

The effective address is defined to be the memory address obtained from the computation dictated by the given addressing mode.

(6) Direct Address Mode :-

In this mode the effective address is equal to the address part of the instruction.

$$\text{E.g.:- } \text{LD ADR} \Rightarrow AC \leftarrow M[ADR]$$

(7) Indirect Address Mode :-

In this mode the address field of the instruction gives the address where the effective address is stored in memory.

$$\begin{aligned} \text{E.g.:- } \text{LD @ADR} &\Rightarrow AC \leftarrow M[M[ADR]] \\ &\Rightarrow AC \leftarrow M[M[ADR]] \end{aligned}$$

(8) Relative Address Mode :-

In this mode, the content of the program counter (PC) is added to the address part of the instruction in order to obtain the effective address.

$$\text{E.g.:- } \text{LD $ADR} \Rightarrow AC \leftarrow M[PC + ADR]$$

# Relative addressing is often used with branch-type instructions.

(9) Indexed Addressing Mode :-

In this mode the content of an index register is added to the address part of the instruction to obtain the effective address.

The address field of the instruction defines the beginning address of a data array in memory.

$$\text{E.g.:- } \text{LD ADR}(X) \Rightarrow AC \leftarrow M[ADR + X]$$

(10) Base Register Addressing Mode :-

In this mode the content of a base register is added to the address part of the instruction to obtain the effective address.

E.g.:-

# The base register addressing mode is used for relocation of programs in memory.

## Central Processing Unit

# CPU is made up of three major parts:-

- (i) Register Set
- (ii) Arithmetic-logic unit (ALU)
- (iii) Control Unit (CU)



### Some Fundamental Concepts :-

[www.ankurgupta.net](http://www.ankurgupta.net)

→ The instructions constituting a program to be executed are loaded in sequential locations in the main memory.

→ Instructions are fetched from successive memory locations until a branch or a jump instruction is executed.

→ A dedicated CPU register, referred to as the program counter (PC) contains the address of the memory location containing the next instruction.

→ After fetching an instruction, the contents of the PC are updated to point to the next instruction in sequence.

## Steps to execute an instruction :-

(i) Fetch the contents of the memory location pointed to by PC (program counter) and load them into IR (instruction register).

$$IR \leftarrow M[PC]$$

(ii) Increment the contents of PC by 1.

$$PC \leftarrow PC + 1$$

(iii) Carry out the actions specified by the instructions in the IR.

# Step 1 and 2 are usually referred to as the fetch phase.

# Step 3 constitutes the execution phase.

→ In cases where an instruction occupies more than one word, steps 1 and 2 must be repeated as many times as necessary to fetch the complete instruction.

→ The width of the data bus between the CPU and the main memory determines the number of bytes that can be transferred in one access. This width is likely to be 2, 4 or 8 bytes so the PC must be incremented by 2, 4 or 8 after each fetch step.

## Single-bus organization of the data paths inside the CPU :-



# Here the single common bus is internal to the CPU, and should not be confused with the external bus.

# The ALU and the registers are used for storing and manipulating data and referred to as datapath.

## Execution of a complete instruction :-

ADD (R<sub>3</sub>), R<sub>1</sub>  $\rightarrow$  R,  $\leftarrow M[R_3] + R_1$

Steps required to execute the above instruction are:-

| Step  | Action                                                                                             |
|-------|----------------------------------------------------------------------------------------------------|
| Fetch | 1 PC <sub>out</sub> , MAR <sub>in</sub> , Read, Clear Y, Set Carry-in to ALU, Add, Z <sub>in</sub> |
|       | 2 Z <sub>out</sub> , PC <sub>in</sub> , WMFC                                                       |
|       | 3 MDR <sub>out</sub> , IR <sub>in</sub>                                                            |
|       | 4 R <sub>3out</sub> , MAR <sub>in</sub> , Read                                                     |
|       | 5 R <sub>3out</sub> , Y <sub>in</sub> , WMFC                                                       |
|       | 6 MDR <sub>out</sub> , Add, Z <sub>in</sub>                                                        |
|       | 7 Z <sub>out</sub> , R <sub>1in</sub> , End.                                                       |

WMFC (Wait for)

WMFC  $\rightarrow$  Wait for Memory Function Completed.

# For above instruction execution, 7 non-overlapping time-slots are required. Each time slot must be at least long enough for the functions specified in the corresponding to be completed.

# Since a memory-access is slow, while a memory access is taking place, the CPU can perform other internal operations.

## CPU Control Design :-

There are two major types of control organization :-

- (i) Hardwired Control
- (ii) Microprogrammed Control.

### (i) Hardwired Control Organization :-

In the hardwired organization, the control logic is implemented with gates, flip-flops, decoders, and other digital circuits.

It has the advantage that it can be optimized to produce a fast mode of operation.

It requires changes in the wiring among the various components if the design has to be modified or changed.

### (ii) Micro-programmed Control Organization :-

In the micro-programmed organization, the control information is stored in a control memory. The control memory is programmed to initiate the required sequence of micro-operations.

In the micro-programmed control, any required changes or modifications can be done by updating the microprogram in control memory.

### Micro-programmed Control :-

In micro-programmed control, the control signals are generated by a program similar to machine language programs.

#### Control Word :-

A control word (CW) is a word whose individual bits represent the various control signals.

# A sequence of CWs corresponding to the control sequence of a machine instruction constitutes the micro-routine.

# The individual CWs in this micro-routine are referred to as micro-instructions.

# The microroutines for all instructions in the instruction set of a computer are stored in a special memory called the control store.

# The control unit can generate the control signals for any instruction by sequentially reading the CWs of the corresponding microroutine from the control store.

### Basic organization of a microprogrammed control unit :-



# The microprogram counter (μPC) is used to read the control words sequentially from the control store.

# Every time a new instruction is loaded into the IR, the output of the block labeled "starting address generator" is loaded into the μPC. The μPC is then automatically incremented by the clock, causing successive microinstructions to be read from the control store.

The μPC does not increment in the following situations :-

- When an End microinstruction is encountered.
- When a new instruction is loaded into IR.
- When a branch micro-instruction is encountered and the branch condition is satisfied.

Microinstructions :-

A straightforward way to structure microinstructions is to assign one bit position to each control signal required in the CPU.

| Micro-instruction | $E$ | $PC_{out}$ | $\Sigma$ | $MAR_{in}$ | $Read$ | $MDR_{in}$ | $IR$ | $E$ | $X$ | $Y$ | $Z$ | $Clear$ | $g_1$ | $g_2$ | $g_3$ | $g_4$ | $g_5$ | $g_6$ | $g_7$ | $g_8$ | $g_9$ | $g_{10}$ | $g_{11}$ | $g_{12}$ | $g_{13}$ | $g_{14}$ | $g_{15}$ | $g_{16}$ | $g_{17}$ | $g_{18}$ | $g_{19}$ | $g_{20}$ | $g_{21}$ | $g_{22}$ | $g_{23}$ | $g_{24}$ | $g_{25}$ | $g_{26}$ | $g_{27}$ | $g_{28}$ | $g_{29}$ | $g_{30}$ | $g_{31}$ | $g_{32}$ | $g_{33}$ | $g_{34}$ | $g_{35}$ | $g_{36}$ | $g_{37}$ | $g_{38}$ | $g_{39}$ | $g_{40}$ | $g_{41}$ | $g_{42}$ | $g_{43}$ | $g_{44}$ | $g_{45}$ | $g_{46}$ | $g_{47}$ | $g_{48}$ | $g_{49}$ | $g_{50}$ | $g_{51}$ | $g_{52}$ | $g_{53}$ | $g_{54}$ | $g_{55}$ | $g_{56}$ | $g_{57}$ | $g_{58}$ | $g_{59}$ | $g_{60}$ | $g_{61}$ | $g_{62}$ | $g_{63}$ | $g_{64}$ | $g_{65}$ | $g_{66}$ | $g_{67}$ | $g_{68}$ | $g_{69}$ | $g_{70}$ | $g_{71}$ | $g_{72}$ | $g_{73}$ | $g_{74}$ | $g_{75}$ | $g_{76}$ | $g_{77}$ | $g_{78}$ | $g_{79}$ | $g_{80}$ | $g_{81}$ | $g_{82}$ | $g_{83}$ | $g_{84}$ | $g_{85}$ | $g_{86}$ | $g_{87}$ | $g_{88}$ | $g_{89}$ | $g_{90}$ | $g_{91}$ | $g_{92}$ | $g_{93}$ | $g_{94}$ | $g_{95}$ | $g_{96}$ | $g_{97}$ | $g_{98}$ | $g_{99}$ | $g_{100}$ | $g_{101}$ | $g_{102}$ | $g_{103}$ | $g_{104}$ | $g_{105}$ | $g_{106}$ | $g_{107}$ | $g_{108}$ | $g_{109}$ | $g_{110}$ | $g_{111}$ | $g_{112}$ | $g_{113}$ | $g_{114}$ | $g_{115}$ | $g_{116}$ | $g_{117}$ | $g_{118}$ | $g_{119}$ | $g_{120}$ | $g_{121}$ | $g_{122}$ | $g_{123}$ | $g_{124}$ | $g_{125}$ | $g_{126}$ | $g_{127}$ | $g_{128}$ | $g_{129}$ | $g_{130}$ | $g_{131}$ | $g_{132}$ | $g_{133}$ | $g_{134}$ | $g_{135}$ | $g_{136}$ | $g_{137}$ | $g_{138}$ | $g_{139}$ | $g_{140}$ | $g_{141}$ | $g_{142}$ | $g_{143}$ | $g_{144}$ | $g_{145}$ | $g_{146}$ | $g_{147}$ | $g_{148}$ | $g_{149}$ | $g_{150}$ | $g_{151}$ | $g_{152}$ | $g_{153}$ | $g_{154}$ | $g_{155}$ | $g_{156}$ | $g_{157}$ | $g_{158}$ | $g_{159}$ | $g_{160}$ | $g_{161}$ | $g_{162}$ | $g_{163}$ | $g_{164}$ | $g_{165}$ | $g_{166}$ | $g_{167}$ | $g_{168}$ | $g_{169}$ | $g_{170}$ | $g_{171}$ | $g_{172}$ | $g_{173}$ | $g_{174}$ | $g_{175}$ | $g_{176}$ | $g_{177}$ | $g_{178}$ | $g_{179}$ | $g_{180}$ | $g_{181}$ | $g_{182}$ | $g_{183}$ | $g_{184}$ | $g_{185}$ | $g_{186}$ | $g_{187}$ | $g_{188}$ | $g_{189}$ | $g_{190}$ | $g_{191}$ | $g_{192}$ | $g_{193}$ | $g_{194}$ | $g_{195}$ | $g_{196}$ | $g_{197}$ | $g_{198}$ | $g_{199}$ | $g_{200}$ | $g_{201}$ | $g_{202}$ | $g_{203}$ | $g_{204}$ | $g_{205}$ | $g_{206}$ | $g_{207}$ | $g_{208}$ | $g_{209}$ | $g_{210}$ | $g_{211}$ | $g_{212}$ | $g_{213}$ | $g_{214}$ | $g_{215}$ | $g_{216}$ | $g_{217}$ | $g_{218}$ | $g_{219}$ | $g_{220}$ | $g_{221}$ | $g_{222}$ | $g_{223}$ | $g_{224}$ | $g_{225}$ | $g_{226}$ | $g_{227}$ | $g_{228}$ | $g_{229}$ | $g_{230}$ | $g_{231}$ | $g_{232}$ | $g_{233}$ | $g_{234}$ | $g_{235}$ | $g_{236}$ | $g_{237}$ | $g_{238}$ | $g_{239}$ | $g_{240}$ | $g_{241}$ | $g_{242}$ | $g_{243}$ | $g_{244}$ | $g_{245}$ | $g_{246}$ | $g_{247}$ | $g_{248}$ | $g_{249}$ | $g_{250}$ | $g_{251}$ | $g_{252}$ | $g_{253}$ | $g_{254}$ | $g_{255}$ | $g_{256}$ | $g_{257}$ | $g_{258}$ | $g_{259}$ | $g_{260}$ | $g_{261}$ | $g_{262}$ | $g_{263}$ | $g_{264}$ | $g_{265}$ | $g_{266}$ | $g_{267}$ | $g_{268}$ | $g_{269}$ | $g_{270}$ | $g_{271}$ | $g_{272}$ | $g_{273}$ | $g_{274}$ | $g_{275}$ | $g_{276}$ | $g_{277}$ | $g_{278}$ | $g_{279}$ | $g_{280}$ | $g_{281}$ | $g_{282}$ | $g_{283}$ | $g_{284}$ | $g_{285}$ | $g_{286}$ | $g_{287}$ | $g_{288}$ | $g_{289}$ | $g_{290}$ | $g_{291}$ | $g_{292}$ | $g_{293}$ | $g_{294}$ | $g_{295}$ | $g_{296}$ | $g_{297}$ | $g_{298}$ | $g_{299}$ | $g_{300}$ | $g_{301}$ | $g_{302}$ | $g_{303}$ | $g_{304}$ | $g_{305}$ | $g_{306}$ | $g_{307}$ | $g_{308}$ | $g_{309}$ | $g_{310}$ | $g_{311}$ | $g_{312}$ | $g_{313}$ | $g_{314}$ | $g_{315}$ | $g_{316}$ | $g_{317}$ | $g_{318}$ | $g_{319}$ | $g_{320}$ | $g_{321}$ | $g_{322}$ | $g_{323}$ | $g_{324}$ | $g_{325}$ | $g_{326}$ | $g_{327}$ | $g_{328}$ | $g_{329}$ | $g_{330}$ | $g_{331}$ | $g_{332}$ | $g_{333}$ | $g_{334}$ | $g_{335}$ | $g_{336}$ | $g_{337}$ | $g_{338}$ | $g_{339}$ | $g_{340}$ | $g_{341}$ | $g_{342}$ | $g_{343}$ | $g_{344}$ | $g_{345}$ | $g_{346}$ | $g_{347}$ | $g_{348}$ | $g_{349}$ | $g_{350}$ | $g_{351}$ | $g_{352}$ | $g_{353}$ | $g_{354}$ | $g_{355}$ | $g_{356}$ | $g_{357}$ | $g_{358}$ | $g_{359}$ | $g_{360}$ | $g_{361}$ | $g_{362}$ | $g_{363}$ | $g_{364}$ | $g_{365}$ | $g_{366}$ | $g_{367}$ | $g_{368}$ | $g_{369}$ | $g_{370}$ | $g_{371}$ | $g_{372}$ | $g_{373}$ | $g_{374}$ | $g_{375}$ | $g_{376}$ | $g_{377}$ | $g_{378}$ | $g_{379}$ | $g_{380}$ | $g_{381}$ | $g_{382}$ | $g_{383}$ | $g_{384}$ | $g_{385}$ | $g_{386}$ | $g_{387}$ | $g_{388}$ | $g_{389}$ | $g_{390}$ | $g_{391}$ | $g_{392}$ | $g_{393}$ | $g_{394}$ | $g_{395}$ | $g_{396}$ | $g_{397}$ | $g_{398}$ | $g_{399}$ | $g_{400}$ | $g_{401}$ | $g_{402}$ | $g_{403}$ | $g_{404}$ | $g_{405}$ | $g_{406}$ | $g_{407}$ | $g_{408}$ | $g_{409}$ | $g_{410}$ | $g_{411}$ | $g_{412}$ | $g_{413}$ | $g_{414}$ | $g_{415}$ | $g_{416}$ | $g_{417}$ | $g_{418}$ | $g_{419}$ | $g_{420}$ | $g_{421}$ | $g_{422}$ | $g_{423}$ | $g_{424}$ | $g_{425}$ | $g_{426}$ | $g_{427}$ | $g_{428}$ | $g_{429}$ | $g_{430}$ | $g_{431}$ | $g_{432}$ | $g_{433}$ | $g_{434}$ | $g_{435}$ | $g_{436}$ | $g_{437}$ | $g_{438}$ | $g_{439}$ | $g_{440}$ | $g_{441}$ | $g_{442}$ | $g_{443}$ | $g_{444}$ | $g_{445}$ | $g_{446}$ | $g_{447}$ | $g_{448}$ | $g_{449}$ | $g_{450}$ | $g_{451}$ | $g_{452}$ | $g_{453}$ | $g_{454}$ | $g_{455}$ | $g_{456}$ | $g_{457}$ | $g_{458}$ | $g_{459}$ | $g_{460}$ | $g_{461}$ | $g_{462}$ | $g_{463}$ | $g_{464}$ | $g_{465}$ | $g_{466}$ | $g_{467}$ | $g_{468}$ | $g_{469}$ | $g_{470}$ | $g_{471}$ | $g_{472}$ | $g_{473}$ | $g_{474}$ | $g_{475}$ | $g_{476}$ | $g_{477}$ | $g_{478}$ | $g_{479}$ | $g_{480}$ | $g_{481}$ | $g_{482}$ | $g_{483}$ | $g_{484}$ | $g_{485}$ | $g_{486}$ | $g_{487}$ | $g_{488}$ | $g_{489}$ | $g_{490}$ | $g_{491}$ | $g_{492}$ | $g_{493}$ | $g_{494}$ | $g_{495}$ | $g_{496}$ | $g_{497}$ | $g_{498}$ | $g_{499}$ | $g_{500}$ | $g_{501}$ | $g_{502}$ | $g_{503}$ | $g_{504}$ | $g_{505}$ | $g_{506}$ | $g_{507}$ | $g_{508}$ | $g_{509}$ | $g_{510}$ | $g_{511}$ | $g_{512}$ | $g_{513}$ | $g_{514}$ | $g_{515}$ | $g_{516}$ | $g_{517}$ | $g_{518}$ | $g_{519}$ | $g_{520}$ | $g_{521}$ | $g_{522}$ | $g_{523}$ | $g_{524}$ | $g_{525}$ | $g_{526}$ | $g_{527}$ | $g_{528}$ | $g_{529}$ | $g_{530}$ | $g_{531}$ | $g_{532}$ | $g_{533}$ | $g_{534}$ | $g_{535}$ | $g_{536}$ | $g_{537}$ | $g_{538}$ | $g_{539}$ | $g_{540}$ | $g_{541}$ | $g_{542}$ | $g_{543}$ | $g_{544}$ | $g_{545}$ | $g_{546}$ | $g_{547}$ | $g_{548}$ | $g_{549}$ | $g_{550}$ | $g_{551}$ | $g_{552}$ | $g_{553}$ | $g_{554}$ | $g_{555}$ | $g_{556}$ | $g_{557}$ | $g_{558}$ | $g_{559}$ | $g_{560}$ | $g_{561}$ | $g_{562}$ | $g_{563}$ | $g_{564}$ | $g_{565}$ | $g_{566}$ | $g_{567}$ | $g_{568}$ | $g_{569}$ | $g_{570}$ | $g_{571}$ | $g_{572}$ | $g_{573}$ | $g_{574}$ | $g_{575}$ | $g_{576}$ | $g_{577}$ | $g_{578}$ | $g_{579}$ | $g_{580}$ | $g_{581}$ | $g_{582}$ | $g_{583}$ | $g_{584}$ | $g_{585}$ | $g_{586}$ | $g_{587}$ | $g_{588}$ | $g_{589}$ | $g_{590}$ | $g_{591}$ | $g_{592}$ | $g_{593}$ | $g_{594}$ | $g_{595}$ | $g_{596}$ | $g_{597}$ | $g_{598}$ | $g_{599}$ | $g_{600}$ | $g_{601}$ | $g_{602}$ | $g_{603}$ | $g_{604}$ | $g_{605}$ | $g_{606}$ | $g_{607}$ | $g_{608}$ | $g_{609}$ | $g_{610}$ | $g_{611}$ | $g_{612}$ | $g_{613}$ | $g_{614}$ | $g_{615}$ | $g_{616}$ | $g_{617}$ | $g_{618}$ | $g_{619}$ | $g_{620}$ | $g_{621}$ | $g_{622}$ | $g_{623}$ | $g_{624}$ | $g_{625}$ | $g_{626}$ | $g_{627}$ | $g_{628}$ | $g_{629}$ | $g_{630}$ | $g_{631}$ | $g_{632}$ | $g_{633}$ | $g_{634}$ | $g_{635}$ | $g_{636}$ | $g_{637}$ | $g_{638}$ | $g_{639}$ | $g_{640}$ | $g_{641}$ | $g_{642}$ | $g_{643}$ | $g_{644}$ | $g_{645}$ | $g_{646}$ | $g_{647}$ | $g_{648}$ | $g_{649}$ | $g_{650}$ | $g_{651}$ | $g_{652}$ | $g_{653}$ | $g_{654}$ | $g_{655}$ | $g_{656}$ | $g_{657}$ | $g_{658}$ | $g_{659}$ | $g_{660}$ | $g_{661}$ | $g_{662}$ | $g_{663}$ | $g_{664}$ | $g_{665}$ | $g_{666}$ | $g_{667}$ | $g_{668}$ | $g_{669}$ | $g_{670}$ | $g_{671}$ | $g_{672}$ | $g_{673}$ | $g_{674}$ | $g_{675}$ | $g_{676}$ | $g_{677}$ | $g_{678}$ | $g_{679}$ | $g_{680}$ | $g_{681}$ | $g_{682}$ | $g_{683}$ | $g_{684}$ | $g_{685}$ | $g_{686}$ | $g_{687}$ | $g_{688}$ | $g_{689}$ | $g_{690}$ | $g_{691}$ | $g_{692}$ | $g_{693}$ | $g_{694}$ | $g_{695}$ | $g_{696}$ | $g_{697}$ | $g_{698}$ | $g_{699}$ | $g_{700}$ | $g_{701}$ | $g_{702}$ | $g_{703}$ | $g_{704}$ | $g_{705}$ | $g_{706}$ | $g_{707}$ | $g_{708}$ | $g_{709}$ | $g_{710}$ | $g_{711}$ | $g_{712}$ | $g_{713}$ | $g_{714}$ | $g_{715}$ | $g_{716}$ | $g_{717}$ | $g_{718}$ | $g_{719}$ | $g_{720}$ | $g_{721}$ | $g_{722}$ | $g_{723}$ | $g_{724}$ | $g_{725}$ | $g_{726}$ | $g_{727}$ | $g_{728}$ | $g_{729}$ | $g_{730}$ | $g_{731}$ | $g_{732}$ | $g_{733}$ | $g_{734}$ | $g_{735}$ | $g_{736}$ | $g_{737}$ | $g_{738}$ | $g_{739}$ | $g_{740}$ | $g_{741}$ | $g_{742}$ | $g_{743}$ | $g_{744}$ | $g_{745}$ | $g_{746}$ | $g_{747}$ | $g_{748}$ | $g_{749}$ | $g_{750}$ | $g_{751}$ | $g_{752}$ | $g_{753}$ | $g_{754}$ | $g_{755}$ | $g_{756}$ | $g_{757}$ | $g_{758}$ | $g_{759}$ | $g_{760}$ | $g_{761}$ | $g_{762}$ | $g_{763}$ | $g_{764}$ | $g_{765}$ | $g_{766}$ | $g_{767}$ | $g_{768}$ | $g_{769}$ | $g_{770}$ | $g_{771}$ | $g_{772}$ | $g_{773}$ | $g_{774}$ | $g_{775}$ </ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |

### Prefetching Microinstructions :-

→ The major drawback of microprogrammed control is that it is slow.

→ Fetching a microinstruction from the control store takes considerably longer than generating control signals using hardwired circuits.

→ The performance of the microprogrammed control can be improved by prefetching the next micro-instruction while the current one is being executed.

→ Prefetching microinstructions presents some organizational difficulties because sometimes the status flags and the results of the currently executed microinstructions are needed to determine the address of the next microinstruction.

### Instruction Pipeline :-

An instruction pipeline reads consecutive instructions from memory while previous instructions are being executed in other segments.

This causes the instruction fetch and execute phases to overlap and perform simultaneous operations.

If an instruction causes a branch out of sequence, then the pipeline must be emptied and all the instructions that have been read from memory after the branch instruction must be discarded.

### Instruction Cycle :-

Generally the computer needs to process each instruction with the following sequence of steps:-

- (1) Fetch the instruction from memory.
- (2) Decode the instruction.
- (3) Calculate the effective address.
- (4) Fetch the operands from memory.
- (5) Execute the instruction.
- (6) Store the results in the proper place.

### Four-Segment Instruction Pipeline :-

Suppose the instruction cycle is divided into four segments as given below :-

(i) FI is the segment that fetches an instruction

(ii) DA is the segment that decodes the instruction and calculates the effective address

(iii) FO is the segment that fetches the operand.

(iv) EX is the segment that executes the instruction.

# It is assumed that the processor has separate instruction and data memories so that the operation in FI and FO can proceed at the same time.

| Step →         | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 |
|----------------|----|----|----|----|----|----|----|----|----|----|----|----|----|
| Instruction: 1 | FI | DA | FO | EX |    |    |    |    |    |    |    |    |    |
| 2              | FI | DA | FO | EX |    |    |    |    |    |    |    |    |    |
| (Branch) 3     |    | FI | DA | FO | EX |    |    |    |    |    |    |    |    |
| 4              |    |    | FI | -  | -  | FI | DA | FO | EX |    |    |    |    |
| 5              |    |    |    | -  | -  | -  | FI | DA | FO | EX |    |    |    |
| 6              |    |    |    |    |    |    | FI | DA | FO | EX |    |    |    |
| 7              |    |    |    |    |    |    |    | FI | DA | FO | EX |    |    |

(Timing of instruction pipeline)

# Assume that instruction 3 is a branch instruction. As soon as this instruction is decoded in segment DA in step 4, the transfer from FI to DA of the other instructions is halted until the branch instruction is executed in step-6.

# If the branch is taken, a new instruction is fetched in step 7.

# If the branch is not taken, the instruction fetched previously in step-4 can be used.

### Pipeline Conflicts :-

In general, there are three major difficulties that cause the instruction pipeline to deviate from its normal operation :-

(i) Resource Conflicts :- Caused by access to memory by two segments at the same time.

(ii) Data dependency Conflicts :- arise when an instruction depends on the result of a previous instruction, but this result is not yet available.

(iii) Branch difficulties arise from branch and other instructions that change the value of PC.

## Hazards:-

There are situations, called hazards, that prevent the next instruction in the instruction stream from being executing during its designated clock cycle. Hazards reduce the performance from the ideal speedup gained by the pipeline.

There are three classes of hazards:-

### (i) Structural Hazards :-

They arise from resource conflicts when the hardware can not support all possible combinations of instructions in simultaneous overlapped execution.

### (ii) Data Hazards :-

They arise when an instruction depends on the result of a previous instruction in a way that is exposed by the overlapping of instructions in the pipeline.

### (iii) Control Hazards :-

They arise from the pipelining of branches and other instructions that change the PC.

Hazards in pipeline can make it necessary to stall the pipeline.

# It is possible to overcome the problem of structural hazards, but the designers allow structural hazards to reduce hardware cost.

## Resolving Data Dependency Conflicts or Data Hazards :-

### (i) Hardware Interlocks :-

An interlock is a circuit that detects instructions whose source operands are destinations of instructions farther up in pipeline. After detection of this situation, we delay the instruction by enough clock cycles to resolve the conflict.

### (ii) Operand Forwarding :-

Avoid the conflict by routing the data through special paths between pipeline segments.

### (iii) Delayed Load :-

Delay the loading of the conflicting data by inserting no-operation instructions (in compiler).

## Handling of Branch Instructions or Control Hazard :-

### (i) Prefetch Target Instruction :-

Prefetch the target instruction in addition to the instruction following the branch. Both are saved until the branch is executed.

### (ii) Branch Prediction :-

Some additional logic is used to guess the outcome of a conditional branch instruction before its executed. The pipeline then begins prefetching the instruction stream from the predicted path.

### (iii) Delayed Branch :-

→ Used in most RISC processors.

→ In this procedure, the compiler detects the branch instructions and rearranges the machine language code sequence by inserting useful instructions that keep the pipeline operating without interruptions.

## Input - Output Organization

### Synchronous Pipelines :-

These are clocked synchronously. The time between each clock signal is set to be greater than the longest delay between pipeline stages.

### Asynchronous Pipelines :-

They use request/acknowledge system, wherein each stage can detect when it's finished.

[www.ankurgupta.net](http://www.ankurgupta.net)

### I/O Interface :-

I/O interfaces connect the peripherals to the CPU.

I/O interfaces supervise and synchronize all input and output transfers.



The I/O bus consists of :-

- (i) Data Lines
- (ii) Address Lines
- (iii) Control Lines

→ Each interface decodes the address and control received from the I/O bus, interprets them for the peripheral, and provides signals for the peripheral controller.

→ Each peripheral has its own controller that operates the particular electromechanical device.

→ Address lines are used to select a peripheral.

→ Control lines are used to send commands to an interface.

There are four types of commands that an interface may receive:-

(i) Control Command :-

Used to activate the peripheral and to inform it what to do.

(ii) Status Command :-

Used to test various status conditions in the interface and the peripheral.

(iii) Output Data :-

It causes the interface to respond by transferring data from bus into one of its registers.

(iv) Input Data :-

The interface receives an item of data from the peripheral and places on the data lines.

I/O versus Memory Bus :-

The memory bus contains data, address, and read/write control lines.

There are three ways that computer buses can be used to communicate with memory and I/O :-

(i) Use two separate buses, one for memory and the other for I/O (Using **IOP - Input Output Processor**).

(ii) Use one common bus for both memory and I/O, but have separate control lines for each. (**Isolated I/O**).

(iii) Use one common bus for memory and I/O with common control lines (**memory-mapped**).

# In isolated I/O, the I/O read and I/O write control lines are enabled during an I/O transfer. The memory read and memory write control lines are enabled during a memory transfer.

# In memory mapped I/O, a segment of the total address space is reserved for interface registers.

## Program Interrupt :-

Program interrupt refers to the transfer of program control from a currently running program to another service program as a result of an external or internal generated request.

Control returns to the original program after the service program is executed.

There are two ways that the processor chooses the branch address of the service routine  
after interrupt :-

### (i) Non-vectorized interrupt :-

In a non-vectorized interrupt, the branch address is assigned to a fixed location in memory.

### (ii) Vectorized interrupt :-

In a vectorized interrupt, the source that interrupts supplies the branch information to the computer. This information is called the interrupt vector.

### Interrupt - Request Line :-

The CPU has a wire called the interrupt request line that the CPU senses after executing every instruction.

## Types of Interrupts :-

There are three major types of interrupts that cause a break in the normal program execution :-

- (i) External Interrupts
- (ii) Internal Interrupts
- (iii) Software Interrupts.

### (i) External Interrupts :-

External interrupts come from external sources like I/O devices, timing device etc

### (ii) Internal Interrupts :-

Internal interrupts arise from illegal or erroneous use of an instruction or data. Internal interrupts are also called traps. Examples are :- Register overflow, attempt to divide by zero etc.

### (iii) Software Interrupts :-

Software interrupt is a special call instruction that behaves like an interrupt rather than a subroutine call.

### Priority Interrupt :-

A priority interrupt is a system that establishes a priority over the various sources to determine which condition is to be served first when two or more requests arrive simultaneously.

The system may also determine which conditions are permitted to interrupt the computer while another interrupt is being serviced.

### Establishing the priority of simultaneous interrupts :-

It can be

done by software or hardware. There are three methods to establish priority :-

(i) Polling

(ii) Daisy-Chaining Priority

(iii) Parallel-Priority Interrupt

#### (i) Polling :-

A polling procedure is used to identify the highest-priority source by software means.

E

#### (ii) Daisy-Chaining Priority :-

It is used to identify the highest-priority source by hardware means.

It consists of a serial connection of all devices that request an interrupt.

#### (iii) Parallel Priority Interrupt :-

This method uses a register whose bits are set separately by the interrupt signal from each device. Priority is established according to the position of the bits in the register.

### Maskable Interrupt :-

A hardware interrupt that can be ignored by setting a bit in an interrupt mask register's bit mask.

### Non-maskable Interrupt :-

It is a hardware interrupt that lacks an associated bit-mask, so that it can never be ignored. It is often used for timers.

### Inter Processor Interrupt :-

It is generated by one processor to interrupt another processor in a multiprocessor system.

### Spurious Interrupt :-

It is a hardware interrupt that is unwanted.

## Modes of Data Transfer :-

Data transfer between the CPU and I/O devices may be handled in a variety of modes. Some modes use the CPU as an intermediate path; others transfer the data directly to and from the memory unit.

Data transfer to and from peripherals may be handled in one of the three possible modes :-

- (i) Programmed I/O
- (ii) Interrupt-initiated I/O
- (iii) Direct memory access (DMA)

### (i) Programmed I/O:-

In programmed I/O, each data item transfer is initiated by an instruction in the program.

Once a data transfer is initiated, the CPU is required to monitor the interface to see when a transfer can again be made.

In the programmed I/O method, the CPU stays in a program loop until the I/O unit indicates that it is ready for data transfer.

Usually the transfer is to and from a CPU register and peripheral. Other instructions are needed to transfer the data to and from CPU and memory.

Here the I/O device does not have direct access to memory.

### (ii) Interrupt-Initiated I/O:-

An alternative to the CPU constantly monitoring the flag is to let the interface inform the computer when it is ready to transfer data. This mode of transfer uses the interrupt facility. In the meantime the CPU can proceed to execute another program. The interface meanwhile keeps monitoring the device.

When the interface determines that the device is ready for data transfer, it generates an interrupt request to the computer. Upon detecting the external interrupt signal, the CPU momentarily stops the task it is processing, branches to a service program to process the I/O transfer, and then returns to the task it was originally performing.

### (iii) Direct Memory Access (DMA):-

In DMA, the interface transfers data into and out of the memory unit through the memory bus.

The CPU initiates the transfer by supplying the interface with the starting address and the number of words needed to be transferred and then proceeds to execute other tasks.

DMA is a feature of modern computers. A simple DMA Controller is a standard component in PCs.

### Direct Memory Access :-

During DMA transfer, the CPU is idle and has no control of the memory buses. A DMA controller takes over the buses to manage the transfer directly between the I/O device and memory.

### Bus Request :-

The bus request (BR) input is used by the DMA controller to request the CPU to relinquish control of buses.

### Bus Grant :-

The CPU activates the bus grant (BG) output to inform the external DMA that the buses are in high-impedance state.

### Burst Transfer :-

A block sequence consisting of a number of memory words is transferred in a continuous burst while the DMA controller is the master of the memory buses.

### Cycle Stealing :-

It allows the DMA controller to transfer one data word at a time, after which it must return control of the buses to the CPU.

### Input-Output Processor (IOP) :-

An IOP is a processor with direct memory access capability that communicates with I/O devices.

The IOP is similar to a CPU except that it is designed to handle the details of I/O processing.

Unlike the DMA controller that must be set up entirely by the CPU, the IOP can fetch and execute its own instructions. In addition, the IOP can perform other processing tasks, such as arithmetic, logic, branching and code translation.

The memory unit occupies a central position and can communicate with each processor by means of direct memory access.



The CPU is usually assigned the task of initiating the I/O program. From then on the IOP operates independent of the CPU.

## Blocking and Nonblocking I/O:-

When an application issues a blocking system call, the execution of the application is suspended. The application is moved from the running queue to the waiting queue. After the system call completes, the application is moved back to the running queue.

A nonblocking call does not halt the execution of the application for an extended time. Instead, it returns quickly, with a return value that indicates how many bytes were transferred.

## Asynchronous I/O:-

An alternative to a nonblocking system call is an asynchronous system call.

An asynchronous call returns immediately, without waiting for the I/O to complete. The application continues to execute its code.

The completion of the I/O at some future time is communicated to the application through interrupt.

## Buffering:-

A buffer is a memory area that stores data while they are transferred between two devices or between a device and ~~and~~ an application.

## Spooling:-

A spool is a buffer that holds output for a device, such as a printer, that can not accept interleaved data streams.

## Expansion Bus & PCI Bus:-

An expansion bus connects the internal hardware of the computer system (including the CPU and RAM) and peripheral devices.

or

Expansion Bus connects slow devices to processors.

PCI Bus connects fast devices to processor.

## Polling:-

Polling is also referred as busy-waiting. In this situation, when an I/O operation is required, the computer does nothing other than check the status of the I/O device until it is ready, at which point device is accessed.

It is sometimes also referred as software driven I/O.

## Data Hazard Classification

# The hazards are named by the ordering in the program that must be preserved by the pipeline.

Consider two instructions  $i$  and  $j$ , with  $i$  occurring before  $j$ . The possible data hazards are:-

(i) RAW (Read After Write):-

$j$  tries to read a source before  $i$  writes it.

→ Operand Forwarding is used to overcome it.

(ii) WAW (Write After Write):-

$j$  tries to write an operand before it is written by  $i$ .

→ It is present only in pipelines that write in more than one pipe stages or allow an instruction to proceed even when a previous instruction is stalled.

(iii) WAR (Write After Read):-

$j$  tries to write a destination before it's read by  $i$ .

→ It occurs when there are some instructions that write results early in the instruction pipeline and other instructions that read a source late in the pipeline.

Raster Displays:-

The scanning process **sweeps** the beam from left to right across the screen. At the end of the scanline the beam is moved to the start of the next line (**horizontal retrace**). After the last scanline, the beam is returned to start position (**vertical retrace**).

Memory ManagementMemory Hierarchy in a Computer System:-

# The main memory occupies a central position by being able to communicate directly with the CPU and with auxiliary memory devices through an I/O processor.

# The typical access time ratio between cache and main memory is about 1 to 7. Auxiliary memory average access time is usually 1000 times that of main memory.

Protection of Memory Space:-

We need to make sure that each process has a separate memory space. To do this, we need the ability to determine the range of legal addresses that the process may access and to ensure that the process can access only these legal addresses. We can provide this protection by using two registers:

- (i) Base Register holds the smallest legal physical memory address.
- (ii) Limit Register specifies the size of the range.

The base and limit registers can be loaded only by the operating system, which uses a special privileged instruction.

The operating system, executing in kernel mode, is given unrestricted access to both operating system and user's memory.

#### Static RAM:-

It consists of internal flip-flops that store the binary information. The stored information remains valid as long as power is applied to the unit.

The Static RAM is easier to use and has shorter read and write cycles.

#### Dynamic RAM:-

It stores the binary information in the form of electric charge that is applied to capacitors. The capacitors are provided inside the chip by MOS transistors.

The stored charge on the capacitors tend to discharge with time and the capacitors must be periodically recharged by refreshing the dynamic memory.

The Dynamic RAM offers reduced power consumption and large storage capacity in a single memory chip.

#### ROM:-

ROM is used for storing programs that are permanently resident in the computer and for tables of constants that do not change in value once the production of the computer is completed.

ROM is also Random Access.

#### Bootstrap Loader:-

The bootstrap loader is a program whose function is to start the computer software operating when power is turned on.

ROM portion of main memory is used for storing the bootstrap loader.

When power is turned on, the hardware of the computer sets the program counter (PC) to the first address of the bootstrap loader. The bootstrap program loads a portion of the operating system from disk to main memory and control is then transferred to the operating system, which prepares the computer for general use.



| CS1 | CS2 | RD | WR | Memory Function | State of data bus  |
|-----|-----|----|----|-----------------|--------------------|
| 0   | 0   | x  | *  | Inhibit         | High impedance     |
| 0   | 1   | x  | *  | Inhibit         | High impedance     |
| 1   | 0   | 0  | 0  | Inhibit         | High impedance     |
| 1   | 0   | 0  | 1  | Write           | Input data to RAM  |
| 1   | 0   | 1  | x  | Read            | Output data to RAM |
| 1   | 1   | x  | *  | Inhibit         | High impedance     |

(Function table of RAM chip)

The block diagram of a RAM chip is shown above. The capacity of the memory is 128 words of eight bits (one byte) per word. This requires a 7-bit address and an 8-bit bidirectional data bus. The read and write inputs specify the memory operation and the two chips select (CS) control inputs are for enabling the chip only when it is selected by the microprocessor.

# The read and write inputs are sometimes combined into one line labeled R/W



A ROM chip is organised externally in a similar manner. However, since a ROM can only read, the data bus can only be in an output mode.

# For the same-size chip, it is possible to have more bits of ROM than of RAM, because the internal binary cells in ROM occupy less space than in RAM.

### Memory Address Map :-

The addressing of memory can be established by means of a table that specifies the ~~memory~~ address assigned to each chip. This table is called a memory address map.

| Component | Address Bus           |
|-----------|-----------------------|
| RAM1      | 1 0 9 8 7 6 5 4 3 2 1 |
| RAM2      | 0 0 1 x x x x x x x   |
| RAM3      | 0 1 0 x x x x x x x   |
| RAM4      | 0 1 1 x x x x x x x   |
| ROM       | 1 x x x x x x x x x   |

### CPU

16-11 10 9 8 7-1 RD WR

Data Bus

Decoder  
3 2 1 0

CS1

CS2

128x8

RAM1

RD

WR

AD7

CS1

CS2

128x8

RAM2

RD

WR

AD7

CS1

CS2

128x8

RAM3

RD

WR

AD7

CS1

CS2

128x8

RAM4

RD

WR

AD7

(Memory connections to the CPU)

### Associative Memory :-

The time required to find an item stored in memory can be reduced considerably if stored data can be identified for access by the content of the data itself rather than by an address.

A memory unit accessed by content is called an associative memory or content addressable memory.

An associative memory is more expensive than a random access memory because each cell must have storage capability as well as logic circuits for matching its content with an external argument.



(Block diagram of associative memory)

The argument register A and key register K each have n bits, one for each bit of a word. The match register M has m bits, one for each memory word. Each word in memory is compared in parallel with the content of the argument register. The words that match the bits # of the argument register set a corresponding bit in the match register.

The key register provides a mask for choosing a particular field or key in the argument word. The entire argument is compared with each memory word if the key register contains all 1's. Otherwise only those bits in the argument that have 1's in their corresponding position of the key register are compared.

|                    |             |
|--------------------|-------------|
| <u>Example:- A</u> | 101 1111 00 |
| K                  | 111 0000 00 |
| Word 1             | 100 1111 00 |
| Word 2             | 101 0000 01 |

No match  
Match.

### Hit Ratio :-

When the CPU refers to memory and finds the word in cache, it is said to produce a hit. If the word is not found in cache, it is in main memory and it counts as a miss.

The ratio of the number of hits divided by the total CPU references to memory (Hits plus miss) is the hit ratio.

Example:- A computer with cache access time of 100ns, a main memory access time of 1000ns, and a hit ratio of 0.9 produces an average access time of 200 ns.

### Cache Memory :-

#### Locality of Reference :-

The references to memory at any given interval of time tend to be confined within a few localized area in memory. This phenomenon is known as the property of locality of reference.

If the active portions of the program and data are placed in a fast small memory, the average memory access time can be reduced, thus reducing the total execution time of the program. Such a fast small memory is referred to as a cache memory.



### Mapping :-

The transformation of data from main memory to cache memory is referred to as a mapping process.

Three types of mapping procedures are of practical interest :-

- (i) Associative Mapping
- (ii) Direct Mapping
- (iii) Set-associative Mapping

### (i) Associative Mapping:-

The associative memory stores both the address and content (data) of the memory word. This permits any location in cache to store any word from main memory.

CPU address (15-bits)

Argument Register

| Address | Data |
|---------|------|
| 01000   | 3450 |
| 02777   | 6710 |
| 22345   | 1234 |

[Associative mapping cache (all numbers in octal)]

A CPU address of 15 bits is placed in the argument register and the associative memory is searched for a matching address. If the address is found, the corresponding 12-bit data is read and sent to the CPU. If no match occurs, the main memory is accessed for the word. The address-data pair is then transferred to the associative cache memory.

### (ii) Direct Mapping:-

In the general case, there are  $2^k$  words in cache memory and  $2^n$  words in main memory. The  $n$ -bit memory address is divided into two fields:-

(i)  $k$ -bits for the index field.

(ii)  $(n-k)$ -bits for the tag field

The direct mapping cache organization uses the  $n$ -bit address to access the main memory and the  $k$ -bit index to access the cache.



Each word in cache consists of the data word and its associated tag. When a new word is first brought into the cache, the tag bits are stored alongside the data bits. When the CPU generates a memory request, the index field is used for the address to access the cache. The tag field of the CPU address is compared with the tag in the word read from the cache. If the two tags match, there is a hit; otherwise, there is a miss.

The disadvantage of direct mapping is that the hit ratio can drop considerably if two or more words whose addresses have the same index but different tags are accessed repeatedly.

Direct mapping cache with block size of 1 word:-

| Memory Address | Memory Data | Index   |     |      |
|----------------|-------------|---------|-----|------|
|                |             | Address | Tag | Data |
| 00000          | 1220        | 000     | 00  | 1220 |
| 00777          | 2340        |         |     |      |
| 01000          | 3450        |         |     |      |
| 01777          | 4560        |         |     |      |
| 02000          | 5670        | 777     | 02  | 6710 |
| 02777          | 6710        |         |     |      |

(b) Cache Memory

↓      ↓

(a) Main Memory

Direct mapping cache with block size of 8 words:-

The index field is now divided into two parts:-

(i) Block field, (ii) Word Field.

In a 512-word cache, there are 64 blocks of 8 words each. The block number is specified with a 6-bit field and the word within a block is specified with a 3-bit field.

The tag field stored within the cache is common to all eight words of the same block. Every time a miss occurs, an entire block of eight words must be transferred from main memory to cache memory.

| Index    | Tag | Data |      |  |  |
|----------|-----|------|------|--|--|
| Block 0  | 000 | 01   | 3450 |  |  |
|          | 007 | 01   | 6578 |  |  |
|          | 010 |      |      |  |  |
| Block 1  | 017 |      |      |  |  |
|          |     |      |      |  |  |
|          |     |      |      |  |  |
| Block 63 | 770 | 02   |      |  |  |
|          | 777 | 02   | 6710 |  |  |

Tag    Block    Word  
Index

(Direct mapping cache with block size of 8 words)

### (iii) Set-Associative Mapping :-

(III) Set-Associative Mapping: The set-associative mapping is an improvement over the direct-mapping in that each word of cache can store two or more words of memory under the same index address.

| Index | Tag | Data | Tag | Data |
|-------|-----|------|-----|------|
| 000   | 01  | 3450 | 02  | 5670 |
| 001   |     | 1    | 1   | 1    |
| 002   |     |      |     |      |
| 003   |     |      |     |      |
| 004   |     |      |     |      |
| 005   |     |      |     |      |
| 006   |     |      |     |      |
| 007   |     |      |     |      |
| 008   |     |      |     |      |
| 009   |     |      |     |      |
| 010   |     |      |     |      |
| 011   |     |      |     |      |
| 012   |     |      |     |      |
| 013   |     |      |     |      |
| 014   |     |      |     |      |
| 015   |     |      |     |      |
| 016   |     |      |     |      |
| 017   |     |      |     |      |
| 018   |     |      |     |      |
| 019   |     |      |     |      |
| 020   |     |      |     |      |
| 021   |     |      |     |      |
| 022   |     |      |     |      |
| 023   |     |      |     |      |
| 024   |     |      |     |      |
| 025   |     |      |     |      |
| 026   |     |      |     |      |
| 027   |     |      |     |      |
| 028   |     |      |     |      |
| 029   |     |      |     |      |
| 030   |     |      |     |      |
| 031   |     |      |     |      |
| 032   |     |      |     |      |
| 033   |     |      |     |      |
| 034   |     |      |     |      |
| 035   |     |      |     |      |
| 036   |     |      |     |      |
| 037   |     |      |     |      |
| 038   |     |      |     |      |
| 039   |     |      |     |      |
| 040   |     |      |     |      |
| 041   |     |      |     |      |
| 042   |     |      |     |      |
| 043   |     |      |     |      |
| 044   |     |      |     |      |
| 045   |     |      |     |      |
| 046   |     |      |     |      |
| 047   |     |      |     |      |
| 048   |     |      |     |      |
| 049   |     |      |     |      |
| 050   |     |      |     |      |
| 051   |     |      |     |      |
| 052   |     |      |     |      |
| 053   |     |      |     |      |
| 054   |     |      |     |      |
| 055   |     |      |     |      |
| 056   |     |      |     |      |
| 057   |     |      |     |      |
| 058   |     |      |     |      |
| 059   |     |      |     |      |
| 060   |     |      |     |      |
| 061   |     |      |     |      |
| 062   |     |      |     |      |
| 063   |     |      |     |      |
| 064   |     |      |     |      |
| 065   |     |      |     |      |
| 066   |     |      |     |      |
| 067   |     |      |     |      |
| 068   |     |      |     |      |
| 069   |     |      |     |      |
| 070   |     |      |     |      |
| 071   |     |      |     |      |
| 072   |     |      |     |      |
| 073   |     |      |     |      |
| 074   |     |      |     |      |
| 075   |     |      |     |      |
| 076   |     |      |     |      |
| 077   | 02  | 6710 | 00  | 2340 |

(Two-way set-associative mapping cache)

Each data word is stored together with its tag and the number of tag-data items in one word of cache is said to form a set.

In the above diagram, each index address refers to two data words and their associative tags. Each tag requires six bits and each data word has 12 bits, so the word length is  $2(6+12) = 36$  bits. An index address of 9 bits can accommodate 512 words. Thus the size of cache memory is  $512 \times 36$  and it can accommodate 1024 words of main memory.

When the CPU generates a memory request, the index value of the address is used to access the cache. The tag field of the CPU address is then compared with both tags in the cache to determine if a match occurs. The comparison logic is done by an associative search of the tags.

## Writing into Cache :-

There are two ways:-

- (i) Write-through! - With every memory write operation, update main memory, with cache memory being updated in parallel if it contains the word at the specified address.
  - (ii) Write-back! - In this method only the cache location is updated during a write operation. The location is then marked by a flag so that later when the word is removed from the cache, it is copied into main memory.

## Cache Initialization:-

Each word in cache includes a valid bit to indicate whether or not the word contains valid data.

The cache is initialized by clearing all the valid bits to 0. The valid bit of a particular cache word is set to 1 the first time this word is loaded from main memory and stays set unless the cache has to be initialized again.

## Cache Coherence :-

In a shared memory multiprocessor system, with multiple processors having separate caches, it is necessary to keep the caches in coherence by ensuring that any shared operand that is changed in any cache is changed throughout the entire system.

Content (classmate)Memory Binding :-

The binding of instructions and data to memory addresses can be done at any step along the way:-

## (i) Compile time:-

If we know at compile-time where the process will reside in memory, then absolute code can be generated.

## (ii) Load time:-

In this case, compiler generates relocatable code and final binding is delayed until load time.

## (iii) Execution time:-

If the process can be moved during its execution from one memory segment to another, then binding must be delayed until runtime.

Dynamic Loading :-

With dynamic loading, a routine is not loaded until it is called. All routines are kept on disk in a relocatable load format.

**Q.8** Dynamic loading does not require special support from the operating system. It is the responsibility of the users to design their programs to take advantage of such a method.

Dynamic Linking :-

Some operating systems support only static linking, in which system libraries are treated like any other object module and are combined by the loader into the binary program image.

The concept of dynamic linking is similar to that of dynamic loading. Here, though, linking, rather than loading, is postponed until execution time. This feature is usually used with system libraries.

With dynamic linking, a stub is included in the image for each library-routine reference. The stub is a small piece of code that indicates how to locate the appropriate memory-resident library routine or how to load the library if the routine is not already present.

Under this scheme, all processes that use a language library execute only one copy of the library code.

**A.8** Unlike dynamic loading, dynamic linking generally requires help from the operating system, because processes in memory are protected from one another.

Virtual Memory :-

Virtual memory is a concept that permits the user to construct programs as though a large memory space were available, equal to the totality of auxiliary memory.

**Logical Versus Physical Address Space :-**

An address generated by the CPU is commonly referred to as a logical address or virtual address, and the set of such addresses is referred to as address space or logical address space or virtual address space.

An address in main memory is referred to as a physical address, and the set of such addresses is referred to as physical address space or memory space.

The compile-time and load-time address-binding methods generate identical logical and physical addresses.

In the execution-time address-binding scheme, the logical and physical address spaces differ.

The run-time mapping from virtual to physical addresses is done by a hardware device called the memory-management unit (MMU).

**Contiguous Memory Allocation :-**

For a memory request of size  $n$ , there are three methods that are used to satisfy the request from a list of free holes:-

- (i) First Fit :- Allocate the first hole that is big enough.
- (ii) Best Fit :- Allocate the smallest hole that is big enough.
- (iii) Worst Fit :- Allocate the largest hole.

*Note:* Both the first-fit and best-fit strategies for memory allocation suffer from external fragmentation.

External fragmentation exists when there is enough total memory space to satisfy a request, but the available spaces are not contiguous; storage is fragmented into a large number of small holes.

One solution to the problem of external fragmentation is compaction. Compaction is possible only if relocation is dynamic and is done at execution time.

Another possible solution to the external fragmentation problem is to permit the logical address space of the processes to be noncontiguous, thus allowing a process to be allocated physical memory wherever the latter is available.

## Paging:-

Paging is a memory-management scheme that permits the physical address space of a process to be non-contiguous.

The basic method for implementing paging involves breaking physical memory into blocks of fixed size called frames or page-frames or blocks, and breaking logical memory into blocks of the same size called pages.

## Mapping from address space to memory space :-



A presence bit in each location indicates whether the page has been transferred from auxiliary memory into main memory. A 0 in the presence bit indicates that this page is not available in main memory.

When we use a paging scheme, we have no external fragmentation, however we may have some internal fragmentation.

If the memory allocated to a process is larger than the requested memory. The difference between these two is internal fragmentation - memory that is internal to a partition but is not being used.

## Translation Look-aside Buffer (TLB):-

In paging, two memory accesses are needed to access some data (one for page table entry and other for the data). Thus memory access is slowed by a factor of 2.

The solution to this problem is to use a special, small, fast-lookup hardware cache, called a translation look aside buffer (TLB). The TLB is associative, high speed memory.

The TLB is used with page tables in the following way:-

The TLB contains only a few of the page-table entries. When a logical address is generated by the CPU, its page number is presented to the TLB.

If the page number is found, its frame no. is immediately available and is used to access memory.

If the page number is not in TLB (known as TLB miss), a memory reference to the page table must be made. In addition, we add the page number and frame number to the TLB, so that they will be found quickly on the next reference.



The percentage of times that a particular page number is found in the TLB is called the hit ratio.

Example:- Suppose a TLB has 80% hit ratio. If it takes 20 ns to search the TLB and 100ns to access memory, then:-

$$\begin{aligned}\text{Effective Access Time} &= 0.80(20 + 100) \\ &\quad + 0.20(20 + 100 + 100) \\ &= 140 \text{ ns}\end{aligned}$$

### Protection Bit:-

These bits are kept in page tables. One bit can define a page to be read-write or read-only.

### Valid-invalid bit:-

This bit is also kept in page table. When this bit is set to valid, the associated page is in the process's logical address space and is thus a legal page.

### Demand Paging:-

With demand-paged virtual memory, pages are only loaded when they are demanded during program execution; pages that are never accessed are thus never loaded into physical memory.

### Page Replacement:-

# ~~The modify~~

### Modify Bit or Dirty Bit:-

The modify bit for a page is set by the hardware whenever any word or byte in the page is written into, indicating that the page has been modified.

[ Some remaining part of paging in different registers ]



## [Remaining Part of Page Table]



### Shared Pages:-

An advantage of paging is the possibility of sharing common code.

Consider a system that supports 40 users, each of whom executes a text editor. Here only one copy of the editor need to be kept in physical memory. Each user's page table maps onto the same physical copy of the editor, but data pages are mapped to different frames.

### Hierarchical Paging or Multilevel Page Tables:-

#### Problem with single level page table :-

Consider a system with a 32-bit logical address space. If the page size in such a system is 4KB ( $2^{12}$ ), then a page table may consist of  $2^{32}/2^{12} = 2^{20}$  entries. Assuming that each entry consists of 4 bytes, each process may need up to 4MB of physical address space for the page table alone.

So, the problem is that we can't hold all of the page tables in memory.

Solution:- The solution to this problem is that page the page tables. It allows portions of the page tables to be kept in memory at one time.

Consider the previous example of a 32-bit machine Hashed Page Tables:- with a page size of 4KB. The logical address is divided into a page number consisting of 20-bits and a page offset consisting of 12-bits.

A page can have  $4\text{KB}/4\text{B} = 1\text{K} = 2^{10}$  entries of a page table.

Because we page the page table, the page number is further divided into a 10-bit page number and a 10-bit page offset.

Thus, a logical address is as follows:-

Page Number      Page offset

|       |       |     |
|-------|-------|-----|
| $p_1$ | $p_2$ | $d$ |
| 10    | 10    | 12  |

logical address

|       |       |     |
|-------|-------|-----|
| $p_1$ | $p_2$ | $d$ |
|-------|-------|-----|



Because address translation works from the outer page table inward, this scheme is also known as a forward-mapped page table.

For 64-bit architectures, hierarchical page tables are generally considered inappropriate, because it will require 6-levels.

6 levels  $\rightarrow$  6 memory accesses  $\rightarrow$  slow.

A common approach for handling address spaces larger than 32-bits is to use a hashed page table, with the hash value being the virtual page number. Each entry in the hash table contains a linked list of elements that hash to the same location.

Each element consists of three fields:-

- (i) Virtual page number
- (ii) Value of the mapped page frame.
- (iii) A pointer to the next element in the linked list.

### Inverted Page Tables:-

Usually, each process has an associated page table. The page table has one entry for each virtual address (each page).

The drawback of this method is that each page table may consist of millions of entries. These tables may consume large amounts of memory. To solve this problem, we use an inverted page table.

An inverted page table has one entry for each real page (or frame) of memory. There is only one page table in the system, and it has only one entry for each page of physical memory.

Each page table entry contains:-

- (i) Process ID
- (ii) Virtual Page Number

The table is indexed by physical page number.



### Accessing Inverted Page Tables :-

For each entry in the inverted page table, compare process ID and virtual page number in entry to the segmented process ID and virtual page number.

Because of linear search its very slow.



### Accessing Hashed Inverted Page Table :-

(1) Lookup in hash anchor table for page table entry. Compare process ID and virtual page number.

(2) If match, then found.

(3) If not match, check the next pointer for another page table entry and check again.



### Hashed Inverted Page Tables :-

→ Linear inverted page tables require too many memory accesses.

→ Keep another level before actual inverted page table (hash anchor table) :-

- (i) Contains a mapping of process ID and virtual page number to page table entries.
- (ii) Use separate chaining for collisions.

## Page Replacement Algorithms:-

### (i) FIFO Page Replacement:-

A FIFO replacement algorithm associates with each page the time when that page was brought into memory. When a page must be replaced, the oldest page is chosen.

FIFO page replacement algorithm suffers from Belady's anomaly.

#### Belady's Anomaly:-

For some page replacement algorithms, the page-fault rate may increase as the number of allocated frames increases.

### (ii) Optimal Page Replacement:-

Replace the page that will not be used for the longest period of time.

An optimal page replacement algorithm has the lowest page-fault rate of all algorithms and will never suffer from Belady's anomaly.

### (iii) LRU Page Replacement:-

LRU replacement associates with each page the time of that page's last use. When a page must be replaced, LRU chooses the page that has not been used for the longest period of time.

This algorithm does not suffer from Belady's Anomaly.

## Thrashing:-

A process is thrashing if it is spending more time paging than executing.

## Local vs. Global Page Replacement:-

When a process incurs a page fault, a local page replacement algorithm selects for replacement some page that belongs to that same process.

A global replacement algorithm is free to select any page in memory.

Byte Addressing vs. Word Addressing:-

Byte Addressing  
(1 word = 4 bytes)



Address

In the above addressing scheme, the first word starts at address 0, and the second word starts at address 4.

Word Addressing  
(1 word = 4 bytes)



Address

In the above addressing scheme, all bytes of first word are located in address 0, and all bytes of the second word are located in address 1.

## Laugh out Loud!

'El Pueblo de Nuestra Señora la Reina de los Angeles del Río de Porciúncula' is the full name of the American city Los Angeles.



In Turkestan, in 1919, a train was powered by almost 9000 tonnes of dried fish!



The cuddly animal koala bear, as it is often called, is not a bear at all; it is a marsupial! It sleeps for up to 18 hours daily!

Newton served as a member of the Parliament of England for one year. During all the lengthy proceedings, he spoke only once: he asked the person next to him to close an open window!

