

Module II : Process. org? [30 mks]

1. Exp Basic CPU org? & register org?
2. Exp. different Inst? format
3. What is Inst? cycle manager
4. Exp soft wired (jup reg) & Hardwired control Unit method.
5. SN on micro inst? sequencing & execution
6. SN on Nanoprogramming
7. Explain Flynn's classification
8. Explain diff pipeline Hazards
9. Explain diff. stages of pipelining

Module 1 and 2

1. Explain von-nuemann architecture in detail
2. Explain Evolution of computer / Generation of computer

3. Diff bet? computer org? & Computer Arch?

4. Exp Architecture of 8086

5. Exp flag register of 8086

6. Write SN on Segmentation of 8086

7. Write SN on memory banking of 8086

8. How physical address is generated of 8086

9. Exp addressing modes of 8086

10. Exp diff pins of 8086

11. Exp minimum mode of 8086

12. Exp maximum mode of 8086

13. Diff between min & max mode

14. SN on 8288 Bus controller

15. LM on mixed language prog

16. Explain concept of procedure & macro with example

17. Explain various smng of 8086

18. Program of 8086

Any  
one

Module VI - I/O organization [15 to 20 m]

1. SN on I/O module.
2. SN on I/O processor [8089].
3. Exp programmed I/O data transfer technique.
4. Exp interrupt driven I/O data transfer tech.
5. SN on DMA.

Module VII Memory organization [20 to 25 m]

1. Exp memory characteristics & measure parameters.
2. SN on memory Hierarchy.
3. Concept of cache memory & L1, L2, L3 concept.
4. Exp various memory mapping techniques.
5. SN on cache coherency.
6. S.N on Interleaved & Associative memory.
7. S.N on prob. on page replacement algo.
8. Classification of primary & secondary mem.

[25 to 30m] Module IX Data representation & Arithmetic Algo

1. Problem on Booth's algo
2.  $\frac{11}{11}$  restoring division algo.
3.  $\frac{11}{11}$  Non-restoring division algo.
4.  $-11$  or SN IEEE 754 floating pt format
5. Add & Shift multiplication algo
6. 2's complement problem.

## Module 1 - Overview of computer Arch' & Org<sup>n</sup>

Q) Explain Computer Organisation and computer Architecture.

- A 1. Computer architecture refers to the attributes of system visible to programmer.
2. Computer Organisation refers to operational units and their interconnection that realize the architectural specification.

B. Eg of computer architecture attribute

include the instruction set, the number of bits used to represent various data types and I/O mechanism.

2. Eg of computer organization attribute include such as control signal, interface between computer and peripherals.

c. Computer Architecture design issue whether the instruction set will have multiply instruction or not?

2. Computer Organization issue whether instruction will be implemented by special multiply or by mechanism that makes repeated use of add units of system

\* Many manufacturer offer same architecture design but with different organization.

2) Explain von-neumann architecture.

Note: History of computer

The first generation computer (vacuum tube)  
ENIAC (Electronic Numerical Integrator  
and computer)

- It is designed by John brothers. It is first general purpose electronic digital computer. The project was response to USA war time need.
- The Army's Ballistics Research Lab & trajectory tables for new weapons, Eckert proposed to build a general computer using vacuum tubes used for BRL application.
- In 1943 work began ENIAC. The resulting computer is of weight 30 tons and containing more than 18000 vacuum tubes and occupying 15000 sq.ft space and capable of 5000 additions/sec. The drawback of ENIAC was that it had to be programming manually by setting switches & plug and unplugged the cables.

## Von neumann model

As was mentioned the task of entering and altering programs for ENIAC was extremely tedious. In 1946, von neumann and his colleagues began the design of new stored program computer referred as IAS computer.



Basic structure of IAS computer.

It consists of

- Main memory, which stores both instruction & data
- Arithmetic Logical unit, capable of operating binary data & perform various arithmetic and logical operation
- Control unit, which interprets the instruction in memory.
- An I/O Equipment, operated by control unit

With rare exception, all of today's computer have this same general structure and function.

## Extended structure of IAS computer



The memory of IAS consist of 1000 storage location & 40 bits each.

(Both instruction and data stored)

The control unit fetch the instruction from memory executing them. In figure, control unit and CPU contains storage location of register defined as follows:

MBR (Memory Buffer Register)

It is also called as MDR. It contains data from memory.

MAR (Memory Address Register)

It specifies the address of data to be written from and to memory.

IR (Instruction Register)

It stores op-code of current instruction being executed.

IBR (Instruction Buffer Register)

It is used to store the data temporarily

PC (Program Counter)

It contains address of next instruction, which fetch from memory.

AC (Accumulator) and MQ (Multiplier Quotient)

Accumulator stores the operation operand (data) and result of ALU for eg the result of multiplying two 40 bit number is 80 bits hence MSB goes to AC and LSB goes to MQ after execution

Perform unsigned multiplication using  
Add and shift method.

$$1. \quad 11 \times 13$$

$$11 \rightarrow 1011$$

$$13 \rightarrow \underline{1101}$$

$$\begin{array}{r} 1011 \\ + 1101 \\ \hline 10000 \end{array}$$

$$10000 \times$$

$$1011 \times \times$$

$$\underline{1011 \times \times \times}$$

$$\underline{10001111}$$

$$11 \times 13 \rightarrow 143$$



| Count | C     | A     | Q     |
|-------|-------|-------|-------|
| 5     | 0     | 00000 | 10000 |
| 4     | 0     | 00000 | 01000 |
| 3     | 0     | 00000 | 00100 |
| 2     | 0     | 00000 | 00010 |
| 1     | 0     | 00000 | 00001 |
| 0     | 01001 | 00001 |       |

13 X 11

| Count | C    | A    | Q    |
|-------|------|------|------|
| 0     | 0    | 0000 | 1101 |
| 4     | 0    | 1011 | 1101 |
| 0     | 0101 | 1110 |      |

3 0 0010 1111

2 0 1101 1111

0110 1111

1 1 0001 1111

0 1000 1111



Booth's Algorithm (signed) for 2's complement method.



$3 \times 7$

Data: \_\_\_\_\_  
Page: \_\_\_\_\_

| cycle | A       | Q       | $Q_{-1}$ | comment                           |
|-------|---------|---------|----------|-----------------------------------|
| 0     | 0 0 0 0 | 0 0 1 1 | 0        | Initial                           |
| 1     | 1 0 0 1 | 0 0 1 1 | 0        | $A = A - M$ ( $Q_0 Q_{-1} = 10$ ) |
| 2     | 1 1 0 0 | 1 0 0 1 | 1        | shift right                       |
| 3     | 1 1 1 0 | 0 1 0 0 | 1        | shift right                       |
| 4     | 0 1 0 1 | 0 1 0 0 | 1        | $A = A + M$                       |
| 5     | 0 0 1 0 | 1 0 1 0 | 0        | SR                                |
| 6     | 0 0 0 1 | 0 1 0 1 | 0        | SR                                |

Solve  $4 \times 7$  for B using Both algorithm.

| Count | A       | Q       | $Q_{-1}$ | Comment     | $4 \rightarrow 0100$                                      |
|-------|---------|---------|----------|-------------|-----------------------------------------------------------|
| 0     | 0 0 0 0 | 0 1 0 0 | 0        | Initial     | $7 \rightarrow 0111$                                      |
| 1     | 0 0 0 0 | 0 0 1 0 | 0        | SR          | $A = A + (-M)$                                            |
| 2     | 0 0 0 0 | 0 0 0 1 | 0        | SR          | $-7 \rightarrow 0111$                                     |
| 3     | 1 0 0 1 | 0 0 0 1 | 0        | $A = A - M$ | $\downarrow$                                              |
| 4     | 1 1 0 0 | 1 0 0 0 | 1        | SR          | $\begin{array}{r} 1000 \\ + 1 \\ \hline 1001 \end{array}$ |
| 5     | 0 0 1 1 | 1 0 0 0 | 1        | $A = A + M$ | $1100$                                                    |
| 6     | 0 0 0 1 | 1 1 0 0 | 0        | SR          | $\begin{array}{r} 0111 \\ + 1 \\ \hline 1111 \end{array}$ |

$3x - 7$

$1110$

$+ 0111$

$\underline{0001}$

| Count | A    | Q    | Q <sub>-1</sub> | Comment     |                       |
|-------|------|------|-----------------|-------------|-----------------------|
|       | 0000 | 0011 | 0               | Initial.    | $3 \rightarrow 0011$  |
| 4     | 0111 | 0011 | 0               | $A = A - M$ | $-7 \rightarrow 0111$ |
|       | 0011 | 1001 | 1               | SR          | $\frac{1000}{+ 1}$    |
| 3     | 0001 | 1100 | 1               | SR          | $\frac{1001}{+ 1}$    |
| 2     | 1010 | 1100 | 1               | $A = A + M$ | $A = A + (-M)$        |
|       | 1101 | 0110 | 0               | SR          | $= A + (-7)$          |
| 1     | 1110 | 1011 | 0               | SR          | $0001$                |
|       |      |      |                 |             | $+ 1001$              |
|       |      |      |                 |             | $\underline{1010}$    |

Ans.

$$11101011 \rightarrow 00010100 \rightarrow 10101$$

$\frac{0011}{+ 1100}$   
 $\underline{+ 1}$   
 $1101$

$(-3) \times 7$

| count | A    | Q    | Q <sub>-1</sub> | comment     |                       |
|-------|------|------|-----------------|-------------|-----------------------|
|       | 0000 | 0111 | 0               | Initial.    | $0011$                |
| 4     | 0011 | 0111 | 0               | $A = A - M$ | $\frac{1000}{+ 1}$    |
|       | 0001 | 1011 | 1               | SR          | $\frac{1001}{+ 1}$    |
| 3     | 0000 | 1100 | 1               | SR          | $\frac{0000}{+ 0111}$ |
| 2     | 1101 | 1100 | 1               | $A = A + M$ | $1101$                |
|       | 1110 | 1110 | 0               | SR          |                       |
| 1     | 1101 | 0111 | 0               | SR          |                       |
|       |      |      |                 |             |                       |

solve  $(-3) \times (-7)$  using Both algorithm.

| count | A    | Q    | $Q_{-1}$ | comment     |
|-------|------|------|----------|-------------|
| 0     | 0000 | 1001 | 0        | Initial.    |
| 1     | 0011 | 1001 | 0        | $A = A - M$ |
| 2     | 0001 | 1100 | 1        | $SR$        |
| 3     | 1110 | 1100 | 1        | $A = A + M$ |
| 4     | 1111 | 0110 | 0        | $SR$        |
| 5     | 1111 | 1011 | 1        | $SR$        |
| 6     | 0010 | 1011 | 0        | $A = A - M$ |
| 7     | 0001 | 0101 | 0        | $SR$        |

$\underbrace{\hspace{10em}}$   
 $M-A=A$   
 $Ans = 21$

Restoring division algorithm.



$Q \leftarrow \text{Divident} = 7$ Q Solve  $7 \div 3$  M  $\leftarrow$  Divisor = 3.

count count A Q Comment

|   |           |            |        |
|---|-----------|------------|--------|
|   | 0000 0111 | Initial    | 0000   |
| 4 | 0000 1110 | Shift left | + 1101 |
|   | 1101      | A = A - M  | 1101   |

|   |           |                               |      |
|---|-----------|-------------------------------|------|
|   | 0000 1110 | $Q_0 \leftarrow 0, A = A + M$ | 0001 |
| 3 | 0001 1100 | Shift left                    | 1101 |
|   | 1110      | A = A - M                     | 1110 |

|   |           |                               |  |
|---|-----------|-------------------------------|--|
|   | 0001 1100 | $Q_0 \leftarrow 0, A = A + M$ |  |
| 2 | 0011 1000 | Shift left                    |  |
|   | 0000      | A = A - M                     |  |

|   |           |                    |      |
|---|-----------|--------------------|------|
|   | 0000 1001 | $Q_0 \leftarrow 1$ | 0001 |
| 1 | 0001 0010 | Shift left         | 1101 |
|   | 1110      | A = A - M          | 1110 |

|  |           |                                 |  |
|--|-----------|---------------------------------|--|
|  | 0001 0010 | $Q_0 \leftarrow 0, A = A + M$ . |  |
|--|-----------|---------------------------------|--|

Ans :

 $Q \leftarrow \text{Quotient} = 0010$  $A \leftarrow \text{Remainder} = 0001.$

Q. Solve  $7 \div 2$

$$Q \leftarrow \text{Dividend} = 7$$

$$M \leftarrow \text{Divisor} = 2$$

count

A

Q

Comment:

$$\begin{array}{r} 0000 \\ 0000 \end{array} \quad \begin{array}{r} 0111 \\ 111\boxed{ } \end{array}$$

Initial.

$$4 \quad \begin{array}{r} 0000 \\ 0000 \end{array} \quad \begin{array}{r} 1110 \\ 111\boxed{ } \end{array}$$

shift left

$$\begin{array}{r} 1110 \\ + 0010 \\ \hline 0000 \end{array}$$

$$1110$$

$$A = A - M$$

$$\begin{array}{r} 0000 \\ 0000 \end{array} \quad \begin{array}{r} 1110 \\ 111\boxed{0} \end{array}$$

$$Q_0 \leftarrow 0, A = A + M$$

$$\begin{array}{r} 0001 \\ + 1110 \\ \hline 1111 \end{array}$$

$$3 \quad \begin{array}{r} 0001 \\ 0001 \end{array} \quad \begin{array}{r} 110 \\ 110\boxed{ } \end{array}$$

shift left

$$\begin{array}{r} 0001 \\ + 1110 \\ \hline 1111 \end{array}$$

$$1111$$

$$A = A - M$$

$$\begin{array}{r} 0001 \\ 0001 \end{array} \quad \begin{array}{r} 1100 \\ 110\boxed{0} \end{array}$$

$$Q_0 \leftarrow 0, A = A + M$$

$$\begin{array}{r} 0011 \\ + 1110 \\ \hline 0001 \end{array}$$

$$2 \quad \begin{array}{r} 0011 \\ 0011 \end{array} \quad \begin{array}{r} 100 \\ 100\boxed{ } \end{array}$$

shift left

$$\begin{array}{r} 0011 \\ + 1110 \\ \hline 0001 \end{array}$$

$$0001$$

$$A = A - M$$

$$\begin{array}{r} 0001 \\ 0001 \end{array} \quad \begin{array}{r} 100 \\ 100\boxed{1} \end{array}$$

$$Q_0 \leftarrow 1$$

shift left

$$\begin{array}{r} 0011 \\ + 1110 \\ \hline 0001 \end{array}$$

$$0001$$

$$A = A - M$$

$$\begin{array}{r} 0001 \\ 0001 \end{array} \quad \begin{array}{r} 0011 \\ 0011 \end{array}$$

$$Q \leftarrow \text{Quotient} = 3$$

$$R \leftarrow \text{Remainder} = 1.$$

Q. Solve  $16 \div 5$

Q  $\leftarrow$  Dividend = 16

M  $\leftarrow$  Divisor = 5

| Count | A              | Q | comment                              | 111011   |
|-------|----------------|---|--------------------------------------|----------|
|       | 00000000100000 |   |                                      |          |
| 6     | 000000100000   | 1 | shift left                           | 111011   |
|       | 111011         |   | $A = A - M$                          | + 111000 |
|       | 000000100000   | 0 | $Q_0 \leftarrow 0, A = A + M$        |          |
| 5     | 000001000000   |   | Shift left                           | 111011   |
|       | 111100         |   | $A = A - M$                          | + 000010 |
|       | 000001000000   | 0 | $Q_0 \leftarrow 0, A = A + M$        |          |
| 4     | 000010000000   |   | CL                                   | 111011   |
|       | 111101         |   | $A = A - M$                          | + 001000 |
|       | 000010000000   | 0 | $Q_0 \leftarrow 0, A = A + M 000011$ |          |
| 3     | 000100000000   |   | SL                                   | 000110   |
|       | 111111         |   | $A = A - M$                          | 111011   |
|       | 000100000000   | 0 | $Q_0 \leftarrow 0, A = A + M 000001$ |          |
| 2     | 001000000000   |   | SL                                   |          |
|       | 000011         |   | $A = A - M$                          |          |
|       | 000011000000   | 1 | Set A                                |          |
| 1     | 000110000000   |   | SL                                   |          |
|       | 000001         |   | $A = A - M$                          |          |
|       | 000001000000   | 1 | $Q_0 \leftarrow 1$                   |          |

Q  $\leftarrow$  Quotient = 3

R  $\rightarrow$  Remainder = 1

## Non-restoring division algorithm (unsigned)



Solve  $10 \div 3$  $A \leftarrow 10 \quad M \leftarrow 3$ 

$$\begin{array}{r}
 0011 \\
 1100 \\
 + 1 \\
 \hline
 1101
 \end{array}$$

| Count | A    | Q     | Comment             |        |
|-------|------|-------|---------------------|--------|
|       | 0000 | 1010  | Initial             | 1100   |
| 4     | 0001 | 010□  | SL                  | + 0011 |
|       | 1110 |       | $A = A - M$         | 1111   |
|       | 1110 | 010□0 | $Q_0 \rightarrow 0$ | + 0011 |
| 3     | 1100 | 100□  | SL                  | 0010   |
|       | 1111 |       | $A = A + M$         | 0100   |
|       | 1111 | 100.□ | $Q_0 \rightarrow 0$ | + 1101 |
| 2.    | 1111 | 000□  | SL                  | 0001   |
|       | 0010 |       | $A = A + M$         |        |
|       | 0010 | 000□1 | $Q_0 \rightarrow 1$ |        |
| 1     | 0100 | 001□  | SL                  |        |
|       | 0001 |       | $A = A - M$         |        |
|       | 0001 | 001□1 | $Q_0 \rightarrow 1$ |        |

 $Q_0 \leftarrow \text{Quotient} = 3$  $A \leftarrow \text{Remainder} = 1$

10011

WORLD STAR™

Date: \_\_\_\_\_  
Page: \_\_\_\_\_Solve  $19 \div 4$ 

| Count | (A)   | (S)   | comment                 |               |
|-------|-------|-------|-------------------------|---------------|
|       | 00000 | 10011 | Initial.                | <u>11100</u>  |
| 5     | 00001 | 00110 | SL                      | <u>11010</u>  |
|       | 11101 | 00110 | A = A - M               | <u>+00100</u> |
|       | 11101 | 00110 | $\oplus_0 \leftarrow 0$ | <u>11100</u>  |
| 4     | 11010 | 01101 | SL                      | <u>11100</u>  |
|       | 11110 | 01101 | A = A + M               | <u>00100</u>  |
|       | 11110 | 01101 | $\oplus_0 \leftarrow 0$ | <u>00000</u>  |
| 3     | 11100 | 11001 | SL                      | <u>00001</u>  |
|       | 00000 | 11001 | A = A + M               | <u>+11100</u> |
|       | 00000 | 11001 | $\oplus_0 \leftarrow 1$ | <u>11101</u>  |
| 2     | 00001 | 10010 | SL                      | <u>11011</u>  |
|       | 01101 | 10010 | A = A - M               | <u>+00100</u> |
|       | 01101 | 10010 |                         |               |
| 1     | 11011 | 00101 | SL                      | <u>11011</u>  |
|       | 11111 | 00101 |                         |               |

IEEE 754 floating point representation.

1) 32 bit format (Single precision)  $^{127}$

2) 64 bit format (Double precision)

$^{1023}$

Note:  $(1.f) \times 2^{e-127}$

It includes.

- Sign bit (-ve = 1 and +ve = 0)
- Exponent
- Significant (Mantissa)

1) 32 bit form (single bit format)  $^{precision}$

|                        |          |             |   |
|------------------------|----------|-------------|---|
| 31                     | 30       | 23   22     | 0 |
| Sign bit<br>↓<br>1 bit | Exponent | significant |   |

Eg. ① - 209.125.

Step 1. Convert into binary

2 209

2 104 1

2 52 0

2 26 0

2 13 0

2 6 1

2 3 0

1 1

1

$$.125 \times 2 = .250 \rightarrow 0$$

$$.250 \times 2 = .500 \rightarrow 0$$

$$.500 \times 2 = 1.000 \rightarrow 1.$$

$$(209.125) = (110100001.001)_2$$

Step 2: find out True exponent.

$$\begin{array}{r} 110100001.001 \times 2^0 \\ 1.10100001001 \times 2^7. \end{array}$$

Here 7 is True Exponent.

Step 3: Exponent = Bias component + True Exponent

$$\begin{aligned} &= 127 + 7 \\ &= 134 \end{aligned}$$

2 134

2 67 0

2 33 1  $(134)_{10} = (10000110)_2$

2 16 1

2 8 0

2 4 0

2 2 0

1 0

Step 4.

Sign bit      Exponent      Significant

|   |          |            |
|---|----------|------------|
| 1 | 10000110 | 1010001001 |
|---|----------|------------|

## 8086 Microprocessor.

## 80X86 family overview.

| chip / MP | Year    | Data Bus<br>(Bit) | Address Bus<br>(Bit) |
|-----------|---------|-------------------|----------------------|
| 4004      | 1971    | 4                 | 8                    |
| 8008      | 1972    | 8                 | 8                    |
| 8080      | 1974    | 8                 | 16                   |
| 8085      | 1977    | 8                 | 16                   |
| 8086      | 1978    | 16                | 20                   |
| 8088      | 1978-79 | 16                | 20                   |
| 80186     | 1982    | 16                | 20                   |
| 80286     | 1983    | 16                | 24                   |
| 80386 DX  | 1986    | 32                | 32                   |
| 80386 SX  | 1988    | 16                | 24                   |
| 80486 DX  | 1989    | 32                | 32                   |
| 80486 DX  | 1989    | 32                | 32                   |

## Features of 8086 MP

- ① It is an 16-bit microprocessor i.e 8086 MP can read or write or perform any arithmetic and logical operation on 16-bit data at a time.
- ② It is 40 pin IC or device.
- ③ It is capable of executing about 0.33 MIPS (Million Inst<sup>n</sup>/sec)
- ④ It has 16 bit data bus.
- ⑤ It has 20-bit address bus lines so 8086 can access upto  $2^{20} = 1\text{ MB}$  of memory.

- ⑦ At a time it can work with four 64-KB segment i.e CS, DS, ES and SS.
- ⑧ It reqd +5V supply
- ⑨ It works on max 10MHz clock freq & min 5MHz clock freq.
- ⑩ It contains 16 bit flag register.
- ⑪ It support multiprocessing
- ⑫ It can operate in single mode (minimum mode) and multiprocessor mode (maximum mode)
- ⑬ It provides 66 byte of Inst<sup>n</sup> queue for pipelining.
- ⑭ It has memory bank concept.
- ⑮ It has 256 s/w interrupts.

CS → IP

DS → BX/SI

ES → DI

SS → SP/BP

## Architecture of 8086 CPU V.N. Imp (10M)



8086 is a 16 bit microprocessor. It can perform 16 bit internal processing of data. It is internally divided into two independent functional unit i.e BIU

(Bus Interface Unit) and EU (Execution unit)

### Bus Interface Unit (BIU)

- It is the interface of 8086 to the outside world.
- It is responsible for all external bus operation
- It performs the following functions -
  - ① It generates 20 bit physical address for memory access.
  - ② Fetch instruction from memory.
  - ③ Transfer data to & from the memory & I/O
  - ④ Support pipelining using 6 byte instruction queue.

The main components of BIU are as follows-

#### A) Segment register -

##### ① CS (Code segment) -

- It holds base address of the program memory (i.e. code seg)
- It is multiplied by 10H (16d) to give 20 bit starting address of code segment

e.g. If CS = 1234H then CS  $\times$  10H

$$= 12340 \text{ H} \rightarrow \text{starting address of CS.}$$

## 2) DS (Data Segment) register:

- Hold base address for the memory data memory (i.e Data segment)
- It is multiplied by 10H to give starting address of data segment  
eg. DS = 1111b then  $DS \times 10H = 11110H$  will be starting address of data seg.

## 3) ES (Extra segment) register:

- Hold base address of Extra segment.

## 4) SS (Stack segment) register:

- Hold base address of stack memory (stack segment)

## (B) Instruction Pointer (IP)-

- It is a 16 bit register and it holds offset of code segment
- Address of code segment is calculated by  $(CS \times 10) + IP$

i.e If CS = 1000 and IP = 2222

$$= (\text{segment register} \times 10) + \text{offset register}$$

$$= (1000 \times 10) + 2222$$

$$= \boxed{12222\text{h}} \quad \begin{matrix} 20 \text{ bit physical address} \\ \text{of program} \end{matrix}$$

c) 20 bit physical address generation

mechanism:



(D) 6 byte prefetch Queue -

- It is a 6 byte FIFO RAM used to implement pipelining.
- The BIU fetches the next "six instruction Byte" from code segment into queue.
- To fetch  $inst^n$  bytes at least 2 bytes of queue must be empty "fetching the next instruction while executing the current  $inst^n$ " is called pipelining"
- Pipelining fails when a branch occurs
- During a branch the prefetch queue is cleared & the reload with the prefetched  $inst^n$  bytes from the new location

|    |    |    |    |    |    |    |
|----|----|----|----|----|----|----|
| F1 | E1 | F2 | E2 | F3 | E3 | F4 |
|----|----|----|----|----|----|----|

Non pipelined processor (i.e 8085)

|    |    |    |    |    |    |  |
|----|----|----|----|----|----|--|
| F1 | E1 | E2 | E3 | E4 | E5 |  |
| F2 | F3 | F4 | F5 |    |    |  |

pipelined processor (i.e 8086)

## 2) Execution Unit.

- It fetches instruction from the instruction queue.
- It performs arithmetic, logical, decision making operations.
- It sends request signals to BIU to access the external module.

The main component of EU are -

### (A) General purpose registers :

- 8086 has four 16 bit general purpose registers i.e AX, BX, CX and DX.
- These are available to programmer for arithmetic and logical operations.
- Each of these registers can be divided into two 8-bit register such as AH-AL, BH-BL, CH-CL and DH-DL.

### (B) Special purpose register

#### (i) Stack Pointer (SP)

- It is 16 bit register used to hold offset

address of the top of the stack in the stack segment.

- It is used by inst<sup>n</sup> like PUSH, POP, CALL, RET etc

#### (ii) Source Index (SI)

- It is 16 bit register used to hold offset address of data segment

#### (iii) Destination Index (DI) -

- It is 16 bit register used to hold offset of Extra Segment.

#### (iv) Base pointer (BP)

It is 16 bit register to hold offset address of stack memory.

#### (c) ALU:

It can perform 8 bit or 16 bit arithmetic and logical operation.

#### (d) operand

- It is a 16 bit register used to hold the operand or data temporarily

- It is not available for programmer.

### Pin configuration of 8086.

|      |    |    |                                         |
|------|----|----|-----------------------------------------|
| GND  | 1  | 40 | Vcc                                     |
| AD14 | 2  | 39 | AD15                                    |
| AD13 | 3  | 38 | AD16 / S3                               |
| AD12 | 4  | 37 | AD17 / S4                               |
| AD11 | 5  | 36 | AD18 / S5                               |
| AD10 | 6  | 35 | AD19 / S6                               |
| AD9  | 7  | 34 | BHE / S7                                |
| AD8  | 8  | 33 | MN / $\overline{MX}$                    |
| AD7  | 9  | 32 | $\overline{RD}$                         |
| AD6  | 10 | 31 | $\overline{RQ}/GTO$ (HOLD)              |
| AD5  | 11 | 30 | $\overline{RQ}/GT$ ; (HLDA)             |
| AD4  | 12 | 29 | $\overline{LOCK}$ ( $\overline{WR}$ )   |
| AD3  | 13 | 28 | $\overline{S}_2$ (M/ $\overline{I}_o$ ) |
| AD2  | 14 | 27 | $\overline{S}_1$ (DT/R)                 |
| AD1  | 15 | 26 | S0 ( $\overline{DEN}$ )                 |
| AD0  | 16 | 25 | Q <sub>S0</sub> (ALE)                   |
| NMI  | 17 | 24 | Q <sub>S1</sub> ( $\overline{INTA}$ )   |
| INTR | 18 | 23 | TEST                                    |
| CLK  | 19 | 22 | READY                                   |
| GND  | 20 | 21 | RESET                                   |

#### 1) CLK

- This is input line.
- An External clock generator IC 8284 provides the clock signal
- 8086 required, single phase 33% duty cycle.

#### 2) RESET (P no-21)

- This is Reset input line
- The 8284 clock generator provides it
- It clears flag reg & Instruction Queue

- It also clear DS, ES, SS and IP and set all bits
- Hence after Reset 8086 start from FFFF0H.

### 3) READY (P NO-22)

- This signal is also generated by 8284 clock generator and used to synchronize the MP with slower peripherals.
- If READY is low then MP cannot transfer data with peripherals, till it becomes high.

### 4) VCC (P NO-1 & 20, P. NO- 40)

- used for power supply.

### 5) ADO - AD15

- These are multiplexed bidirectional address/data Bus.
- During start of machine cycle it carries address (A<sub>0</sub> - A<sub>15</sub>) and in later part of cycle it carries data (i.e D<sub>0</sub> - D<sub>15</sub>)

### 6) MN/MX (P. NO-33)

- This is input signal to 8086.
- If the signal is high, 8086 is in minimum mode i.e IF uniprocessor mode & if the signal is Low 8086 is in maximum mode, i.e In multiprocessor mode.

### 7) TEST (P NO- 23)

- It is active LOW input line dedicated for 8087 co-processor.
- In maximum mode whenever the co-processor is busy it makes the pin HIGH.
- If TEST pin is High, the MP (8086) enter into a state it becomes low.

The min mode connected to GND.

A19/S6, A18/S5, A17/S4, A16/S3

- These are time multiplexed & status lines.

- During T<sub>1</sub> - state it carries upper 4 bit of Address bus.  
to identify memory segment

- During T<sub>2</sub> - T<sub>4</sub> state it carries status signal (S<sub>3</sub> to S<sub>6</sub>)

- Hex S<sub>5</sub> is interrupt enable signal.

S<sub>4</sub>      S<sub>3</sub>      Segment

0      0      ES

0      1      SS

1      0      CS or None

1      1      DS.

#### 9. INTR (Interrupt Request)

This is a maskable, level triggered, low priority interrupt.

#### 10. NMI INTA (Interrupt Acknowledge)

- It is active low output.

- When processor receive INTR signal, the processor complete current machine cycle & Acknowledge the interrupt by generating this signal.

#### 11. NMI (Non maskable Interrupt)

- This is non-maskable, edge triggered, High priority interrupt.

#### 12. BHE/S7 (Bus High Enable/Status)

- The bus High Enable signal is used to indicate the transfer of data over High order (D<sub>8</sub> to D<sub>15</sub>) data bus as shown in table shown below.

BHE

A<sub>0</sub>

|   |   |                                                                |
|---|---|----------------------------------------------------------------|
| 0 | 0 | Whole word from Both (Even & Odd) address                      |
| 0 | 1 | Upper byte from odd address (D <sub>8</sub> -D <sub>15</sub> ) |
| 1 | 0 | Lower byte from Even address (D <sub>0</sub> -D <sub>7</sub> ) |
| 1 | 1 | None                                                           |

- BHE in conjunction with A<sub>0</sub> determines whether a byte or word will transfer from or to memory.

### 3) ALE (Address Latch Enable)

- It will indicate the availability of address on AD<sub>0</sub>-AD<sub>15</sub> if ALE is High.
- When AD<sub>0</sub>-AD<sub>15</sub> have valid address (A<sub>0</sub>-A<sub>15</sub>) at that time this pin is High & when AD<sub>0</sub>-AD<sub>15</sub> have valid data (D<sub>0</sub>-D<sub>15</sub>) at that time this pin is Low.

### 4) WR (write)

- It is active low signal issued by the processor to write data to memory or I/O device.

### 5) RD (read)

This is active low output signal, when 8086 read data from memory or I/O device.

### 6) M/I<sub>O</sub>

- This is used by processor to distinguish memory & I/O device.
- When this signal is High, memory is accessed & when this signal is Low, the I/O device is accessed.

## E) FLAG register:



- It has 9 flags. These flags are of two types

6 status (condition) flag & 3 control flag

- The control flags are used to control certain operations.

### \* Status flag:

#### ① carry flag (CF):

It is set whenever carry out of MSB (i.e D7 bit), otherwise it is reset.

#### ② Parity flag (PF):

It is set if result has even parity (1's) otherwise reset.

#### ③ Auxiliary carry (AF)

It is set if carry is generated between D3 and D4 bit otherwise it is reset.

#### ④ zero flag (ZF):

It is set if result is zero  
otherwise it is reset.

### ⑤ Sign flag (SF):

It is set if result is negative otherwise it is reset.

### ⑥ Overflow flag (OF):-

It is set if result is too large otherwise it is reset.

#### \* Control flag

##### ① Trap flag (TF) :-

- It is used to set Trace mode ie start single stepping mode.
- It is used for Debugging purpose.

##### ② Interrupt flag (IF) :-

- It is used for Enable & disable the interrupt.

##### ③ Direction flag (DF) :-

- If it is set the SI and DI are in auto decrement mode during string operation otherwise in auto increment mode.



M | IO

RD

WR

Action.

1 0 1 Mem. Read

1 1 0 Mem. Write

0 0 1 I/O Read

0 1 0 I/O Write.

So, S, , 62

Addressing mode

1) Register Addressing mode

eg: MOV AX, BX.

2) Immediate addressing mode

eg: MOV AL, 02H

3) Direct Memory addressing mode

eg: MOV [5000], AL

4) Register Indirect Addressing mode

eg: MOV[SI], AL

5) Implied Addressing mode

eg: STC

## 8086 Minimum Mode.

crystal oscillator      Vcc



- 8086 works in minimum mode when  $MN/M\bar{X}=1$  (Vcc)
- In minimum mode 8086 is the only processor in the system
- As shown in ckt diagram clock, Ready & Reset signals are provided by the 8284 clock generator
- Address from the address bus latch into the 8282 (8 bit latch). Three such latches are needed as address bus is 20 bit
- The ALE for latch is given by 8086 itself
- The ALE of 8086 connected to strobe of 8282
- The databus is driven through 8286, 8 bit transceiver is required as data bus is 16 bit.

- Transreceivers are enable through  $\overline{DEN}$  signal while direction of data is controlled by the  $DT/R$  signal.
- Both  $\overline{DEN}$  &  $DT/R$  are given by 8086 itself
- $\overline{DEN}$  is connected to  $\overline{DE}$  &  $DT/R$  is connected to T of 8286.

| $\overline{DEN}$ | $DT/R$ | Action                     |
|------------------|--------|----------------------------|
| 1                | X      | 8286 (Transceiver) disable |
| 0                | 0      | Receive the data (Read)    |
| 0                | 1      | Transmit data (write)      |

- The various machine cycle is generated by the  $M/I\overline{O}$ ,  $\overline{RD}$ ,  $\overline{WR}$  signal.

| $M/I\overline{O}$ | $\overline{RD}$ | $\overline{WR}$ | Action       | Signal            |
|-------------------|-----------------|-----------------|--------------|-------------------|
| 1                 | 0               | 1               | Memory reads | $\overline{MRDC}$ |
| 1                 | 1               | 0               | Memory write | $\overline{MWTC}$ |
| 0                 | 0               | 1               | I/O reads    | $\overline{IDRC}$ |
| 0                 | 1               | 0               | I/O write    | $\overline{IOWC}$ |

- $M/I\overline{O}$ ,  $\overline{RD}$ ,  $\overline{WR}$  are given by 8086 itself
- DMA is done through HOLD & HLDA signal
- $\overline{INTA}$  is given by 8086 in response to INTR by 8086.

## Maximum mode of 8086 [10 m]



- 8086 works in maximum mode when  $MN/\bar{M}X = 0$  (Gnd)
- In max mode, there is atleast one more processor in the system besides 8086.
- As shown in ckt diagram clock, Ready, & Reset signals are provided by 8284 (clock generator)
- The most significant part of, max. mode is 8288 (Bus controller)
- Instead of 8086 the bus controller (8288) provides various control signals
- Address from the address bus latches into 8282 (8 bit address latch)

- Three such bus latches needed as address bus is 20 bit.
- The ALE for 8 bit Latch is given by 8288 Bus controller
- The ALE is connected to STB of 8282
- The data bus is driven through 8286 (Data Transceiver)
- Two such transceivers are needed, as the data bus is 16 bit.
- The transceivers are Enabled through the DEN Signal. while direction of data is controlled by DT/R
- Both DEN & DT/R are given by 8288 Bus controller
- DEN is connected to OE & DT/R is connected to T of 8286

| DEN | DT/R | Action        |
|-----|------|---------------|
| 0   | X    | 8286 disable  |
| 1   | 0    | Receive data  |
| 1   | 1    | Transmit data |

- The various machine cycles are initiated by 8288 bus controller using  $S_0$ ,  $S_1$  &  $S_2$ .

|     | $\overline{S_2}$ | $\overline{S_1}$ | $\overline{S_0}$ | Status            | 8288 Active output |
|-----|------------------|------------------|------------------|-------------------|--------------------|
| I/O | 0                | 0                | 0                | Interrupt Ack     | <u>INTA</u>        |
|     | (0)              | Read I/O port    |                  |                   | <u>IORC</u>        |
|     | (0)              | Write I/O port   |                  |                   | <u>IOWC</u>        |
|     | 0                | 1                | 1                | Halt              | None               |
|     | 1                | 0                | 0                | Instruction fetch | <u>MRDC</u>        |
|     | (1)              | Mem. Read        |                  |                   | <u>MRDC</u>        |
|     |                  | Mem. Write       |                  |                   | <u>MWTC</u>        |
|     | 1                | 1                | 1                | Inactive          | None               |

- DMA is done through  $\overline{RQ}/\overline{GT}$  lines of 8086.
- INTA is given by 8086, in response to INTR.

### Instruction set of 8086.

#### \* Arithmetic Instruction.

i) ADD / ADC destination, source

e.g. I] ADD AL, 25H ; AL  $\leftarrow$  AL + 25H

II] ADD BL, CL ; BL  $\leftarrow$  BL + CL

III] ADC CX, DX ; CX = CX + DX + carry

BE AE

CX = 1000H CX = 3001H

DX = 1000H DX = 2000H

carry = 1

carry = 0

2) SUB/SBB destination, source.

It's similar to ADD/ADC except that it does subtraction.

(SBB  $\Rightarrow$  Subtract with Borrow)

3) INC dest;

Add 1 to specified dest

I) INC SI

BE AE

SI = 2002 SI = 2003

II) INC BL

4) DEC dest

It is similar to INC except it does decrement by 1.

5) MUL source (8bit / 16 bit) - unsigned multiplication

- If source is 8 bit, it is multiplied with AL & result is stored in AX.

- MUL affects AF, PF, SF, ZF

e.g. I) MUL BL;  $AX = AL * BL$ .

II) MUL BX;  $DX - AX = AX * \cancel{BX}$ ; operation.

BE

AE

$AX \quad AX$

$AX * BX = \underline{\underline{0000}} \quad \underline{\underline{3333}}$

$AX = 2222H$

$DX = 0000$

$BX = 4444H$

$AX = 3333$

$DX = 7777H$

6) IMUL source (8 bit signed / 16-bit multiplication)  
same as MUL except, Here it can do  
Signed operation.

7) DIV source (unsigned division)

e.g.: I) DIV BL;

## operation :

$AX$  divide by  $BL$

$AL \leftarrow \text{Quotient}$

AH & Remainder

| II) DIV BX;     | <u>BE</u>  | <u>AE</u>  |
|-----------------|------------|------------|
| {DX : AX} / BX  | DX = 1111H | AX = 4444H |
| AX ← Quotient   | AX = 2222H | DX = 5555H |
| DX ← Remainder. | DX = 3333H |            |

8) IDIV source (signed division)

9) NEG destination

- This instruction forms the 2's complement of destination & source back to dest.

eg. NEG AL

### 10) CMP dest, source

- This instruction compare source with destination
  - The source & dest. must be of same size.
  - The result of this subtraction is NOT stored anywhere instead of flag bits are affected.
  - All condition flags are updated
- eg: `CMP CL, BL`.

### 11) CBW [Convert signed Byte to signed word]

- This instruction copies sign of byte into all the bits of AH

eg. `CBW`

BE

BBB

AE

$$AX = \text{xxxx xxxx} \underbrace{\text{001 0001}}_{\text{signed bit.}} \quad AX = 1111 1111 1000 0001$$

### 12) CWD (convert word to double word)

$$\begin{array}{ll} BE & AE \\ \text{AX} = \text{0011 0000 1111 0011} & \text{AX} = 1001 1000 1111 0011 \\ \text{DX} = \text{0xxxx xxxx xxxx xxxx} & \text{DX} = \text{HH HH HH HH} \end{array}$$

$$H8CDE = XA \quad HEDCBA = XA$$

SN on mixed language programming

- Here we need to call functions written in other language as mixed language programming
- Involve ~~direct calls to functions (procedure)~~ & macro.

$$\begin{array}{l} JA \\ HNSOI = XA \\ HACIT = XB \end{array}$$

$$\begin{array}{l} BE \\ HNEEI = XA \\ H.ACIT = XB \end{array}$$



Logical Group of Instruction.

1) NOT destination - operation 1's complement

eg. NOT AX

BE AE

$AX = 1234H$        $AX = EDCBH$

|      |      |      |       |
|------|------|------|-------|
| 0001 | 0010 | 0011 | 0100  |
| 1110 | 1101 | 1100 | 10011 |
| E    | D    | C    | B     |

2) AND dest, source

Logical bits bitwise AND

eg AND AX, BX

|              |              |
|--------------|--------------|
| BE           | AE           |
| $AX = 1234H$ | $AX = 1024H$ |
| $BX = 7126H$ | $BX = 7126H$ |

3) OR dest, source

Logical bitwise OR

eg OR AX, BX

HIBE AE HSCD = XA

AX = 1234H AX = 1024H AX = 7336H

BX = 7126H BX = 7126H RA2

4) EXOR dest, source

Logical bitweise EXOR ESID = XA

eg EXOR AX, BX

BE AE

AX = 1234H AX = 6312H

BX = 7126H BX = 7126H

5) TEST dest, source

- Logical Bitwise ANDing

- But it will change content of destination reg & only affect condition flag like AND instruction.

shift and Rotate Instruction.

1) SHL / SAL (shift logical / Arithmetic Left)

eg SHL AX, 01H

SHL dest, count

1100 BE 0100 AE 00 0011 0

AX = C123H AX = 8246H

1110 1100 0010 0100 0010 0010 0011

|   |      |      |      |      |
|---|------|------|------|------|
| 1 | 1000 | 0010 | 0100 | 0110 |
| 8 | 2    | 4    | 6    |      |

2) SHR (Shift Right) SHR dest, count

eg SHR AX, 01

BE AE

AX = C123H

AX = 8247H

1100 0001 0010 0011  
0010 0000 0001 0001

1100 0001 0010 0011  
0010 0000 0001 0001

3) SAR (Shift Arithmetic Right)

SAR dest, count

BE

AE

AX = C123H

AX = E091H

11 00 0001 0010 0D11

BE

11 00 0001 0010 0D11  
0010 0000 0001 0001 → 11  
E AX = C123H

Rotate

1) ROL (Rotate Left)

ROL dest, count



BE

AE

(Ans) AX = C123H

AX = 8247H

0 1100 0001 0010 0001

1 1000 0001 0001 0100 0111

8 2

4

7

→ 0 1100 0010 0100 0001 1

→ ROR (Rotate right)

ROR dest, count

BE

AE

BE

$AX = C123H$

$AX = E091H$

BE



AE

1110 0000 1001 0001

1

1E00 1001 0090 0110

RCL (Rotate left with carry)

RCL dest, count



BE

AE

$AX = C123H$

$AX = 8246$

( $CY = 0$ )

0

1100 0001 0010 0011

1

1000 0010 0100 0110

8 2

4

6

RCR (rotate right with carry)

RCR dest, count

BE EA AE

AX = C123H = XA

CY = 0 CY = 1



Processor control instruction.

STC (Set carry flag)

CLC (clear carry flag)

CMC (complement carry flag)

CLD (clear direction flag)

STD (set interrupt flag)

STI (set interrupt flag)

CLI (clear interrupt enable)

NOP (No operation)

## Computer Memory Organization

Q7 Explain memory hierarchy in detail.

There are different memory system, which have different storage capacity & different level of performance. They are organized to form a multilevel hierarchy of storage device as shown in figure.



Fig: Memory Hierarchy

The different memory types involved in this hierarchy is as follows

- 1) CPU register (Internal processor memory)  
There are small sets of High Speed register, flip-flop, which have operating speeds of computer and such High Speed memory together with ALU & control unit integrated into single chip & termed as CPU. They usually form a general purpose register (GPR's) for storing data as it is processed.

They form scratch pad memory  
eg. in 8086 its AX, BX, CX, DX etc.



### 2) Cache memory.

Most computer now have another level of IC called cache memory which is positioned logically between CPU & main memory. A cache memory storage capacity is lesser than main memory & cache memory but faster than main memory. Some cache memory resides on same IC called L1 cache and some reside other than CPU called L2 & L3 cache.

### 3) Main memory (Primary memory)

CPU directly access their location in the main memory. These are used for program and data storage during operation. It is read/write volatile memory. Compare to CPU register access speed is slower (but it is

Date: \_\_\_\_\_  
Page: \_\_\_\_\_

larger than cache memory and fact that it is separated from CPU.

#### 4) Secondary memory

This is much larger in size but slower than main memory and used for permanent storing prog & large data files, which are required by CPU. This data is transferred to main memory when CPU required. Because CPU cannot access secondary memory (Hard disk) directly.

#### 5. Explain different memory characteristics.

Ans. The important characteristics of any memory can be based some parameters as follows.

##### 1) Location of memory -

① CPU - The memory that have storage, placed inside the CPU is registers & High speed Cache (Level - 1) memory. It is read-write volatile memory.

##### ② Primary memory - (Internal memory)

The main memory is called as primary memory. It is next to cache memory & It is called as read-write volatile memory.

### ③ Secondary (External) memory-

External memory consists of peripheral storage devices like HDD. It is next to main memory & It is non-volatile memory.

#### 2) Capacity (size):

The capacity or size of memory is typically expressed in terms of number of bytes. External memory is typically huge & it is expressed in forms of terms of megabyte & gigabytes & nowadays in Tera bytes.

#### 3) Unit of Transfer

For internal memory the unit of data transfer is equal to number of data lines into & out of data module (i.e D<sub>0</sub> to D<sub>n</sub>)

The units of data transfer may be of 1-bytes, 2-bytes, 4-bytes etc at a time (or in one cycle) depends on data lines of CPU and I/O device.

#### 4) Access method.

There are different types of access method used to access a particular memory location from memory. These are as follows.

### Random Access

If storage location access in any order and access time is independent of the physical location being accessed i.e to access any memory location of memory it takes same amount of time eg. RAM

### Serial Access

Memories whose storage location can be accessed in certain pre-determined sequence is called serial access memory. In serial access memory, the location are accessed by continually moving storage location. Here time required to access any location depends on position of memory location.

### Semi-random access

Storage devices like Hard~~ware~~ disk, floppy disk & CD ROM contains many rotation circular storage tracks. Here track can be access randomly but access within each track is serial. In such case access mode is semi random.

## 5) Performance measure.

which measures speed of memory & parameters are as follows.

### - Access time -

This is the time from the instant that address is presented to memory to the instant that data have is available for use.

### - Memory cycle time

This consist of access time plus any additional time required before second access can commence.

### - Transfer rate

This is the rate at which data can transferred into or out of a memory.

## 6) Physical type - (Physical classification)

### Memory



### - RAM (Random Access memory)

It is read-write access memory. It is available volatile memory. It is two type ie SRAM & DRAM.

## SRAM

(Static Ram)

1. It is faster than DRAM.

## DRAM

(Dynamic RAM).

It is slower than SRAM.

2. It is built-up of flip flop & transistors

It is built up for

capacitors & mosfet.

3. Refreshing ckt is not required.

Refreshing ckt is not required.

4. Cost is more

Cost is less.

5. Used for implementing cache memory.

Used for implementing main memory.

6. It consists of more no. of components.

It consists of less no. of components.

7.



Bit lines

## ROM (Read only memory)

It is used for permanent storage. It is a non-volatile memory. It can be PROM, EPROM, EEPROM.

## PROM (Programmable ROM)

The contents of ROM cannot be altered once it is write. But it is less expensive. It is non volatile ~~exa~~ eg. toys, which contain ROM.

## EPROM (Erasable PROM)

Here write operation is also possible before write operation all storage cell must be erased by using ultra-violet radiation light. Here selective ~~as~~ erasing is not possible. It is less expensive than EEPROM.

## EEPROM (Electrically EPROM)

It is similar to EPROM. Here by passing voltage we can erase data. Here selective is possible & cost is more than EPROM.

## 7) Physical characteristics.

volatile / non-volatile      erasable / non-erasable.

## Page Replacement techniques: (Cache memory)

### ① FIFO (First in First Out)

| Time  | 1    | 2   | 3    | 4    | 5    | 6    | 7   | 8    | 9   | 10   | 11   | 12   |
|-------|------|-----|------|------|------|------|-----|------|-----|------|------|------|
| page  | 2    | 3   | 2    | 1    | 5    | 2    | 4   | 5    | 3   | 2    | 5    | 2    |
| frame | 2    | 2   | 2*   | 2    | 5    | 5    | 5*  | 3    | 3   | 3    | 3    | 3    |
|       | 3    | 3   | 3    | 3    | 2    | 2    | 2   | 2    | 2*  | 5    | 5    | 5    |
| miss  | miss | hit | miss | miss | miss | miss | hit | miss | hit | miss | miss | miss |

$$\text{Hit ratio} = \frac{\text{Hit}}{\text{Hit+Miss}} \times 100 = \frac{3}{12} \times 100 = 25\%$$

$$\text{page fault} = 9$$

### ② LRU Algorithm (least recently used)

| Time  | 1    | 2   | 3    | 4    | 5   | 6    | 7   | 8    | 9    | 10  | 11  | 12  |
|-------|------|-----|------|------|-----|------|-----|------|------|-----|-----|-----|
| page  | 2    | 3   | 2    | 1    | 5   | 2    | 4   | 5    | 3    | 2   | 5   | 2   |
| frame | 2    | 2   | 2*   | 2    | 2   | 2*   | 2   | 2    | 3    | 3   | 3   | 3   |
|       | 3    | 3   | 3    | 5    | 5   | 5    | 5*  | 5    | 5    | 5*  | 5   | 5   |
| miss  | miss | hit | miss | miss | hit | miss | hit | miss | miss | hit | hit | hit |

$$\text{Hit ratio} = \frac{5}{12} \times 100 = 0.4167 \times 100 = 41.67\%$$

3) Optimal.

| Time | 1 | 2 | 3  | 4 | 5 | 6  | 7 | 8 | 9  | 10 | 11 |
|------|---|---|----|---|---|----|---|---|----|----|----|
| Page | 2 | 3 | 2  | 1 | 5 | 2  | 4 | 5 | 3  | 2  | 5  |
|      | 2 | 2 | 2* | 2 | 2 | 2* | 4 | 4 | 4  | 4  | 4  |
|      | 3 | 3 | 3  | 3 | 3 | 3  | 3 | 3 | 3* | 2  | 2  |
|      | 1 | 5 | 5  | 5 | 5 | 5* | 5 | 5 | 5* | 5  | 5  |
| M    | M | H | M  | M | H | H  | M | H | H  | M  | H  |

$$\text{Hit ratio} = \frac{6}{12} \times 100 = 50\%$$

Eq 1)

| 1 | 2 | 3 | 4 | 5  | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|----|---|---|---|---|----|----|----|----|----|----|----|----|----|----|----|
| 7 | 0 | 1 | 2 | 0  | 3 | 0 | 4 | 2 | 3  | 0  | 3  | 2  | 1  | 2  | 0  | 1  | 7  | 0  |    |
| 7 | 7 | 7 | 2 | 2  | 2 | 2 | 4 | 4 | 4  | 0  | 0  | 0  | 0  | 0  | 0  | 0* | 0  | 7  | 7  |
| 0 | 0 | 0 | 0 | 0* | 3 | 3 | 3 | 2 | 2  | 2  | 2  | 2* | 1  | 1  | 1  | 1* | 1  | 0  |    |
| 1 | 1 | 1 | 1 | 0  | 0 | 0 | 3 | 3 | 3  | 3* | 3  | 3  | 2  | 2  | 2  | 2  | 2  | 2  |    |
| M | M | M | M | H  | M | M | M | M | M  | M  | H  | H  | M  | M  | H  | H  | M  | M  |    |

$$\text{Hit} = \frac{5}{20} = 25\%$$

Page fault = 15

|   |   |   |   |   |   |    |   |   |   |   |    |   |   |   |   |   |   |   |
|---|---|---|---|---|---|----|---|---|---|---|----|---|---|---|---|---|---|---|
| 7 | 7 | 7 | 2 | 2 | 2 | 2  | 4 | 4 | 4 | 0 | 0  | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0* | 0 | 0 | 3 | 3 | 3* | 3 | 3 | 3 | 0 | 0 | 0 | 0 |
| 1 | 1 | 1 | 1 | 3 | 3 | 3  | 3 | 2 | 2 | 2 | 2  | 2 | 2 | 2 | 2 | 2 | 2 | 7 |
| M | M | M | M | H | M | H  | M | M | M | M | H  | H | H | H | M | H | M | H |

$$\text{Hit} = \frac{8}{20} = 40\%$$

Page fault = 12

|   |   |   |   |   |   |    |   |    |   |    |   |    |   |   |    |    |   |   |   |
|---|---|---|---|---|---|----|---|----|---|----|---|----|---|---|----|----|---|---|---|
| 7 | 7 | 7 | 2 | 2 | 2 | 2  | 2 | 2* | 2 | 2  | 2 | 2* | 2 | 2 | 2* | 2  | 2 | 7 | 7 |
| 0 | 0 | 0 | 0 | 0 | 0 | 0* | 4 | 4  | 4 | 0  | 0 | 0  | 0 | 0 | 0  | 0* | 0 | 0 |   |
| 1 | 1 | 1 | 3 | 3 | 3 | 3  | 3 | 3* | 3 | 3* | 3 | 1  | 1 | 1 | 1* | 1  | 1 |   |   |
| H | M | H | M | H | H | H  | M | H  | H | M  | H | H  | H | H | M  | H  | M |   |   |

$$\text{Hit} = \frac{11}{20} = 55\%$$

Page fault = 9

## Module 6. I/O organization.



Explain different I/O data transfer techniques.

There are three I/O data transfer techniques.

### ① Programmed I/O

- It is a useful method for computer where hardware cost needs to be minimized.
- Entire I/O is handled by CPU with the help of small software without any additional Hardware.
- It is based on the concept of busy waiting before I/O is performed.
- CPU constantly checks the status of I/O devices.
- If I/O device is not ready, CPU simply waits in the loop for the I/O device to become ready.
- CPU performs the following steps.
  - ① Read the I/O device status bit
  - ② Test the status bit to determine the device is ready to begin data transfer operation.

3 If device is not ready, return to step ① otherwise proceed with data transfer

Advantages :-

1. It is very simple technique
2. It is cost effective technique.

Disadvantages:

1. Max time CPU has to wait for I/O device hence ~~very~~ very time consuming technique.

Flowchart: technique transfer a block of data using programmed I/O.



Interrupt driven I/O

- Major draw back of programmed I/O is busy waiting since in the programmed I/O, CPU has repeatedly check whether is a device is free or not hence performance of CPU goes down.
- To overcome the problem of programmed I/O, the interrupt driven I/O is used.
- Here CPU switches to other program without waiting for the I/O devices to become free. When device is ready or free, it comes back to the CPU through mechanism known as interrupt.

Steps followed by Interrupt driven I/O

1. CPU issue READ command to I/O device & then switch to some other program
2. Once I/O device is ready or free it sends interrupt to CPU.
3. When interrupt from I/O device comes CPU suspends execution of current (another) program & READ data from Input device & then resume the execution of suspended program.

flowchart : transfer a block of data using interrupt driven I/O



## SN OF DMA [Direct Memory Access]

- There are some drawbacks of programmed I/O & interrupt driven I/O. Both required participation of the CPU to transfer data between memory & devices. Hence the CPU becomes busy in performing I/O transfer.
- To transfer large block of data at high speed without or less CPU intervention, an alternative method can be used. A special controller is provided to allow transfer of block of data directly between I/O devices & Main memory.

Block diagram of DMA.



Fig: working of DMA

## DMA working:

DMA data transfer proceeds as follow:

1. The CPU initially load the DMA register i.e Address register & data count with their initial values. where address reg. contains number of bytes to be transferred to or from memory & Address reg. contain starting memory location of data transfer
2. The I/O device need the service, generate request on any one channel on DMA.
3. Then DMA request for that I/O device to CPU.
4. CPU then surrenders the control of address & Data bus to DMA & send acknowledgement to DMA.
5. Now the DMA controller transfer directly to & from main memory & after every byte of transferred address reg. increment by one & data count reg. decrease by one.
6. If data count reg. value reaches to zero then DMA return back address & data bus to CPU & generate interrupt.

## SN on I/O module

- Input /output module provides a method for transferring information between internal storage & external I/O devices.

- Peripheral connected to computer need special communication link to resolve the difference that exist between the computer & each peripherals.

- The major differences are -

1. peripherals are electromechanical & electromagnetic devices & their manner of operation is different from the operation of the CPU & memory, which are electric electronic devices. Therefore, a conversion of signal may be req'd.
2. The data transfer rate of peripheral peripherals is slower than CPU, hence synchronization mechanism req'd.
3. Data codes & formats in peripherals are differ from the word format in CPU & memory. Hence I/O module:

  - Interface to CPU & memory.
  - Interface one or more peripherals.

\* various functions of I/O modules.



### 1. Control & timing :

- The co-ordinate the flow of traffic between resources & external devices.
- To transfer data from external bus devices to processor involve the following sequence of steps:
  - a. The processor check status of attached device (I/O device)
  - b. The I/O module return the device status.
  - c If device is ready to transmit, the processor transfer the data by means of I/O module

### 2. CPU communication :

- CPU comm<sup>n</sup> involve following.
  - a. Data - Data are exchanged between the processor & I/O module over data bus.
  - b. Status reporting - Because peripherals are so slow, it is important to know the status & I/O module.
  - c. Address recognition - Just as each word of memory has an address - So it is Thus an I/O module must recognize unique address of each peripherals.

### 3. Device communication.

- The I/O module must be able to perform device communication - This comm<sup>n</sup> involves

commands, status info & data.

#### 4. Data Buffering.

- Rate of data transfer to/from CPU is normally higher than I/O device hence I/O module buffers data so that CPU & I/O device send/receive at its rate.

#### 5. Error detection

Must detect & correct or report error

- Types of error during
- Data error, transmission
- Mechanical malfunction

M3 - Q. Explain pipelining process concept & different stages of pipeline.

- Pipelining is an effective way of organizing parallel activity in computer system. The actual fetching of inst<sup>n</sup> from memory is a major bottleneck in instruction speed.
- computers have ability to fetch inst<sup>n</sup> from memory in advanced so that they would be there when they are needed.
- These inst<sup>n</sup> were stored in a set of register called "prefetch" buffer or "Instruction Queue"
- In effect pre-fetching dividing inst<sup>n</sup> execution in many parts & each one is handled by separate unit, all of which

run parallel & each part is called stage  
- Pipeline stages.

M stage instructions pipeline can overlay the processing of upto M - inst<sup>n</sup>. So it is desirable to use more than 2 stages to maximize inst<sup>n</sup> throughput.

Examples:

1. Three stage pipeline

In 3 stage pipeline, one inst<sup>n</sup> has three units.



Stage-1, Stage-2, Stage-3

stage-1 : Inst<sup>n</sup> fetch unit, it fetch inst<sup>n</sup> from memory & placed it in buffer or queue unit if needed.

Stage 2 : Inst<sup>n</sup> Decode unit, it decode the inst<sup>n</sup> & determine its type and what operand (data) it need.

stage 3 : It actually execute the inst<sup>n</sup> with the help of ALU.

| Stage | Instruction    |                |                |                |                |                | → Time |
|-------|----------------|----------------|----------------|----------------|----------------|----------------|--------|
| 1     | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>4</sub> |                |                |        |
| 2     |                | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>4</sub> |                |        |
| 3     |                |                | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>4</sub> |        |

Pipelining - fetching the next inst<sup>n</sup> while decoding & executing current inst<sup>n</sup> is called as pipelining.

- During time unit (t1), fetch - 1 from mem.
- During time unit (t2), stage - 2 decode I<sub>1</sub> & stage - 1, fetch I<sub>2</sub> from memory.
- During time unit (t3), stage 3 execute I<sub>1</sub>, stage 2 decodes I<sub>2</sub> and stage 1 fetch I<sub>3</sub>. and so on.

## 2. Five stage pipeline:



Stage 1

Stage 2

Stage 3 - Read operands (data) either from memory or register

Stage 4 -

Stage 5 - It write result back to proper register or address.

## Program of 8086.

a. Addition of 8 bit number./16 bit

code segment

assume CS:code

Start:

MOV AL, 12h / MOV AX, 1111h

MOV BL, 64h / MOV BX, 2222h

ADD AL, BL / ADD AX, BX.

End Start

Code ends.

b. data segment

first DW 6666h

second DW 9999h

data ends

Code segment

assume CS:code, DS:data,

Start:

MOV AX, data

MOV DS, AX

MOV AX, first

MOV BX, second

ADD AX, BX

end start

code ends.

Q. Explain different pipeline Hazards

→ In general there are three major difficulties that cause the information instruction pipeline to deviate from its normal operation.

a. Resource conflict.

b. Data Hazards (Data dependency conflict)

c. Branch difficulties.

a. Resource conflict :- A resource conflict is a competition of the same resource at the same time. Example of resources includes memories, cache, buses, ports, functional unit like ALU.

- To understand concept of resource conflict let us take eg

$$\text{eg: } x = y + z$$

Above equation is executed by following instruction.

$$I_1 : R_1 \rightarrow y$$

$$I_2 : R_2 \rightarrow R_1 + z$$

$$I_3 : x \rightarrow R_2$$

| Time | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|------|---|---|---|---|---|---|---|---|
|------|---|---|---|---|---|---|---|---|

|                |    |    |    |    |  |  |  |                    |
|----------------|----|----|----|----|--|--|--|--------------------|
| I <sub>1</sub> | FI | DI | OF | EX |  |  |  | R <sub>1</sub> ← y |
|----------------|----|----|----|----|--|--|--|--------------------|

|                |  |    |    |    |    |  |  |                                     |
|----------------|--|----|----|----|----|--|--|-------------------------------------|
| I <sub>2</sub> |  | FI | DI | OF | EX |  |  | R <sub>2</sub> ← R <sub>1</sub> + z |
|----------------|--|----|----|----|----|--|--|-------------------------------------|

|                |  |   |   |    |    |    |    |                    |
|----------------|--|---|---|----|----|----|----|--------------------|
| I <sub>3</sub> |  | x | x | FI | DI | OF | EX | x ← R <sub>2</sub> |
|----------------|--|---|---|----|----|----|----|--------------------|

As we see,

- During clock cycle 3, I<sub>1</sub> is fetching operand (OF) and no other instrn can access memory

during cycle 3 & same with I2.

- Inst<sup>n</sup> - 3 (I3) is delayed by 2 cycles as it cannot fetch Inst<sup>n</sup> as memory is being accessed by other Inst<sup>n</sup>.
- Thus resource dependency can deteriorate overall performance of pipeline execution.
- Above problem can be solved by using separate Inst<sup>n</sup> & data memory.

#### b. Data Hazards (Data dependency conflict)

- It arises when an Inst<sup>n</sup> depends on result of previous Inst<sup>n</sup>, but this result is not yet available.
- for eg. I1 :  $D = B + C$   
I2 :  $X = D + 2$

Here Inst<sup>n</sup> 1 modifies the variable - D, Inst<sup>n</sup> - 2 uses the value of variable - D. Thus, there is data dependency between two Inst<sup>n</sup> due to variable - D and both Inst<sup>n</sup> simultaneously in pipeline cannot be executed.

#### c. Branch difficulties

- In the absence of branch Inst<sup>n</sup> all the four stages operate on different Inst<sup>n</sup>.
- In case of Branch Inst<sup>n</sup> as it is decided after decode address in clock cycle the transfer from Inst<sup>n</sup> fetch (IF) to decode Inst<sup>n</sup> (DI) of Inst<sup>n</sup> is halted until the execution of branch Inst<sup>n</sup> is completed.

BCD addition.

MOV AX, 9999H

MOV BX, 9999H

ADD AL, BL

DAA

ADC AH, BH

DAA

Six Stage application



stage 1      stage 2      stage 3      stage 4      stage 5      stage 6.

stage 1      ad      ms      planning      hard

stage 2

Stage 3 - It will calculate effective address of  
operand (data)

stage 4

stage 5

stage 6.

| Instn | time | 1  | 2  | 3  | 4     | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 |
|-------|------|----|----|----|-------|----|----|----|----|----|----|----|----|----|----|
| I1    | IF   | ID | OF | EX |       |    |    |    |    |    |    |    |    |    |    |
| I2    |      | IF | ID | OF | EY    |    |    |    |    |    |    |    |    |    |    |
| I3    |      |    | IF | ID | OF    | EX |    |    |    |    |    |    |    |    |    |
| I4    |      |    |    | IF | ..... |    | IF | ID | OF | EX |    |    |    |    |    |
| I5    |      |    |    |    |       |    |    | IF | ID | OF | EX |    |    |    |    |

Previous Instn is flushed from pipeline  
New instn branch location is fetch.

IF branch taken (True) a new instn is fetched from new location in next cycle  
If branch not taken (False) the instn fetched previously can be used.

### Q. Exp flynn's classification in detail.

- Various computer classification scheme have been formed according to architecture of parallel computer.
- In 1966, Michael J. Flynn based his classification scheme on the multiplicity of Inst<sup>n</sup> stream & data streams in computer.
- On this basis digital computers may be classified into four categories.
- The term stream is used to denote a sequence of items (Inst<sup>n</sup> or data) as

executed or operated upon by single processor.

- Instruction stream is a sequence of Inst<sup>n</sup> as executed by machine.

- Data stream is a sequence of data including input partial or temporary results called for previous result of Instruction.

- Categories of Flynn's are:

1. SISD (single Instruction stream - single data stream)

2. SIMD (single Inst<sup>n</sup> stream - multiple data stream)

3. MISD (multiple " " - single " ")

4. MIMD (multiple " " - multiple " ")

Conceptually only three types. System components are needed. control unit (CU), processing unit (PU) & memory module (MM).

- The Inst<sup>n</sup> & data are fetched memory module are decoded by control unit & further sends to processing unit for execution.

1. SISD computer organization.

Most serial computers are of SISD type.





- In SISD inst<sup>n</sup> are executed serially
- They can be overlapped their execution
- Stage with Fetch stage like 8086
- Inst<sup>n</sup> fetched from memory module that is decoded by control unit to generate decoded inst<sup>n</sup> stream. The processing unit like ALU performs calculation on this. Inst<sup>n</sup> stream & data stream fetch to & from memory.

### 2. SIMD computer organization.



- This category corresponds to array processor
- There are multiple processing unit supervised by single control unit.
- All processing unit receive same instruction broadcast from control unit but operated on different data sets
- The shared memory subsystem may contain multiple module.

### 3. MISD Computer Org<sup>n</sup>

- There are  $n$ - processing unit can receiving distinct inst<sup>n</sup> stream over same data stream.
- The result of one processor (PU) passed to next processor in micropipes.
- The concept of this organization is not implemented.
- This is also known as 'systolic Array'



### 4. MIMD Computer Organization.



fig: MIMD computer org<sup>n</sup>

- Most multiprocessor system classified in this categories now-a-days
- It is also called as multiple SISD organization.

### Module 3 Differentiate RISC v/s CISC [5-10 m]

control

~~BLUET~~

~~BLUET~~

unit

RISC → Reduced Instruction Set Computer

CISC → complex Instruction Set Computer

characteristics

RISC

CISC

1. No of instructions

More

Less

2. No of addressing modes

More (support Complex AMI)

Less

3. Instruction set

Varies

fixed.

4. Instruction length  
(format)

1, 2, 3 or 4 bytes

4 bytes

5. Instruction decoding

slow

fast

6. NO of registers

few (10 to 12) More (50 to 100)

7. pipeline

Not pipelined

Highly

or less pipelined

pipelined

8. Inst<sup>n</sup> execution speed

slowly depends on complexity of Inst<sup>n</sup>

High or

medium

9. NO of clocks req<sup>d</sup> / Inst<sup>n</sup>

1 or less than 2 to 15 CPI

1.5 CPU ↗

10. concept of Inst<sup>n</sup>

combined (i.e Register Register or mem. reference) oriented Inst<sup>n</sup>

|                                                   |                    |                  |
|---------------------------------------------------|--------------------|------------------|
| 11. Memory Access                                 | Frequently         | Rarely           |
| 12. compiler                                      | Simpler            | complicated      |
| 13. Inst <sup>n</sup> Execution<br>(control Unit) | By microprogrammed | By Hardwired     |
| 14. Example                                       | ARM 8051           | 8085, 8086.      |
|                                                   | Intel i860         | Intel x86 family |
| 15. cost                                          | Less               | More             |

DIFF bet Hardwired & microprogrammed CPU

|                                                          | Hardwired control Unit | Microprogrammed Control Unit |
|----------------------------------------------------------|------------------------|------------------------------|
| 1. Implementation Approach                               | Sequential circuit     | Microprogramming             |
| 2. Decoding & Sequencing Logic                           | complex                | Easy.                        |
| 3. Speed of Execution                                    | fast                   | slow                         |
| 4. Cost of Implementation                                | More                   | Less                         |
| 5. Ability to handle difficult Complex Inst <sup>n</sup> |                        | easier                       |
| 6. Application                                           | RISC processor         | CISC processor               |
| 7. Inst <sup>n</sup> Set size                            | Less                   | More.                        |

|                                                                         |                                                                                     |                                                         |
|-------------------------------------------------------------------------|-------------------------------------------------------------------------------------|---------------------------------------------------------|
| 8. Control memory                                                       | Absent                                                                              | present                                                 |
| 9. Control function                                                     | Implemented by Hardware                                                             | Implemented by S/W                                      |
| 10. Flexibility                                                         | Not flexible,<br>difficult to modify inst <sup>n</sup><br>for new inst <sup>n</sup> | Flexible, New<br>inst <sup>n</sup> can be easily added. |
| 11. Ability to support very difficult different OS & diagnostic feature | Easy.<br>(unless anticipated during design)                                         |                                                         |

Explain design of control unit with respect to Hardwired & microprogrammed (softwired) approach.

To execute inst<sup>n</sup>, a computer processor must generate a control signal in proper sequence. This sequence of action can either by processor's software or by hardware. There are 2 different approaches used for control unit design.

1. Hardwired control Unit
  2. Microprogrammed control Unit.
1. Hardwired Control Unit
- A hardwired control unit is a combination circuit which gets a set of input & transform

them into set of control signals

- Hardwired control unit provide Highest speed
- RISC's are implemented with Hardwired CU.
- If the instruction is complex, implementing Hardwired CU is difficult in the case microprogrammed CU are used.
- The CU are fixed logic circuit to interpret inst<sup>n</sup> & generate control signal for that.



Fig: Hardwired control unit organization

It has different elements

1. Clock : With the help of clock the control unit maintain time. This is referred as processor cycle time.

2. Instruction Register (IR) : It store opcode of current inst<sup>n</sup> the CU make the use of this opcode and will perform diff. action for different action.

3. Instruction Decoder (ID): As usual the IR is input to decoder for any instruction loaded by IR , one of the output lines from  $INS_1$  to  $INS_n$  is activated.

4. Step decoder : It generate signal for each step or time slot in control sequence

5. Control Step Counter: It is run by clock so the counter counts the clock pulse or time slot.

6. Status flag & condition code: Like start, end, carry, zero flag, sign flag etc provide by status & condition code.

7. Encoder: All input signal to the encoder block should be ~~considered~~ combined to generate different control signals.

## Microprogrammed control unit.



The microprogrammed CU function are as follows:

1. To execute an instruction, the sequencing logic unit issue a Read command to control memory.
2. the instruction whose address is specified by CMAR is Read by CM & stored in CBR
3. The CBR used to generate control signals & Next Address information for sequencing logic unit.

4. The sequencing logic units new address into CMAR based on next address information sent by decoder -2 & ALU flags.

5. Decoder-1 translate opcode of IR into control memory address reg. This decoder used for horizontal micro instruction.

6. Decoder 2 used for vertical micro ~~processes~~ instruction & it generate different control signal for micro instruction.

• Exp different methods to design hardwired control Unit [10 m]

Methods to design Hardwired CU are as follows

1. state table method.
2. Delay Element method.
3. sequence counter method.
4. state table method.

In this approach it is necessary to construct a state table for the control unit. The state table for this implementation derived from the flow chart directly, like any finite state machine representation

- finite state machine are of two types.

- (A) Mealy type
- (B) Moore type

### A Mealy type.

| State          | Input                               |                                     |     |                | IN                                  |
|----------------|-------------------------------------|-------------------------------------|-----|----------------|-------------------------------------|
|                | I <sub>1</sub>                      | I <sub>2</sub>                      | ... | I <sub>m</sub> | S <sub>1,m</sub> , O <sub>1,m</sub> |
| S <sub>1</sub> | S <sub>1,1</sub> , O <sub>1,1</sub> | S <sub>1,2</sub> , O <sub>1,2</sub> |     |                | S <sub>2,m</sub> , O <sub>2,m</sub> |
| S <sub>2</sub> | S <sub>2,1</sub> , O <sub>2,1</sub> | S <sub>2,2</sub> , O <sub>2,2</sub> |     |                |                                     |
| S <sub>3</sub> | :                                   | :                                   |     |                |                                     |
| :              | :                                   | :                                   |     |                |                                     |
| S <sub>n</sub> | S <sub>n,1</sub> , O <sub>n,1</sub> | S <sub>n,2</sub> , O <sub>n,2</sub> |     |                | S <sub>n,m</sub> , O <sub>n,m</sub> |

Table : Mealy type state table

Here the rows of the state table represent internal state of the machine. These states are determined by information stored in machine at various time unit.

- The column represent combination applied to machine (I<sub>j</sub>)
- The entry of row S<sub>i</sub> and column I<sub>j</sub> from S<sub>i,j</sub> and O<sub>i,j</sub>
- Where S<sub>i,j</sub> is the next state of machine that result from the application of input combination I<sub>j</sub>
- Where O<sub>i,j</sub> denotes the output signal appears whenever machine is in the state S<sub>i</sub> with Input I<sub>j</sub> applied.

### B. Moore type :

| state          | I <sub>1</sub>   | I <sub>2</sub>   | ... | I <sub>m</sub>   | output         |
|----------------|------------------|------------------|-----|------------------|----------------|
| S <sub>1</sub> | S <sub>1,1</sub> | S <sub>1,2</sub> |     | S <sub>1,m</sub> | O <sub>1</sub> |
| S <sub>2</sub> | S <sub>2,1</sub> | S <sub>2,2</sub> |     | S <sub>2,m</sub> | O <sub>2</sub> |
| :              | :                | :                |     | :                | O <sub>n</sub> |
| S <sub>n</sub> | S <sub>n,1</sub> | S <sub>n,2</sub> |     | S <sub>n,m</sub> |                |

## 2. Delay Element method:

the control signals or groups of control signals from the control units are activated in a proper sequence. The control unit using delay elements can be made from flowchart that specifies required control signal sequence. These are simple rule to derive control circuit from the flowchart.

A. Rule 1: Two successive micro-operations require delay element. The control signals are taken from the input & output lines of the delay element.



## B) RULE - 2

The signal that activate some control signals are ORed to get common output signal.



C Rule 3: When  $n$ -lines in the flowchart merge to common point  $x$  then these lines are connected to ' $n$ ' input OR gate.



D. Rule 4: A decision box can be implemented by 2 AND gates. The input of AND gates derived by input  $A$  & complement of  $A$



### 3. sequence counter Method:

- It shows a modulo- $k$  sequence counter cycling through  $k$ -different state.



The logic circuit of counter take these  $k$ -output as input along with some addition input used for counting purpose to generate some control signals.



- In this method.
- The sequence counter consist of counter. It is a counter which counts from 0 to  $k-1$  state.
- The  $1/k$  decoder and input signals ~~from~~ are used to control the operation of modulo- $k$  counter i.e reset, clock , count Enable.
- The output ~~of~~ of modulo- $k$  counter is connected to Input  $1/k$  decoder.
- When  $\text{Begin} = 1$  and clocked the first input of AND gate , it becomes HIGH . It will set count Enable signal . so when  $\text{CE} = 1$  , the counter cycle continuously pass through different state & decoder . generate  $k$ -phase( ) signal of output .
- When  $\text{Begin} = 0$  or  $\text{End} = 1$  , clock not ready to counter .
- when  $\text{Reset} = 1$  , it will off the counter

- When SN on Interleaved memory.
- In order to carry out m-independent access simultaneously, main memory must be partitioned into m-separate modules or banks  $M_0, M_1, \dots, M_{n-1}$
- Each module must be provided with its own independent ckt.
- A modular memory org<sup>n</sup> is particularly useful in multiprocessor to a common memory. Different processor can access memory simultaneously provided that they reference separate module.

### Low order interleaved



low order interleaved

(consecutive word in a module)

High ordered interleaved memory.



High Ordered Interleaved  
(consecutive word in consecutive module)

— Inorder for this memory org<sup>n</sup> to used efficiently the memory references generated by processor should be distributed evenly among the m-module.

Ex. various string instruction of 8086 [10m]

A string indicate a series of bytes or word stored on block of data are known as string inst<sup>n</sup>. For these inst<sup>n</sup>, memory source is

DS:[SI] & memory destination i.e is ES:[DI]

After performing string operation SI or DI are automatically incremented (or decremented) by 1 or 2 depending on byte or word

operation. If the direction flag (DF) is reset then automatic incrementing & vice-versa.

1. STOSB. (store String Byte)

ES: [DI]  $\leftarrow$  AL

- store content of AL reg. into m.l pointed by ES:[DI]

2. LODSB (load string Byte)

AL  $\leftarrow$  DS:[SI]

- Load content of m.l pointed by DS:[SI] to AL reg.

{  
- After string operation , if DF is reset then  
the DI is decremented incremented by 1  
otherwise it is decremented by 1

3. MOVSB [ Move String Byte ]

ES : [DI]  $\leftarrow$  DS:[SI]

- Move the content of m.l pointed by DS:[SI]  
to ES: [DI]

4. CMPSB (Compare string Byte)

DS:[SI] - ES: [DI]

- Subtract content of m.l pointed by  
ES: [DI] from DS:[SI] & content of m.l  
will not change any flags are affected  
i.e compare operation.

5. SCASB ( Scan string Byte )

AL - ES: [DI]

- subtract content of m.l pointed by  
ES: [DI] from AL only flags will affect

1. STOSW

2. LODSW

3. MOVSW

4. MOVSW

5. SCASW.

## C0A \* XLAT

MOV into AL, the content of memory location in Data segment, whose effective address is formed by the sum of BX + AL

Eg: XLAT ;  $AL \leftarrow DS : [BX + AL]$

; i.e IF DS = 1000, BX = 0200, AL = 03

;  $10000$  ( $DS \times 10$ )

+  $0200$  ( $BX$ )

03 ( $AL$ )

10203  $\therefore AL \leftarrow [10203H]$

## \* LAHF

Load AH with lower byte of flag reg

## \* SAHF

- Stores the content of AH into lower bytes of flag reg.

## \* PUSH Source -

Eg. 1. PUSH CL;  $SS : [SP - 1] \leftarrow CL$

2. PUSH CX;  $SS : [SP - 1] \leftarrow CH$

$SS : [SP - 2] \leftarrow CL$

## \* POP Source

Eg 1. POP CL;  $CL \rightarrow SS : [SP]$

$SS : [SP + 1]$

2. POP CX

$CL \leftarrow SS : [SP + 1]$

$CH \leftarrow SS : [SP + 2]$

$SS : [SP + 3]$

- \* AAA [ASCII Adjust for Addition]
    - (for AX reg only)
  - In ASCII codes , 0 to 9 are represented by 30 to 39.
  - It will convert content of AL in ASCII after addition.
- eg. MOV AL, 04H  
     MOV BL, 03H  
     ADD AL, BL  
     AAA
- \* AAS (ASCII Adjust for subtraction)
  - \* AAM (BCD Adjust after multiplication)
  - \* AAD (Binary adjust before division)

MS

Explain various memory mapping techniques of cache memory.

- As there are fewer cache blocks [lines] than main memory block , an algorithm is needed for mapping main memory blocks into cache lines (blocks)
- Further a means is needed for determining which main memory block currently occupies a cache lines . The choice of the mapping function required for that mapping technique play very important rule .
- Three techniques can be used
  1. Direct
  2. Associative
  3. set associativeness

## 1. Direct mapping.

- The simplest technique, known as direct mapping maps each block of main memory into only one possible cache line.
- The mapping expressed as

$$i = j \bmod m.$$

where,

$i$  = cache Block number

$j$  = main memory Block number

$m$  = number of block (lines) in the cache memory.

## 2. Associative mapping

- Associative mapping overcome the disadvantage of direct mapping by permitting each main memory block to be loaded each into any block of cache memory.
- In this case, the cache control logic interprets a memory simply as a Tag & a word field. The tag field unique identifies a block of main memory.
- To determine whether block is in cache, the cache control logic simultaneously examine every line (Block) tag for match.



TAG = No. of Block in main memory.

WORD = No. of Bytes in Blocks.

### 3. Set Associative memory

- Set associative mapping is a compromise that exhibits the strength of both the direct & associative approaches which reducing their disadvantages
- This is referred as ~~•~~ k-way set associative mapping
- For set associative mapping, each word map into all the cache lines in the specific set.
- So that main mem. block  $B_0$  maps into set and so on.

Q. 15

a. Assume main mem. size is 64 KB & cache memory consist of 128 Blocks of 16 bytes each  
 Find Tag, Block, word of direct mapping.

| Tag   | Block | word  |
|-------|-------|-------|
| 5 bit | 7 bit | 4 bit |

← 16 bit →

Given main mem = 64 KB

$$= 2^6 \times 2^{10} = 16 \text{ bit}$$

$$\text{word} = 16 \text{ Bytes} = 2^4$$

$$\text{BLOCKS} = \text{NO. of BLOCK in cache} = 128 = 2^7$$

$$\text{Tag} = \frac{\text{total NO. of Block in main mem}}{\text{Total NO. of Blocks in Cache memory.}}$$

MS

Cache mapping technique:

Let main memory consist of 2048 blocks of 32 words each & cache memory contain 256 blocks.

1. Find TAG, Block, word using direct mapping
2. find TAG & WORD of fully associative mapping
3. find TAG, SET, WORD using 4-way set associative technique.

Given: 1. No of blocks in main memory = 2048

2. Block size = 32 bytes

3. Size of main memory =  $2048 \times 32$

$$= 2^{11} \times 2^5 = 2^{16}$$

$$= 64 \text{ KB}$$

$\therefore$  16 bit is req'd to access main memory

4. No of blocks in cache mem = 256

| A. | TAG | Block | Word |
|----|-----|-------|------|
|    | 3   | 8     | 5    |

← 16 bit →

$$\text{Word} = 32 \text{ bytes} = 2^5$$

Block = No of block in cache mem i.e  $256 = 2^8$

B.  $S = \text{TAG} \& J = \text{WORD}$

|  |  |  |  |
|--|--|--|--|
|  |  |  |  |
|  |  |  |  |

← 16 bit →

← 16 bit →

C. For 4-way associative means

No of blocks in set = No of blocks in cache

4

TAG      SET      WORD

|   |   |   |
|---|---|---|
| 5 | 6 | 5 |
|---|---|---|

**RAS**  
SN on Cache Coherency.

In multiprocessor systems, another Bus master can take over control of system. This Bus master could write data into a main memory blocks which are already held in the cache of another processor. When it happens, the data in the cache no longer match those held in main memory, creating inconsistency. These are different methods to handle cache coherency.

1. Bus watching (snooping)
2. Hardware transparency
3. Non cacheable memory
4. Cache flushing

#### 1. Bus watching (snooping)

In Bus watching, cache memory controller invalidates the cache entry, if another master writes to a location in shared memory which also resides in the cache memory.

#### 2. Hardware transparency:

In this access of all devices of the main memory are routed through the same cache or by copying all cache write both to the main memory & to all other cache that share same memory.

3. Non cacheable memory - In this partitioned main memory into a cacheable & non cacheable memory. By designing shared memory as non cacheable memory, coherency can be maintained, since shared memory is never copied into cache.

4. Cache flushing - To avoid data inconsistency, a cache flush write any altered data to the main memory, & cache before the system flushed the shared memory.