

Welcome to ACE Engineering Academy - online live class

Subject: **Computer Organization and Architecture**

Faculty: **Y.V. Ramaiah**

9866339106

### **Subject**

Computer organization & Architecture

### **Chapters (Topics)**

- I. Computer Arithmetic ✓
- II. Memory Organization
- III. Secondary Memories
- IV. Basic processor organization and Design
- V. Pipeline organization
- VI. Control unit Design
- VII. IO Organization

## Chapter 2

# Memory Organization

- Introduction ✓

→ Memory Basics ✓

→ Memory Classification ✓

→ Memory Size Expansion ✓

→ Primary Memory ✓

→ Secondary Memory ✓

→ ROM and its design ✓

→ **RAM and its design** ✓

→ Memory Hierarchy ✓

→ Cache Memory ✓

→ Mapping Techniques ✓

→ Different misses occurred in cache ✓

→ Different block replacement techniques ✓

→ Tag directory design ✓

→ Associative Memory ✓

Q. Consider a machine with a byte addressable main memory of  $2^{16}$  bytes. Assume that a direct mapped data cache consisting of 32 lines of 64 bytes each is used in the system. A  $50 \times 50$  two-dimensional array of bytes is stored in the main memory starting from memory location 1100H. Assume that data cache is initially empty. The complete array is accessed twice. Assume that the contents of the data cache do not change in between the two accesses.

How many data cache misses will occur in total?



$$\text{file size to be mapped} \\ = 50 \times 50 \text{ B} = 2500 \text{ Bytes}$$



No. of blocks required in main memory

to store 2500 bytes =  $\frac{2500}{16} = 156$

$$BW = 2^6 \\ CL = 2^5 = 32$$

$$1100H =$$

$K \bmod 32$

$M_Bx$

=  $M_{B68}$

$$0001000100000000 \\ 64 \quad 4$$



$M_{B68}$  maps to  $CL_4$

first word is available  
in  $M_{B68}$ .  
 $M_{B68}$  to  $M_{B107}$  are needed  
to map

$M_{B68}$  to  
 $M_{B107}$   
 $K \bmod 32$

Red ink  
1st time  
Black  
INK  
2nd time

| 1st    | 2nd                | CLO |
|--------|--------------------|-----|
| 96     | 96 HIT             | 1   |
| 97     | 97 HIT             | 2   |
| 98     | 98 HIT             | 3   |
| 99     | 99 HIT             |     |
| 68 100 | 68 CNF ✓ 100 CNF ✓ | 4   |
| 69 101 | 69 CNF ✓ 101 CNF ✓ | 5   |
| 70 102 | 70 CNF ✓ 102 CNF ✓ | 6   |
| 71 103 | 71 CNF ✓ 103 CNF ✓ | 7   |
| 72 104 | 72 CNF ✓ 104 CNF ✓ | 8   |
| 73 105 | 73 CNF ✓ 105 CNF ✓ | 9   |
| 74 106 | 74 CNF ✓ 106 CNF ✓ | 10  |
| 75 107 | 75 CNF ✓ 107 CNF ✓ | 11  |
| 76     | 76 HIT             | 12  |
| .      | .                  |     |
| 95     | 95 HIT             | 31  |



1st  
time  
= 40  
CMP  
Misses

2nd time  
total no. of  
CNF misses  
= 16

Total = 56

Q. Consider a machine with a byte addressable main memory of  $2^{16}$  bytes.

bytes. Assume that a direct mapped data cache consisting of 32 lines of 64 bytes each is used in the system. A  $50 \times 50$  two-dimensional array of bytes is stored in the main memory starting from memory location 1100H. Assume that data cache is initially empty. The complete array is accessed twice. Assume that the contents of the data cache do not change in between the two accesses.

Which of the following lines of the data cache will be replaced by new blocks in accessing the array?

- (a) line 4 to line 11      (b) line 4 to line 12  
 (c) line 0 to line 7      (d) line 0 to line 8

CL 4  
to  
CL 11

Q. If 2-way set associate cache with LRU cache contains Blocks 4,

sets 2, the number of cache misses for the following sequence of block address is (initially cache is empty)

- (a) 2      (b) 3  
(c) 4      (d) 5
- $K \text{ mod } S$   
 $K \text{ mod } 2$



Q. Consider a 2-way set associative cache with 256 blocks and uses LRU replacement. Initially the cache is empty. Conflict misses are those misses which occur due to contention of multiple blocks for the same cache set. Compulsory misses occur due to first time access to the block. The following sequence of accesses to memory blocks

$(0, 128, 256, 128, 0, 128, 256, 128, 1, 129, 257, 129, 1, 129, 257, 129)$  is repeated 10 times. The number of conflict misses experienced by the cache is 76.

2017  
GATE

$$\begin{aligned} N &= 2 \\ C_L &= 256 \\ S &= 128 \\ S_0 \text{ to } S_{127} & \\ K \bmod 128 & \end{aligned}$$



## Fully Associative mapping



$\rightarrow$  In this, all cache lines are considered as Single set ;  $S=1$   $\log_2 S = 0$ , i.e. Set off set field Index field is not required in its physical Address format

2019 GATE



- Q) Certain CPU uses Fully Associative Cache of ACE Size 16 K Bytes and line size is 16 bytes. The main memory is Byte addressable and uses 32 bit address. How many bits are required for the Tag and Index fields respectively in the address generated by the processor?  ← 32 →  
Tag      Index      Word  
④

- a) 28 and 4
  - b) 24 and 4
  - c) 24 and 0
  - d) 28 and 0



Different misses occurred in Cache memory while executing a program

(i) Compulsory Miss (Cold miss)

(ii) Conflict Miss

(iii) Capacity Miss

Compulsory Miss:- While accessing a word / block first time and if it is not available in Cache, it known as Compulsory Miss  
This miss occurs in all mapping technique



Capacity miss:- It occurs, when

the target word is not found in Cache and all Cache blocks are filled.

Generally Capacity Miss occurs in Fully

Associative mapped Cache only

∴ In Fully Associative mapped Cache, any Main Memory block can be mapped to anywhere in the Cache.



MB<sub>q</sub> → CL<sub>3</sub>  
MB<sub>13</sub> → CL<sub>3</sub>

### Conflict Miss

- When CPU refers a word first time; and if it is not available in cache then it is known as Compulsory Miss, immediately CPU maps that block from Main Memory to C.M.
- After some time, if CPU is needed to replace that block for providing the space for new block in its place; and CPU requires same evicted block once again (2nd time) then it is known as Conflict Miss.

- Single Conflict miss also does not occur in Fully Associative Mapped Cache
- No. of Conflict misses occurred is more in Direct mapped Cache
- No. of conflict misses to be occurred in 'N' way block set associative mapped Cache is inversely proportional to 'N'.



## Block Replacement Techniques

→ These are not applicable for Direct mapped Cache.

→ One of these techniques are used when any Main memory block can have 2 or more Cache line number mapping chances

Let  $n = 2$ ,  $CL = 4$        $S = 2$



Kmod2

### Request order

- MB 9
- MB 13
- MB 6
- MB 4
- MB 7
- MB 8



for mapping T  
one of MB9 and MB13 will be  
evicted, if required  
Replacement technique



(i) FIFO (ii) LRU (iii) optimal

FIFO:- (First In First Out)

In this, the block which is available in Cache memory from longest time will be selected for replacement (even that block is recently accessed by the CPU then also not considered)

merit: Minimum HW logic circuit is sufficient  
cost is moderate  
hit rate is also moderate

LRU:- Least Recently used

The block which is not referenced by the CPU from longest time (irrespective of its arrival time) will be selected for replacement

→ It requires complex HW logic circuit  
Hence design cost is expensive

hit rate is excellent



- optimal
- In this, the block which is not needed for the CPU in near future will be selected for Replacement
  - It requires Moderate Hardware logic ckt, hence cost is also Moderate and it provides Moderate Hit Rate.

CL=8

Fully Associative mapped Cache

|      |     |
|------|-----|
| MB8  | CL0 |
| MB13 | CL1 |
| MB6  | CL2 |
| MB7  | CL3 |
| MB16 | CL4 |
| MB17 | CL5 |
| MB12 | CL6 |
| MB31 | CL7 |

In future, CPU requires After

MB8 : 2 mins  
MB13 : 5 mins  
MB7 : 1 min  
MB16 : 3 min  
MB17 : 2.5 min  
MB12 : 3.5 min  
MB31 : 4 min

For mapping new block → block is replaced



Hardware logic circuit required for  
declaring Hit/miss in Cache memory  
while performing reading.

### Direct mapped Cache

only one Tag Comparator  
size of the Comparator  
= No. of tag bit/line ✓  
MUX is not required

'N' way Block Set  
Associative mapped Cache

No. of Tag Comparators = N  
Size of each CMP =  
No. of tag bits / line  
 $N \times 1$  MUX is  
required

### Direct mapped cache

$$\begin{aligned}
 CA &= 6 \text{ bit}, & PA &= 12 \text{ bit} \\
 MMW &= 2^{12} = 4096 & CMW &= 2^6 = 64 \\
 CL &= \frac{6}{2^4} = 2^4 = 16 & MB &= \frac{4096}{16} = \frac{2^{12}}{2^4} = 2^8 = 256 \\
 K \bmod 4 & & & \text{WORD size} = 8 \text{ bit} \\
 K &= 0 \text{ to } 255 & & \text{Do to D}_7
 \end{aligned}$$

MB<sub>0</sub> to MB<sub>255</sub>

CL<sub>0</sub> to CL<sub>3</sub>

|                                                                                             |                               |                                                             |
|---------------------------------------------------------------------------------------------|-------------------------------|-------------------------------------------------------------|
| A <sub>11</sub> A <sub>10</sub> A <sub>9</sub> A <sub>8</sub> A <sub>7</sub> A <sub>6</sub> | A <sub>5</sub> A <sub>4</sub> | A <sub>3</sub> A <sub>2</sub> A <sub>1</sub> A <sub>0</sub> |
| x x x x x x                                                                                 | x x                           | x x x x                                                     |
| TAG                                                                                         | CL <sub>x</sub>               | word offset                                                 |



- $A_0, A_1, A_2, A_3$  are used to select one of the words in the selected Cache line
- Each Cache line is designed with one internal Word Address Decoder

$4 \times 16$   
Inputs  $A_3, A_2, A_1, A_0$  → to select one of the 16 words



- Line offset Decoder is used to select one of the 4 cache lines with the help of  $A_5 A_4$  Address
- For  $(1 \text{ DA})_{16}$  word Hit has occurred,  
In  $CL_1$ ,  $w_{10}$  word will be Read through Data Bus.
- For  $(4 \text{ A } 3)_{16}$  word = 0100 1010 0011  
since Tag Matching CKT provides '0 op', miss occurs then CPU Visits main memory.