

24/7/19



\* Program :- holds the address of the next counter instruction.

\* Stack :- stores the result of the previous PC pointers



MOV R<sub>1</sub>, R<sub>2</sub>  
opcode.

output enabled → to be stored in database  
input enabled → reads the info from database.



## \* Memory hierarchy



25/7/19

Info needed to transfer data from main memory to secondary memory

# MOV R<sub>1</sub>, R<sub>2</sub> contains address of R<sub>2</sub>  
MOV R<sub>1</sub> to R<sub>2</sub>

generated { OE<sub>R2</sub>  
              LD R<sub>1</sub>

generated { LD R<sub>1</sub>  
              OE<sub>R2</sub>

Mr. because at the later point of time  
some other processor may be output  
enabled and it may receive unnecessary info.  
Be timing & check will make use of the

### # Instruction read

- Latch opcode
- Opcode decode (R1, R2) or direct word
- execute instn.

I<sub>0</sub>: ADD R<sub>1</sub> → register addition  $\Rightarrow$  multiplication  
I<sub>1</sub>: CMP → has mask (less than - drops) performs subtraction  
I<sub>2</sub>: AND R<sub>1</sub> → logical & nano from opcode case of other logical op.

I<sub>3</sub>: JMP

I<sub>4</sub>: MOV R<sub>1</sub>, R<sub>2</sub>

I<sub>5</sub>: MOV R<sub>2</sub>, R<sub>1</sub>

I<sub>6</sub>: MOV ACC, R<sub>1</sub>

I<sub>7</sub>: MOV ACC, R<sub>2</sub>

I<sub>8</sub>: MOV R<sub>1</sub>, ACC

I<sub>9</sub>: MOV R<sub>2</sub>, ACC

I<sub>10</sub>: MOV ACC, M (memory to accumulator)

I<sub>11</sub>: MOV M, ACC

I<sub>12</sub>: MVI

I<sub>13</sub>:

I<sub>14</sub>:

I<sub>15</sub>:

I<sub>16</sub>:

I<sub>17</sub>:

I<sub>18</sub>:

I<sub>19</sub>:

I<sub>20</sub>:

I<sub>21</sub>:

I<sub>22</sub>:

I<sub>23</sub>:

I<sub>24</sub>:

I<sub>25</sub>:

I<sub>26</sub>:

I<sub>27</sub>:

I<sub>28</sub>:

I<sub>29</sub>:

I<sub>30</sub>:

I<sub>31</sub>:

I<sub>32</sub>:

I<sub>33</sub>:

I<sub>34</sub>:

I<sub>35</sub>:

I<sub>36</sub>:

I<sub>37</sub>:

I<sub>38</sub>:

I<sub>39</sub>:

I<sub>40</sub>:

I<sub>41</sub>:

I<sub>42</sub>:

I<sub>43</sub>:

I<sub>44</sub>:

I<sub>45</sub>:

I<sub>46</sub>:

I<sub>47</sub>:

I<sub>48</sub>:

I<sub>49</sub>:

I<sub>50</sub>:

I<sub>51</sub>:

I<sub>52</sub>:

I<sub>53</sub>:

I<sub>54</sub>:

I<sub>55</sub>:

I<sub>56</sub>:

I<sub>57</sub>:

I<sub>58</sub>:

I<sub>59</sub>:

I<sub>60</sub>:

I<sub>61</sub>:

I<sub>62</sub>:

I<sub>63</sub>:

I<sub>64</sub>:

I<sub>65</sub>:

I<sub>66</sub>:

I<sub>67</sub>:

I<sub>68</sub>:

I<sub>69</sub>:

I<sub>70</sub>:

I<sub>71</sub>:

I<sub>72</sub>:

I<sub>73</sub>:

I<sub>74</sub>:

I<sub>75</sub>:

I<sub>76</sub>:

I<sub>77</sub>:

I<sub>78</sub>:

I<sub>79</sub>:

I<sub>80</sub>:

I<sub>81</sub>:

I<sub>82</sub>:

I<sub>83</sub>:

I<sub>84</sub>:

I<sub>85</sub>:

I<sub>86</sub>:

I<sub>87</sub>:

I<sub>88</sub>:

I<sub>89</sub>:

I<sub>90</sub>:

I<sub>91</sub>:

I<sub>92</sub>:

I<sub>93</sub>:

I<sub>94</sub>:

I<sub>95</sub>:

I<sub>96</sub>:

I<sub>97</sub>:

I<sub>98</sub>:

I<sub>99</sub>:

I<sub>100</sub>:

I<sub>101</sub>:

I<sub>102</sub>:

I<sub>103</sub>:

I<sub>104</sub>:

I<sub>105</sub>:

I<sub>106</sub>:

I<sub>107</sub>:

I<sub>108</sub>:

I<sub>109</sub>:

I<sub>110</sub>:

I<sub>111</sub>:

I<sub>112</sub>:

I<sub>113</sub>:

I<sub>114</sub>:

I<sub>115</sub>:

I<sub>116</sub>:

I<sub>117</sub>:

I<sub>118</sub>:

I<sub>119</sub>:

I<sub>120</sub>:

I<sub>121</sub>:

I<sub>122</sub>:

I<sub>123</sub>:

I<sub>124</sub>:

I<sub>125</sub>:

I<sub>126</sub>:

I<sub>127</sub>:

I<sub>128</sub>:

I<sub>129</sub>:

I<sub>130</sub>:

I<sub>131</sub>:

I<sub>132</sub>:

I<sub>133</sub>:

I<sub>134</sub>:

I<sub>135</sub>:

I<sub>136</sub>:

I<sub>137</sub>:

I<sub>138</sub>:

I<sub>139</sub>:

I<sub>140</sub>:

I<sub>141</sub>:

I<sub>142</sub>:

I<sub>143</sub>:

I<sub>144</sub>:

I<sub>145</sub>:

I<sub>146</sub>:

I<sub>147</sub>:

I<sub>148</sub>:

I<sub>149</sub>:

I<sub>150</sub>:

I<sub>151</sub>:

I<sub>152</sub>:

I<sub>153</sub>:

I<sub>154</sub>:

I<sub>155</sub>:

I<sub>156</sub>:

I<sub>157</sub>:

I<sub>158</sub>:

I<sub>159</sub>:

I<sub>160</sub>:

I<sub>161</sub>:

I<sub>162</sub>:

I<sub>163</sub>:

I<sub>164</sub>:

I<sub>165</sub>:

I<sub>166</sub>:

I<sub>167</sub>:

I<sub>168</sub>:

I<sub>169</sub>:

I<sub>170</sub>:

I<sub>171</sub>:

I<sub>172</sub>:

I<sub>173</sub>:

I<sub>174</sub>:

I<sub>175</sub>:

I<sub>176</sub>:

I<sub>177</sub>:

I<sub>178</sub>:

I<sub>179</sub>:

I<sub>180</sub>:

I<sub>181</sub>:

I<sub>182</sub>:

I<sub>183</sub>:

I<sub>184</sub>:

I<sub>185</sub>:

I<sub>186</sub>:

I<sub>187</sub>:

I<sub>188</sub>:

I<sub>189</sub>:

I<sub>190</sub>:

I<sub>191</sub>:

I<sub>192</sub>:

I<sub>193</sub>:

I<sub>194</sub>:

I<sub>195</sub>:

I<sub>196</sub>:

I<sub>197</sub>:

I<sub>198</sub>:

I<sub>199</sub>:

I<sub>200</sub>:

I<sub>201</sub>:

I<sub>202</sub>:

I<sub>203</sub>:

I<sub>204</sub>:

I<sub>205</sub>:

I<sub>206</sub>:

I<sub>207</sub>:

I<sub>208</sub>:

I<sub>209</sub>:

I<sub>210</sub>:

I<sub>211</sub>:

I<sub>212</sub>:

I<sub>213</sub>:

I<sub>214</sub>:

I<sub>215</sub>:

I<sub>216</sub>:

I<sub>217</sub>:

I<sub>218</sub>:

I<sub>219</sub>:

I<sub>220</sub>:

I<sub>221</sub>:

I<sub>222</sub>:

I<sub>223</sub>:

I<sub>224</sub>:

I<sub>225</sub>:

I<sub>226</sub>:

I<sub>227</sub>:

I<sub>228</sub>:

I<sub>229</sub>:

I<sub>230</sub>:

I<sub>231</sub>:

I<sub>232</sub>:

I<sub>233</sub>:

I<sub>234</sub>:

I<sub>235</sub>:

I<sub>236</sub>:

I<sub>237</sub>:

I<sub>238</sub>:

I<sub>239</sub>:

I<sub>240</sub>:

I<sub>241</sub>:

I<sub>242</sub>:

I<sub>243</sub>:

I<sub>244</sub>:</p

\* For any instruction to be executed

T<sub>0</sub>: MAR  $\leftarrow$  PC

T<sub>1</sub>: { IAR  $\leftarrow$  M[MAR]; gets res. should be loaded with the value in MAR  
new op code is selected.

PC  $\leftarrow$  PC + 1 Assuming every instruction takes only 1 memory location

T<sub>2</sub>: { DEC(IAR); Decodes the instruction register

MAR  $\leftarrow$  IAR<sub>0:11</sub>; load the MAR with content from IAR<sub>0:11</sub>

ADD R<sub>1</sub>

T<sub>3</sub>: DR  $\leftarrow$  T<sub>2</sub>; DR is loaded with contents of R<sub>1</sub> so that ALU can access it.

T<sub>4</sub>: Acc  $\leftarrow$  ALU<sub>ADD</sub>(Acc, DR); (content is given down to ALU)

(Name stamps)

move back to T<sub>0</sub> (timestamp)  
next timestamp ← next timestamp + 1

# MOV R<sub>1</sub>, R<sub>2</sub>

T<sub>5</sub>: R<sub>1</sub>  $\leftarrow$  R<sub>2</sub>

↓  
use next T<sub>0</sub>.

# MOV Acc, M

T<sub>6</sub>: Acc  $\leftarrow$  M[MAR]  
T<sub>0</sub>;

#

Sequence counter(4 bit) ← its size depends on complexity of instruction

16 time stamps. ← Assuming that the entire task will not take more than 16 bits of time stamp.  
 $\frac{1}{16}$  bits of time stamp  
Sequence counter ← every time we go back to T<sub>0</sub>



Timestamp  $T_0$

PC:  $OE = T_0 + \text{constant}$   $T = 0J : 0A$

MAR:  $LD = T_0 + \text{constant}$

Timestamp  $T_1$

PC:  $INCR = T_1 +$

IR:  $LD = T_1 +$

Timestamp  $T_2$

MAR:  $LD = T_2 +$

IR:  $OE = T_2 +$

For ADD  $R_1$ , Timestamp  $T_3$

$DR: LD = T_3 D, T_0 +$   
 $T_2: OE = T_3 D, T_0 +$   
 $IR: R_2, R_4$   
 $\text{For MOV }$

$R_2: LD = T_3 D, T_5 +$   
 $IR: OE = T_3 D, T_5 +$

$AC = T_4 D, T_0 +$   
 $ALU = ADD = T_4 D, T_0 +$   
 $SEGCTR: INCR = \frac{T_0 + T_1 + T_2 + T_3 D, T_0 +}{\text{constant}} =$   
 $\underline{\underline{CTR}} = T_4 D, T_0 +$

(MD)

For  $MOV R_1, R_2$ ,

$$R_1: LD = T_3 D_7 I_{14} +$$

$$R_2: DE = T_3 D_7 I_{14} +$$

For  $MOV Acc, R_1$ ,

$$R_1: DE = T_3 D_7 I_{16} +$$

$$Acc: LD = T_3 D_7 I_{16} +$$

31/7/19

MOV Acc, M

$T_3,$

$$Acc: \cancel{LD} \rightarrow LD = T_3 D_7 I_{10} +$$

$MAR +,$

$$T_3: Acc \leftarrow M[MAR], T_0$$

$$Acc: LD = T_4 D_7 I_0 + T_3 D_1 +$$

$$ALU: ADD = T_4 D_7 I_0 +$$

SEQ.CTR: INCR =

$$CLR = T_4 D_7 I_0 + T_3 D_1 +$$

#



MOV Acc, M

$$Acc: LD = T_3 D_1 I_{15}'$$

MOV1 Acc, M

$$Acc: LD = T_3 D_1 I_{15}$$

(Indirect add.)

- \* Hardwired control Unit  $\rightarrow$  Proper circuitry is used to generate control signals.

Adv: Very fast

Disadv: Its not scaled.

- \* Microprogrammed ctrl  $\rightarrow$  alternate unit design.

Adv: Scalable as programming is allowed.

Disadv: Programming allowed. memory is used up and hence slow.

- \* Central memory  $\rightarrow$  control words.  
(CM)

In time stamp T<sub>2</sub>,

LD MAR — C<sub>0</sub> designated 0th bit of CM  
OE<sub>PC</sub> — C<sub>1</sub> " 1st bit of CM

0.....11 01  
Branch add.  
(BA)

during T<sub>1</sub>,

LD IR — C<sub>2</sub>  
RD M — C<sub>3</sub>  
(read memory)  
INCR PC — C<sub>4</sub>

0.....0111 00 02

During T<sub>3</sub>

OE<sub>IR</sub> — C<sub>5</sub>

0.....0100001 xx

sum(Carry C<sub>5</sub>)



20



7/8/19



After op code fetch & decode, we don't know which memory add to jump to to execute next instn.

Microprogram sequencer → all other units except the ctrl memory.

↓  
+ control memory = Microprogrammed Ctrl Unit.

Timestamps;

- T<sub>0</sub>: MAR ← PC
- T<sub>1</sub>: IR ← M[MAR]; PC ← PC + 1
- T<sub>2</sub>: DCD(IR);  
MAR ← I<sub>0</sub>; II

$$\text{A} \quad LD\text{MAR} = C_0$$

$$OE_{PC} = C_1$$

$$LD_{IR} = C_2$$

$$RD_M = C_3$$

$$PC_{INCR} = C_4$$

$$OE_{IR} = C_5$$

|                      | C <sub>5</sub> | C <sub>4</sub> | C <sub>3</sub> | C <sub>2</sub> | C <sub>1</sub> | C <sub>0</sub> | BA | M |
|----------------------|----------------|----------------|----------------|----------------|----------------|----------------|----|---|
| For T <sub>0</sub> , | 0              | 0              | 0              | 0              | 1              | 1              | 01 | 1 |

|                      |   |   |   |   |   |   |    |   |
|----------------------|---|---|---|---|---|---|----|---|
| For T <sub>1</sub> , | 0 | 1 | 1 | 1 | 0 | 0 | 02 | 1 |
|----------------------|---|---|---|---|---|---|----|---|

|                      |   |   |   |   |   |   |    |   |
|----------------------|---|---|---|---|---|---|----|---|
| For T <sub>2</sub> , | 1 | 0 | 0 | 0 | 0 | 1 | XX | 0 |
|----------------------|---|---|---|---|---|---|----|---|

we do not know  
what instn to  
execute, so we move to internal  
add.-generator.

Eg: MOV R<sub>1</sub>, R<sub>2</sub>

OER<sub>2</sub> = C<sub>6</sub>

LD R<sub>1</sub> = C<sub>7</sub>

C<sub>12</sub> C<sub>11</sub> C<sub>10</sub> ... BA Mode  
1 1 0 ..... 0 0 0 1

Eg: ADD R<sub>1</sub>

T<sub>2</sub>: DR ← R<sub>1</sub>

T<sub>3</sub>: ACC ← ALU<sub>AD</sub> (ACC, DR); To

C<sub>12</sub> C<sub>11</sub> C<sub>10</sub> ... C<sub>9</sub> C<sub>8</sub> C<sub>7</sub> ... BA Mode  
0 .... 0 1 1 0 0 ... 0 P<sub>1</sub> 1

LD DR = C<sub>8</sub> ALU<sub>AD</sub> = C<sub>11</sub>

OER<sub>1</sub> = C<sub>9</sub> OEA<sub>ALU</sub> = C<sub>12</sub>

LD ACC = C<sub>10</sub>

1 1 1 0 - - - - - 0 0 0 1

## Micro instruction design

- Horizontal μ programming signal inst<sup>n</sup> to be decoded/encoded  $n, \text{ no of bits } [\log_2 n]$
- Vertical μ programming multiple inst<sup>n</sup>s to be decoded/encoded
- Diagonal μ programming (somewhere in between)
  - requires > bits than VMP
  - < bits than NMP

### \* Compatibility class

The compatibility class is a set of control signals such that the ctrl signals within that set are pairwise compatible i.e., no two ctrl signals are active simultaneously.

### \* Minimal compatibility class

It is a compatibility class to which no other ctrl signal can be added without introducing incompatibility.

### \* Minimal cover

It is a minimal subset of minimal compatibility class, which includes all the control signals.  
(We will go for encoding in Minimal cover in case of diagonal μP).

## II: Combinational signals

$I_1 : a \ b \ c \ g$

$I_2 : a \ c \ e \ h$

$I_3 : a \ d \ f$

$I_4 : b \ c \ f$

$a, b, c, d, e, f, g, h$   
can't separate together.

$S_1 : a, b, c, d, e, f, g, h$ . not, suffice, either all are signals together.

$S_2 : \underline{bd}, \underline{bc}, \underline{bh}, \underline{cd}, \underline{de}, \underline{df}, \underline{dh}, \underline{ef}, \underline{eg}, \underline{fh}, \underline{fg}, \underline{gh}$   
pairs of signals  
which are never  
together.

$S_3$ : make a group of 3 other signals.

$bde, bch, efg, deg, ade, egh, aeh, def, \cancel{agh}, \cancel{dgh}$   
 $\cancel{deg} \cancel{fgh} \quad \cancel{agh}$

$S_4$ : here signals together.

$\emptyset$

8/8/11

cover table

|                     | a         | b | c          | d | e | f | g | h |
|---------------------|-----------|---|------------|---|---|---|---|---|
| $K_1 = a$           | x         |   |            |   |   |   |   |   |
| $K_2 = cd$          |           |   | x          | x |   |   |   |   |
| $K_3 = bde$         |           | x |            | x | x |   |   |   |
| $K_4 = bch$         | x         |   | x          |   |   |   | x |   |
| $K_5 = deg$         |           |   |            | x | x | x |   |   |
| $K_6 = dgh$         | dominated |   | x          |   | x | x |   |   |
| $K_7 = efg$         |           |   | dominating | x | x |   |   |   |
| $K_8 = fgh$         |           |   |            | x | x | x |   |   |
| essential<br>covers |           |   |            |   |   |   |   |   |

dominated w.r.t subset  
of the dominating

strike off dominating columns.

So,  $K_1, K_2$  are essential covers.

## Reduced coverable

|                     | b | d | e | f | g | h |
|---------------------|---|---|---|---|---|---|
| K <sub>1</sub> =add | x |   | x |   |   |   |
| K <sub>2</sub> =sub | x |   |   |   | x |   |
| K <sub>3</sub> =avg |   | x |   |   |   |   |
| K <sub>4</sub> =avg |   |   |   |   | x |   |
| K <sub>5</sub> =avg |   | x | x |   |   |   |
| K <sub>6</sub> =avg |   | x | x | x |   |   |

we will strike off dominated rows  
and not dominating rows

$$\text{Minimal cover} = \{(K_1, K_2), K_3, K_6\}$$



"all the control  
signals have  
been covered"

$$\{K_1, K_2\}, K_3, K_6$$

we need to choose these in such a  
way that all the alphabets are covered  
and not struck off in the reduced  
coverable.

## \* Pipelining

Assuming 4 inst's executed.

I<sub>1</sub>  
I<sub>2</sub>  
I<sub>3</sub>  
I<sub>n</sub>

⇒ Following  
here

|    | EX             |                |                | I <sub>1</sub> |                | I <sub>2</sub> |                | I <sub>3</sub> |                | I <sub>n</sub> |
|----|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|----------------|
| OD |                |                | I <sub>1</sub> |                | I <sub>2</sub> |                | I <sub>3</sub> |                | I <sub>n</sub> |                |
| OF |                | I <sub>1</sub> |                |                | I <sub>2</sub> |                | I <sub>3</sub> |                | I <sub>n</sub> |                |
| IF | I <sub>1</sub> |                |                | I <sub>2</sub> |                | I <sub>3</sub> |                | I <sub>n</sub> |                |                |

{instn fetch IF  
opende fetch OF  
opende decode OD  
execution EX}

16 time stamps reqd. for 4 instns.  
non pipeline method.

Best case  
scenario :-

| EX |                |                | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>n</sub> |
|----|----------------|----------------|----------------|----------------|----------------|----------------|
| OD |                |                | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>n</sub> |
| OF |                | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>n</sub> |                |
| IF | I <sub>1</sub> | I <sub>2</sub> | I <sub>3</sub> | I <sub>n</sub> |                |                |

(IF is free once I<sub>1</sub>)  
(is fetched)

7 time stamps reqd.

Worst case :-

### Data dependency Hazards :

— RAW - read after write

— WAR - write after read

— WAW - write after write

ΦΦ

I<sub>0</sub>: MOV R<sub>1</sub>, R<sub>2</sub> (R<sub>1</sub> written on)

I<sub>1</sub>: ADD R<sub>1</sub> Acc ← Acc + R<sub>1</sub> (R<sub>1</sub> is now read)

I<sub>2</sub>: MOV R<sub>1</sub>, R<sub>3</sub> (R<sub>1</sub> is being written on)

I<sub>3</sub>: ADD R<sub>2</sub>, R<sub>4</sub>

(I<sub>0</sub>, I<sub>1</sub>) → RAW dependency

(I<sub>1</sub>, I<sub>2</sub>) → WAR

(I<sub>0</sub>, I<sub>2</sub>) → WAW

(I<sub>0</sub>, I<sub>3</sub>) → WAR

ADD R<sub>2</sub>, R<sub>4</sub>  
MOV R<sub>5</sub>, R<sub>4</sub>

} Both if they are reading from same source, this is not dependency.

Q. Find all possible dependencies present in the following program fragment when you are using a 4-stage pipeline.

I<sub>0</sub>: MOV R<sub>1</sub>, R<sub>2</sub> ① ②

I<sub>1</sub>: ADD R<sub>3</sub>, R<sub>4</sub> ③ ④

I<sub>2</sub>: SUB R<sub>1</sub>, R<sub>5</sub> ⑤ ⑥

I<sub>3</sub>: ADD R<sub>2</sub> ⑦

I<sub>4</sub>: MOV R<sub>5</sub>, R<sub>6</sub> ⑧ ⑨

I<sub>5</sub>: ADD R<sub>5</sub> ⑩

I<sub>6</sub>: MOV R<sub>7</sub>, R<sub>1</sub> ⑪ ⑫

Soln:-

(I<sub>0</sub>, I<sub>2</sub>) → WAW      (I<sub>0</sub>, I<sub>3</sub>) → X

(I<sub>0</sub>, I<sub>5</sub>) → RAW      (I<sub>2</sub>, I<sub>4</sub>) → WAR

(I<sub>2</sub>, I<sub>5</sub>) → TRAW      (I<sub>2</sub>, I<sub>6</sub>) → X

(I<sub>4</sub>, I<sub>5</sub>) → RAW

- \* Assembly I/O is dependent on each other.
- \* To make sure the instruction that are dependent releases its products not before the next is executed.

2/8/19

## MEMORY MANAGEMENT

### \* Virtual Memory

Process believes that we have available memory using the Segmentation we can apply the concept of virtual memory in our system.



### # Paging

"main mem. can be divided into same sized pages"



When there is a demand for mem. & the MM cannot provide it, then:

Swap out :- Some pages are swapped out from MM  $\Rightarrow$  SM

Swap in :- Some pages are swapped in from SM  $\Rightarrow$  MM (empty slot)

Choosing page to swap  $\rightarrow$  on the basis of some Algorithm.

### # Page Replacement algorithms

- FIFO (First in first out)
- LRU (Least recently used)
- MFU (Most frequent used)
- Optimal replacement

e.g.: 1, 4, 5, 0, 6, 7, 4, 2, 3, 5, 7

page reference stream

assuming  $\rightarrow$  each page size  $\rightarrow$  1K  
main memory accommodates 4 pages.



FIFO =

(like  
queue)



XXVA:-

\* No. of page misses.

1, 4, 5, 1, 6, 9, 3, 2, 5, 3, 1, 7, 9, 3, 8, 6, 2, 5, 6, 9, 8,

Soln - FIFO



OPT. REP.



+ 8 + 4 → 12

RF FIFO



13 + 4 → 17

# TF Segmentation



(1) from bottom to top

Method for TF  
related to word to  
byte mapping

Partial Segmentation

- \* Best fit
- \* First fit
- \* Worst fit.

lower bits to higher.

→ Best fit

aim := go for minimum fragmentation.

[11K, 1K, 5K, 10K, 9K]

External Segmentation



→ First fit

[11K, 1K, 5K, 10K, 9K]



→ Worst fit

aim := maximize fragmentation

[11K, 1K, 5K, 10K, 9K]



★ DRAM (Dynamic RAM)  
(made of capacitors) → Main memory

Bufiled.

SRAM (Static RAM) → Cache memory  
[costly]



Pages can be subdivided into blocks.



22/8/19

Q. Consider a CPU where all instructions require 7 clock cycles to complete execution. There are 140 instructions in the instruction set. It is found that 125 control signals are needed to be generated by the control unit. While designing the horizontal microprogrammed control unit, single address field format is used for branch control logic. What is the maximum size of the control word & control address register?

Soln - Each instruction takes 7 cycles  
∴ 140 instructions take =  $(140 \times 7)$  cycles  
= 980 cycles

Now for

$$2^m \geq 980$$

$$m \geq 10$$

$$\therefore (10 + 125) \text{ bits} \rightarrow \text{control word}$$

10 units control address register.

Q. Consider the following sequence of micro operations.

MBR  $\leftarrow$  PC

MAR  $\leftarrow$  X

PC  $\leftarrow$  Y

Memory  $\leftarrow$  MBR

A. Instruction fetch

B. Operand fetch

C. Conditional branch

D. Initiation of interrupt service.

Which one of the following is a possible operation performed by this sequence?

sol: MBR - Memory buffer register

It stores the data being transferred to and from the immediate store.

MAR - Memory address register.

It holds the memory location of data that needs to be accessed.

PC - Program counter.

It contains the address of the instruction being executed at the current time.

The 1st instruction places the value of PC onto MBR.

The 2nd instruction places an address X into MAR

The 3rd instruction places an address Y into PC.

The 4th instruction places the value of MBR (old PC value) into memory.

D.

28/8/19

## \* Cache Memories      ASSOCIATIVITY

STRA M (stack RAM)

makes up cache memory.

CM < MM < SM



User programs cannot be placed in cache memory.

CPU  $\xrightarrow[\text{start}]{\text{first}}$  CM  $\xrightarrow[\text{search}]{\text{then}}$  MM

cache miss / cache hit  
(does not get) (get it)

microprocessor = address



Tag memory  
Stores addresses of blocks  
in CM



This part is called Tag

Any block of the MM can be placed in any cache line

We search the Tag sequentially for a block till we get it  
 $\xrightarrow[\text{of all lines}]{} \text{Cache miss.}$

Remove the Tag & place it in Argument register.



V-bit

- one bit associated with each tag
- initially all bits are 0.



As we have not cleared CM, then CM can have some garbage value.

When a block is sent from MM  $\rightarrow$  CM then its VBit is set to 1.



## \* Direct Mapped Cache

Ques. which block will be allowed to move to which cache line.  
Now, cache is not 1D structure



Microprocessor → generate address



Block is not available in cache memory. So that block needs to be moved from memory to that particular cache line)

## # Set Associative Cache



Q. A set associative cache memory consists of 128 blocks divided into 4 sets. The main mem. consists of 16384 blocks and each block contains 256 8bit words. How many bits are reqd. to address the main memory? How many bits are reqd. to identify the tag, the set and the block?

Soln- No. of blocks  $\rightarrow 2^{14} = 16384$  blocks.  $8 \text{ bits} = 1 \text{ byte}$

each block  $\rightarrow 256 \times 8$  bits.  $22$

$\Rightarrow 22$  bits are reqd. to address main memory

$$16384 \times 256 = 2^{22} \text{ bits MM}$$

⇒ To identify block  $\rightarrow 256$   
 $= 2^8$

8 bits reqd. for block identification.

No. of Block in cache = 128

$$128/4 \rightarrow 4 \text{ sets}$$

$$= 32$$

$$= 2^5$$

5 bits reqd. for set identification.

Bits reqd. for tag identification  $\rightarrow 22 - (5+8)$

Q. Computer has an 8gb memory with 64 bit word size. Each block of memory stores 16 words. Computer has a direct mapped cache of 128 blocks and uses word level addressing. What is the addressing format if we change the cache to a 4 way set associative cache? what is the new address format?

Sol. Word length  $\rightarrow$  64 bit.  
each block  $\rightarrow$  16 words.  
 $\rightarrow 16 \times 64$  bits.

$$2^k \times 2^l \rightarrow 2^{k+l}$$

For 1gb  $\rightarrow 2^{30}$  bits. 8gb  $\rightarrow 8 \times 2^{30} \rightarrow 2^{33}$   
 $\rightarrow 2^{30}$  bytes.



30 bits reqd. to address main memory.

Each block  $\rightarrow$  16 words.  
4 bits don't reqd. for offset. [Offset].

• 4 way set ass

block identification, (128 blocks)  $\rightarrow 2^7$   
(7 bits reqd.)

tag group  $\rightarrow 30 - (7+4)$

in h way set ass  $\rightarrow \frac{128}{h}$   $\rightarrow$  divided into  $h$  sets each.  
 $32 \rightarrow 2^5.$

|     |  |   |  |   |
|-----|--|---|--|---|
| Tag |  | 5 |  | 4 |
|-----|--|---|--|---|