

The Memory hierarchy → How much, fast, expensive.

(1)

There is a trade off b/w

Capacity, access time and cost.

fast access → more cost per bit.

More capacity → less cost per bit.

More capacity → more access time.

{Match  
Processor}

Very costly solution

We require Large Capacity memory

①

fast access time and low cost per bit

②

across time  
③

Not directly available

Solution is to use relatively

low capacity memory with fast access time.

[Not rely on single memory

Employ

Memory Hierarchy]

① decreasing cost per bit

Increasing capacity

Decreasing access time

This is key to  
S.W.E.S

Example:

→ Opti required by CPU } → a) In level (a)

→ Hit ratio:

(H) fraction of all memory access that are found in faster mem.

→ b) In level (b)

\* Transfer to level (a) then access.

\* Our 4<sup>th</sup> point ensure high (H) that makes sure the average total

access time as close as its highest/higher levels.

↳ 4<sup>th</sup> Condition is ensured using

Locality of reference

- Memory addressing/referencing tends to cluster.

↳ Looping/subroutine (instructions)  
↳ Table/array accessing (data)

Our short period — processor works with ~~fixed~~ clusters.

Time to time place

Swap cluster ← Current cluster in level 1  
back to level 2  
to accomodate more cluster

\* access directly

→ b) In level (b)

\* Transfer to level (a) then access.

\* access directly

level 1 — 1000by — 0.1μs

level 2 — 100,000by — 1μs

$$H = 0.95 \Rightarrow 95\%$$

$$\text{av. access time} = (0.95)(0.1\mu\text{s})$$

$$+ (0.05)(0.1\mu\text{s} + 1\mu\text{s})$$

$$= 0.095 + 0.055 = 0.15\mu\text{s}$$



\*.) Each location in main memory has a unique address and is **③**  
Extended with a higher speed smaller cache.

\*) Not visible to programmer  
\*) Register, Cache, main memory  $\not\Rightarrow$  Volatile

\*) Secondary memory is non volatile.  $\Rightarrow$  Volatile

Cache Memory : The instruction cycle time is limited by

memory cycle time  $\not\Rightarrow$  This match b/w processor & main memory speed



④ Cache contains copy of portion of main memory. Processor need a word and check whether it is in Cache, if not take it from main mem.

[Block of data is fetched into the Cache to satisfy a single memory reference]

and then there is three level Cache organisation



Line → Some higher bits.

Address

number Tag Block

Memory address

三

$$w_1 \wedge w_2 = \overline{w_1} \cdot \overline{w_2}$$

all day  
- 10

W.B. E.  
2  
10

+ Line =

十一

卷之三

• 1

C-1

Block length

The diagram illustrates the structure of a memory access. At the top, the word "Cache line" is written vertically next to a double-headed vertical arrow. Below it, the phrase "Block length (K words)" is also written vertically next to another double-headed vertical arrow. A horizontal double-headed arrow connects the two vertical arrows, indicating their relationship.

lost c bit tag  
(x words)

Logik der Tafel (a) Cache

(a) Cacti

$\log_2 n$  memory  $\equiv 2^n$  addressable words ( $n$  bits)

fixed length block ( $K$ ) non-overlapping words ( $M$ )

length block of ( $K$ ) words.

typically  $c \ll m$

symmetry  $C \triangleleft M$ .

— — — — —

Each slot/line includes a tag string

which particular work is currently

which particular work is currently

Address 3a

The diagram illustrates the organization of memory and cache, showing how a memory address is mapped to a cache block.

**Memory Address:**  $100\ 0000$  (6-bit address)

**Cache Organization:**

- Cache Lines:**  $0, 1, 2, \dots, C-1$  (Number of Cache Lines)
- Block Length:**  $(K \text{ words})$
- Cache Line:**  $0, 1, 2, \dots, C-1$  (Number of Cache Lines)
- Block:**  $(K \text{ words})$
- Line Number:**  $0, 1, 2, \dots, C-1$  (Number of Cache Lines)
- Tag:**  $00, 01, 10, 11, \dots$  (2-bit tag)

**Addressing:** A 6-bit memory address is divided into a 2-bit tag and a 4-bit block address. The tag is used to identify the line in the cache, and the block address is used to select a word within that line. The diagram shows a 2-bit tag being compared against the tag bits of the cache lines to determine a hit or miss.

**Annotations:**

- Left side:** "Let 6 bit address answer with 2 bit tag"
- Top left:** "All are bytes"
- Top right:** "Line number" with an arrow pointing to the line number column.
- Bottom left:** "Block length" with an arrow pointing to the block length column.
- Bottom right:** "Block" with an arrow pointing to the block column.
- Bottom center:** "Word length" with an arrow pointing to the word length column.
- Bottom right corner:**  $n = \log C + \log K$

(b) Main memory



## Cache Designing: (a) Cache Size

Reasonably small may improve performance. But block size is an issue.

Block size: is the unit of data exchanged b/w Cache & main memory.

\* With increasing block size it first increase then decrease. (Related to H)

## (c) Mapping Functions

- Which Cache location the block will occupy.  
Direct, Associative and Set associative (Read it)
- Liberal to Complex →

## (d) Replacement Algorithm:

Within constraints of mapping fn. which block to replace when new block is loaded.

{ replace block which ← (LRU)

{ is in the cache for longest time w/o reference.

## (e) Write policy:

- If modified must be written back. { Write through }  
Block updated → write  
write once

Updated block replaced - write → minimizes memory write but leaves main memory in obsolete state.

4a



## OS - Evolution

### [a] Serial Processing

- Step process - (i) Scheduling - Reserve a block of CPU time.
- Serial processing (ii) Setup time - Loading and linking of the job.

Later several system software are written to make it efficient.

- [b] Simple batch System : Huge amount of CPU time got wasted in setup.
- It has monitor and user do not access processor directly instead via monitor
  - User submit job to operator, it batches on an input device used by monitor
  - Monitor read job one by one place into user program area and pass control to it.
  - After completion control comes back to monitor that loads in next job.

### [c] Multiprogrammed batch systems : Still processor is idle due to slow I/O.

- Read one record from file
  - Execute 100 inst 1 ms
  - Write one record 15 ms
  - %CPU-utilization 3.2%
- \*) As running there is enough memory to hold multiple user programs.
- \*) When a program wait for I/O switch to other.
- Multiprogramming

What are its issues?

# -System With 250 Mb memory (available) Sample Program Execution Attributes

Not much CPU gear

5a

- 3 Jobs are Submitted
- Since little resource contention, they can run 11y.

| Program Execution Attributes |               | JOB1      | JOB2      | JOB3      |
|------------------------------|---------------|-----------|-----------|-----------|
| Type of job                  | Heavy compute | Heavy I/O | Heavy I/O | Heavy I/O |
| Duration                     | 5 min         | 15 min    | 10 min    | 7.5 M     |
| Memory required              | 50 M          | 100 M     | 75 M      | Y         |
| Need disk?                   | No            | No        | No        | Y         |
| Need terminal?               | No            | Yes       | No        | Y         |
| Need printer?                | No            | No        | No        | Y         |



## Effects of Multiprogramming on Resource Utilization

|                    | Uniprogramming | Multiprogramming |
|--------------------|----------------|------------------|
| Processor use      | 20%            | 40%              |
| Memory use         | 33%            | 67%              |
| Disk use           | 33%            | 67%              |
| Printer use        | 33%            | 67%              |
| Elapsed time       | 30 min         | 15 min           |
| Throughput         | 6 jobs/hr      | 12 jobs/hr       |
| Mean response time | 18 min         | 10 min           |

- Multiprogramming rely on I/O interrupts and DMA
- Processor issue I/O command for one job and proceed execution another job. I/O carried by device controller.
- After I/O completes processor get interrupted by device contr.

Moving towards process model  $\Rightarrow$  Multiprogramming was designed

to keep processor and I/O devices busy to achieve maximum efficiency

(5)

④ Key mechanism: When processor get I/O Completion signals  
 $\hookrightarrow$  it switches to some other resident program.

⑤ Principal tool: Early system programmers rely on Interrupt.

(We have seen  
interrupt handling)

$\hookrightarrow$  load context  
 $\Rightarrow$  process interrupt

— — —

\* But in multiprogramming environment, Co-ordinating various such activities with multiple tasks in progress (require different steps & instructions to be performed) turns out to be very difficult. (Messy)

\* Such system vulnerable to errors, that are hard to diagnose, detect and clear up.

Four Main Causes for such error:

(i) Improper Synchronization: Read - Write I/O. (lost/duplicate signal problem)

may cause error.

(ii) Failed mutual exclusion: More than one program attempt to use same resource.

(iii) Non-determine program operation: Programs running in shared memory with interleaved execution they started to interfere in unpredictable way

(iv) Deadlocks: Two program hung up waiting for each other.

A Systematic procedure is required to handle such scenarios.

How to control various programs executing on the processor.

(Augmentation)

7

- \* Program under execution  $\equiv$  Process (How to model it)



- \* ① Process Can be realized as a data structure (May be executing or waiting)
- ② It has various attributes = State
    - = Priority (depends as req.)
  - ③ OS keep track of the states and manages the movement of process among these states. (Req. elaborated data structure)
  - ④ Process can be seen as an entity consist of number of elements.

7a



Process Control Block (PCB) :- Unique process modelling during its execution.

Created & Managed by OS.

- Hence it enables to interrupt and resume process.

(a) Identifier : Unique PID.

(b) Status : Running, blocked, suspended - - -

(c) Priority : Its relative priority.

(d) PC : Address of next instruction to be executed.

(e) Memory pointer : Pointing to code & data (also shared block if any).

(f) Content data : Processor register data required for resumption.

(g) I/O Status : Outstanding I/o req, I/o devices holding, list of files using - - -

(h) Accounting : Amount of processor & clock time used - - -  
(some book keeping).

⑧