

Strategies for Memory Allocation: The most popular strategies are: First fit, Best fit, & Worst fit.  
(They select a free hole from the set of available holes)

1) First fit: Allocate the first hole that is big enough.  
Searching can start either at the beginning of the set of holes or at the location where the previous first-fit search ended. We can stop searching as soon as we find a free hole that is large enough.

eg: Given 5 mem. partitions of 100 KB, 500 KB, 200 KB, 300 KB, 600 KB (in order), how would the first fit, best fit & worst fit algs. place the processes of 212 KB, 417 KB, 112 KB & 426 KB (in order).

Ans:  
P1 - 212 KB — 500 KB  
P2 - 417 KB — 600 KB  
P3 - 112 KB — 200 KB  
P4 - 426 KB — Must wait



Note: We have 2 free mem. blocks 100 KB & 300 KB but we are not able allocate mem. for 426 KB process.

2) Best fit: Allocate the smallest hole that is big enough. We must search for the entire list, unless the list is ordered by size. This strategy produces the smallest leftover hole.

Ans:  
P1 - 212 KB — 300 KB  
P2 - 417 KB — 500 KB  
P3 - 112 KB — 200 KB  
P4 - 426 KB — 600 KB



3) Worst Fit: Allocate the largest hole. Again we need to search the entire list unless it is sorted by size. This strategy produces the largest leftover hole, which may be more useful than the smallest leftover hole from the best fit approach.

P1 - 212 KB - 600 KB

P2 - 417 KB - 500 KB

P3 - 112 KB - 300 KB

P4 - 426 KB - Must wait

| OS        |
|-----------|
| 100 KB    |
| 500 KB P2 |
| 200 KB    |
| 300 KB P3 |
| 600 KB P1 |

### Implementation in lab:

IIp: Enter no. of blocks: 5

Enter block 0 size:

Enter block 1 size:

Enter block 4 size:

Enter no. of processes: 4

Enter process 0 size:

Enter process 3 size:

Olp: The process 0 is allocated to block:

The process 3 is not allocated.

Fragmentation: There are mainly 2 types of fragmentation

we have 1) External Fragmentation

2) Internal Fragmentation

D) External Fragmentation: As processes are loaded & removed from mem., the free mem. space is broken into little pieces. external fragmentation exists when there is enough

total mem. space to satisfy a request but the available space is not contiguous.

e.g. First fit & best fit strategies

50% Rule: statistical analysis of first fit reveals that, even with some optimization, given  $N$  allocated blocks, another  $0.5N$  blocks will be lost to fragmentation i.e. one third ( $\frac{1}{3}N$ ) of mem. is unusable! this property is known as 50% (50 percent) rule.

e.g. 100 blocks

if 40 blocks are allocated  $\Rightarrow$  20 blocks will be lost for fragmentation.

Soln for External fragmentation:  
1) compaction  
2) Non-contiguous allocation

i) compaction: The goal here is to shuffle the mem. contents so as to place all free mem. together in 1 large block.

\* compaction can't be done if relocation is static or load time & it's only done if relocation is dynamic.

disadv: Could be expensive.

ii) Non-contiguous mem. allocation: Allowing a process to be allocated physical mem. wherever such mem. is available. Two complementary techniques achieves this btm.  
1) Paging    2) Segmentation.

2) Internal Fragmentation: Consider a multiple-partitioned allocation scheme with a hole of 2000 bytes. Suppose that the next process requests 1998 bytes. If we allocate

exactly the requested block, we are left with 2 bytes of hole. This wasted/unused mem. that is internal to a block is called as Internal fragmentation.

\* "Mem. fragmentation occurs when a sys contains mem. that is technically free but the computer can't utilise"

\* ii) Non-Contiguous Mem. Allocation: Mem. for a process can be allocated in a non-contiguous manner. There are 2 techniques to do this a) Paging, b) Segmentation

a) Paging: It is a mem-mgt scheme that permits the phy address space of a process to be non-contiguous. Paging avoids external fragmentation & the need for compaction.

It also solves the considerable prblm of fitting mem. chunks of varying sizes onto the backing store; most mem. mngt schemes used before the introduction of paging suffered from this prblm. The prblm arises because some code fragments or data residing in main mem. need to be swapped out, space must be found on the backing store. The backing store has the same fragmentation prblms, but access is much slower, so compaction is impossible. Because of its advantages, over earlier methods, paging in its various forms is used in most os.

Basic Method: The basic method for paging involves breaking Physical mem. into fixed sized blocks called

frames & breaking logical mem into same size called pages. When a process is to be executed, its pages are loaded into any available mem. frames from their source (a file sys or a backing store). The backing store is divided into fixed-sized blocks that are of the same size as the mem. frames.



Fig: Paging Model

When CPU executes a process, it knows only the logical address space of a process & hence there should be some mapping of logical address to the physical address of a process. & as process is allocated with non-contiguous mem locations, the base address is not enough. Hence we need a mapping to all the framed pages to the physical address space. This can be achieved by using page table. Page table stores the physical address of frames.

If the CPU needs physical address of a particular page it will check the index of the page table & the induced location consists of correct actual phy addr of a page.

Eg: Consider the logical addr space as 16 bytes ( $2^4$ )<sup>gen</sup>  
 represented by  $2^m$ , where  $m=4$  here). & page size as  
 4 bytes ( $2^n$  rep, where  $n=2$ ), also phy mem. as 32 bytes.

$$\text{page size} = 2^n = 2^2 = 4 \text{ Bytes}$$

$$\text{Logical addr. Space} = 2^m = 2^4 = 16 \text{ B}$$

$$\text{physical addr. Space} = 32 \text{ B.}$$

$$\therefore \text{No. of Pages} = \frac{\text{LAS}}{\text{page size}} = \frac{2^m}{2^n} = \frac{2^4}{2^2} = 2^2 = 4$$

$$\text{No. of frames} = \frac{\text{PAS}}{\text{page size}} = \frac{32}{4} = 8 \text{ frames.}$$

| Page no. | 0 | 1 | 2 | 3 |
|----------|---|---|---|---|
| 0        | a | b | c | d |
| 1        | e | f | g | h |
| 2        | i | j | k | l |
| 3        | m | n | o | p |

|   |   |
|---|---|
| 0 | 5 |
| 1 | 6 |
| 2 | 1 |
| 3 | 2 |

Page Table  
 (contains frame)  
 Number

Logical Mem.

|    |    |    |    |    |    |    |    |
|----|----|----|----|----|----|----|----|
| 0  | 1  | 2  | 3  | 4  | 5  | 6  | 7  |
| 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 |
| 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 |
| 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 |
| 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 |
| 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 |
| 9  | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
| 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |

### Prg. Paging Example

If we want to know the physical address of a corresponding page, then we need to consider page table. As an ex., the first page

physical Mem.

of the logical addr. space of process, (abcd) is mapped to 5<sup>th</sup> frame in physical address.

$$LA \rightarrow 0 \rightarrow 5^{\text{th}} \text{ frame} \rightarrow PA$$

To determine the actual phys. addresses, we have to multiply page size with the frame no. i.e., Frame no.  $\times$  page no.

\* starting loc. of frame =  $5 \times 4 = 20$ .

It means that the 1<sup>st</sup> page of the process (CLAS) is present in a frame which starts at a loc. of 20. But if we want to know the exact loc. of each byte in phy. mem, say the loc. of 'c' in phy mem. 1<sup>st</sup> we need to consider the page table to access the frame no. & then we need to add an offset (the byte no. in LA) to the frame no. address.

i.e.,  $(5 \times 4) + 2 = 22$  (PA of 'c').

In this way, the logical address is divided into 2 parts page no.(p) & page offset(cd). We have to have the inf. of these 2 variables to correctly identify the location of any instructional data in the physical address space.

e.g1: page 0, offset 0 ( $p=0 + d=0$ )  $\Leftrightarrow LA = 0$

$$\Rightarrow PA = (5 \times 4) + 0 = 20$$

e.g2: Page 1, offset 2 ( $p=1 + d=2$ )  $\Leftrightarrow LA = 6$

$$\Rightarrow PA = (6 \times 4) + 2 = 26$$

$\overset{\uparrow}{LA}$      $\overset{\uparrow}{\text{No. of}}$   
 $\text{page}$

We need a special hw to perform this mapping.

The address generated by CPU is divided into 2 parts, the 1<sup>st</sup> part denotes the page no. & is indexed into page table.



which gives frame no.. The frame no. & along with the page offset (2<sup>nd</sup> part) is used to find out the location of physical address.

If the logical address space =  $2^m$  & page size =  $2^n$ , then higher order  $m-n$  bits of a logical address designate the page no. & the  $n$  lower-order bits designate the page offset.

Thus, the logical address is as follows:



The physical address space is as follows:



frame no. frame offset

\* Organization of Logical Address Space (LAS):

\* LAS is divided into equal no. of pages.

\* Page size is always a power of 2 (i.e.,  $2^K$ ,  $K > 0$ )

$$\text{No. of pages (N)} = \frac{\text{LAS}}{\text{PS}} = \frac{2^{\text{LA}}}{2^K} = 2^{\text{LA}-K}$$

\* No. of bits needed to locate all pages  $\Rightarrow p = \log_2 N$

\* When we use paging scheme, we have NO External Fragmentation.

- Allocation: Any free frame can be allocated to a process that needs it.

✓ However we may have some internal fragmentation.

Notice that frames are allocated as units. If the mem. requirements of a process do not happen to coincide may not be completely full. In the worst case, a process would need 'n' pages plus 1 byte. It would be allocated n+1 frames, resulting in internal fragmentation of almost an entire frame.

✓ If process size is independent of page size, we expect internal fragmentation to avg one-half page per process.

This consideration suggests that small page sizes are desirable. However, overhead is involved in each page-table entry. & this overhead is reduced as the size of the pages increase.

Also disk I/O is efficient when the amt of data being transferred is larger. Generally page sizes have grown overtime as processes, data sets & main mem. have become larger. Today, pages typically are b/w 4KB & 8KB in size, & some systems support even larger page sizes.

2 GB RAM  $\rightarrow$  4KB = page size

$$\text{No. of frames} = \frac{2\text{GB}}{4\text{KB}} = \frac{2^{30}\text{B}}{2^{12}\text{B}} = 2^{30-12} = 2^{18}$$

When a process arrives in the sys to be executed, its size, expressed in pages, is examined. Each page of the process needs 1 frame. Thus a process requires 'n' pages, atleast  $n$  frames must be available in memory. If 'n' frames are available, they are allocated to this arriving process. The first page of the process is loaded into one of the allocated frames, & the frame no. is put in the page table for that

process. The next page is loaded into another frame, its frame no. is put into the page table & so on.

✓ An important aspect of paging is the clear separation b/w the user's view of mem & the actual physical mem. The user prg views mem as a single space, containing only this prg. In fact the user prg is scattered throughout phy mem, which also holds other prgs. The mapping from logical to phy is hidden from the user & done by MMU (os).

\* Since the os is managing phy mem., it must be aware of the allocation details of phy mem - which frames are allocated & which frames are available, how many total frames are there & so on. This inf. is generally kept in a data structure known as frame Table. The frame table has 1 entry for each physical page frame, indicating whether the latter is free or allocated, & if it is allocated, to which page of which process (os) processes.



Fig: (a) Before Allocation

(b) After Allocation

Paging Hardware: The hardware implementation of paging can be done in several ways.

✓ Page table can be implemented as a set of dedicated registers. Registers are faster but if the page table is large (i.e., 1 million entries) then use of registers may not be feasible. Instead, a page table is kept in main memory, + a page Table Base Register (PTBR) points to the page table changing the page table requires changing only 1 register, substantially decreasing the context switch time.

✗ The problem with this approach is the time required to access a user memory location. If we want to access location  $i$ , we must first index into the page table, using the value in PTBR offset by the page no. of  $i$ . This task requires a memory access. It provides us with a frame no., which is combined with the page offset to produce the actual address. We can then access the desired place in mem. With this scheme, 2 mem accesses are needed to access a byte (one for page table entry, 1 for the byte). Thus memory access is slowed by a factor of 2. This delay would be intolerable.

Soln: The standard soln is to use a special, small, fast lookup hw cache, called a translation lookaside Buffer (TLB). The TLB is associative, high speed mem.. Each entry in TLB consists of 2 parts: a key (tag) & a value. When the associative mem. is presented with an item, the item is compared with all keys simultaneously. If the item is

found, the corresponding value field is returned. The search is fast; the b/w however, is expensive. Typically the no. of entries in TLB is small, often numbering b/w 64 & 1024.

✓ The TLB contains only a few of the page table entries. When a logical address is generated by the CPU, its page no. is presented to the TLB. If the page no. is found, its frame no is immediately available & is used to access mem. The whole task may take less than 10% longer than it would be if an unmapped mem. reference is used.

If the page no. is not in the TLB (TLB Miss), a mem. reference to the page table must be made. When the frame no. is obtained, we can use it to access mem. In addition we add the page no & frame no. to the TLB, so that they will be found quickly on the next reference. If the TLB is already full of entries, the OS must select one for replacement. Replacement policies range from least recently used (LRU) to Random. Furthermore some TLBs allow certain entries to be wired down, meaning that they cannot be removed from the TLB. Typically TLB entries for kernel code are wired down.

✓ The percentage of times that a particular page no. is found in the TLB is called the hit ratio. An 80% hit ratio, for example, means that we find the desired page no. in the TLB 80% of the time. If it takes 20ns



Fig: Paging Hardware with TLB

to search the TLB for 100 ns to access memory, then a mapped-mem access takes  $20\text{ns} + 100\text{ns} = 120\text{ns}$ . When the page no. is in the TLB. If we fail to find the page no. in the TLB ( $20\text{ns}$ ), then we must access mem. for the page table & frame no. ( $100\text{ns}$ ) & then access the designated byte in mem ( $100\text{ns}$ ), for a total of  $220\text{ns}$  ( $\frac{20+100}{100}$ ).

To find Effective Mem. access Time, we weight the case by its probability

$$EAT = (0.80 * 120) + (0.20 * 220)$$

$$\therefore \text{EAT} = 140\text{ns}/$$

In this example, we suffer a 40% slowdown in mem. access time (from 100ns to 140 ns).

For a 98% hit ratio, we have

$$EAT = (0.98 * 120) + (0.02 * 220)$$

$$= 122\text{ns}/$$

This 1% hit ratio produces only a 22% slowdown in Access Time.

$$* \text{ TLB hit ratio } (x) = \frac{\text{No. of Hits}}{\text{Total No. of references}}$$

Notations: TLB access time =  $c$  ( $c < m$ )

TLB hit ratio =  $x$ .

Mem. access time =  $m$

$$\checkmark \text{EMAT}_{\text{TLB}} = x(c+m) + (1-x)(c+2m)$$

hit success                      miss (failure)

$$\checkmark \text{EMAT} = 2m$$

(Mem)

eg: What hit ratio is required to reduce the EMAT from 300ns (without TLB) to 190 ns (with TLB). Assume cache access time is 50ns.

Ans: Given that  $c = 50$  ns

$$x = ?$$

$$2m = 300 \text{ ns}$$

$$m = 150 \text{ ns},$$

$$\text{EMAT}_{\text{TLB}} = 190 \text{ ns}.$$

$$190 = x(50+150) + (1-x)(300+50)$$

$$= 200x + 350 - 350x$$

$$190 = 350 - 150x$$

$$x = 160/150 = 1.06,$$

Protection: Memory protection in a paged environment is accomplished by protection bits associated with each frame. One additional bit is generally attached to each entry in the page table; a valid-invalid bit. When this bit is set to "valid"; The associated page is in the process logical address space & thus a legal (or valid) page. When

the bit is set to "invalid". The page is not in the logical address space. Illegal addresses are trapped by use of valid-invalid bit. The OS sets each bit from each page to allow or disallow access to page.

| frame No. | Valid / Invalid |   |
|-----------|-----------------|---|
|           | ↓               | ↓ |
| 0         | 2               | V |
| 1         | 3               | V |
| 2         | 4               | V |
| 3         | 7               | V |
| 4         | 8               | V |
| 5         | 9               | V |
| 6         | 0               | i |
| 7         | 0               | i |

|        |
|--------|
| 0      |
| 1      |
| 2      |
| 3      |
| 4      |
| 5      |
| 6      |
| 7      |
| 8      |
| 9      |
| :      |
| page n |

Fig: Valid(V) or invalid(i) bit in Page Table

\* In this fig., the pages 0, 1, 2, 3, 4, & 5 are mapped normally through the page table. Any attempt to generate an address in pages 6, or 7, however will find that the valid-invalid bit is set to invalid, & the computer will trap to the OS.

✓ Many processes use only a small fraction of the address space available to them. It would be wasteful in this case to create a page table with entries for every page in address range. Most of this table would be unused but would take up valuable mem. space. Some systems provide h/w, in the form of a ~~Page table length register~~ (PTLR), to indicate the size of the page table. This value is checked against every logical address to verify that the address is in the valid range for the process.

Shared Pages: An advantage of paging is the possibility of sharing common code. This consideration is particularly important in a time-sharing environment.

\* Assume 3 users/processes each of whom executes a text editor. If the text editor consists of 150KB of code & 50KB of data space, then we need 800KB of space of main memory.

The code can be shared b/w the processes as shown below



- \* The code that is being shared should be Reentrant (Non self modifying code) i.e., should be read only code.
- \* Only one copy is kept in physical mem. which saves memory.

\* Other heavily used programs like - compilers, window systems, libraries, databases & so on can also be shared.

Disadvantage of Paging: Paging differentiates the user's view of mem. from the actual physical mem. The user's view of mem. (logical address space) is not same as the physical mem. The logical mem/ user's view is mapped onto actual

physical memory. This mapping allows differentiation b/w logical mem & physical memory.

② Segmentation: Users generally think of a prg as a main prg with set of methods, procedures or fns. It may also

include various datastructures: objects, arrays, stacks, variables, & so on. Each of these data elements or

modules are referred by a name. We talk abt "the stack", "the math library", "The main prg", without caring what addresses in memory these elements occupy. Each of these

is considered as a segment & each segment is of variable length, which is defined by the purpose of the

segment in the prg. Elements within a segment are identified by their offset from the beginning of the segment.

✓ Segmentation is a mem. mgmt scheme that supports the user's view of mem.. A logical address space is a collection of segments. Each segment has a name & a length. The

addresses specify both the segment name & the offset within the segment. For simplicity of implementation, segments are numbered & are referred to by a segment no., rather than by a segment name. Thus logical address consists

of 2 tuples:  $\langle \text{Segment number}, \text{offset} \rangle$

✓ Normally, the user prg is compiled, & the compiler automatically constructs segments reflecting the ilp prg.

\* A 'C' compiler might create separate segments for the

following.

1) The code

2) Global Variables

3) The heap, from which mem. is allocated

4) The stacks used by each thread

5) The Standard C Library

Libraries that are linked during compile time might be assigned separate segments. The loader would take all these segments & assign them segment nos.

Eg: consider 5 segments numbered

from 0 through 4. The segments are stored in physical memory as shown below. The Segment table has a separate entry for each segment, giving the beginning address of the segment in physical mem. (or Base) & length of that



Fig: User's view of a prg.

Segment (or limit).



Fig: Example for Segmentation

|   | Limit | Base |
|---|-------|------|
| 0 | 1000  | 1400 |
| 1 | 400   | 6300 |
| 2 | 400   | 4300 |
| 3 | 1100  | 3200 |
| 4 | 1000  | 4700 |

Segment table



- Mapping addresses: For eg, a reference to 53 byte of Segment 2 is mapped onto location  $4300 + 53 = 4353$ .
- \* A reference to segment 3, byte 852  $\Rightarrow 3200 + 852 = 4052$ .
  - \* Physical address = Base address of segment + offset.
- And offset must be checked against limit.  
 $\therefore \text{offset} < \text{limit}$ .

Fig: Segmentation Hardware



### Virtual Memory Management:

Virtual Mem. involves the separation of logical mem. as perceived by users from physical mem. This separation allows an extremely large virtual memory to be provided.

for programmers when only a smaller physical mem. is available.

Fig: Diagram showing Virtual Mem. that is larger than available



\* OS creates an illusion of having larger mem. by taking advantage of disk. This illusioned mem. can be called as virtual memory.

Demand paging: Load the pages only when they are needed (Load on Demand)

✓ This is the common technique used in virtual mem. system.  
- With demand paged virtual mem., pages are only loaded when they are demanded during prg execution; pages that are never accessed are thus never loaded into physical mem.

✓ A demand-paging system is similar to a paging system with swapping where processes reside in secondary memory (usually a disk). When we want to execute a process, we swap it into memory. Rather than swapping the entire process into mem. However, we use a lazy swapper. A lazy swapper never swaps a page into mem. unless that page will be needed. Since we are now viewing a process as a sequence of pages, rather than as 1 large contiguous add. space, use of the term swapper is technically incorrect.

A swapper manipulates entire processes, whereas a pagger is concerned with the individual pages of a process. We thus use pagger, rather than swapper, in connection with demand paging.

✓ With this scheme, we need some form of hw support to distinguish b/w the pages that are in mem. & the pages that are on the hard disk. The valid-invalid bit

Scheme can be used for this purpose. This time, when the bit is set to "valid", the associated page is both legal & in mem. If the bit is set to "invalid", the associated page either is not valid (i.e., not in logical address space of process) or is valid but is currently on disk.

The page table entry for a page that is in mem. is set as usual, but the page table entry for a page that is not currently in mem. is either simply marked invalid or contains the address of the page on disk.



\* Page Fault: Whenever the process tries to access a page that is not in mem. (marked as invalid) causes a page fault.

Handling a page fault: Whenever a page fault occurs the os need to bring the corresponding page into mem. & restart the instruction execution.

- 1) Check the page Table (Done by Loader) for the process to determine whether the reference was a valid or an invalid mem. access.



- 2) If the reference was invalid, we terminate the process. If it was valid, but we have not yet brought in that page, we now page it in (causes trap to OS).
- 3) We find a free frame.
- 4) We schedule a disk read operation to read the desired page into the newly allocated frame.
- 5) When the disk read is complete, we modify the page table to indicate that the page is now in mem.
- 6) We restart the instruction that was interrupted by the trap. The process can now access the page as though it had always been in mem.

❖ Pure Demand Paging:- In the extreme case, we can start executing a process with no pages in mem. This causes page fault & then the 1<sup>st</sup> page is brought to mem.

\* Performance of Demand Paging: Demand paging can significantly affect the performance of a Computer Sys.

To see why, let's compute the effective access time for a demand-paged mem. For most computer systems, mem access time ranges from 10-200ns. As long as we have no page faults, the effective access time is equal to the mem. access time. If however, a page fault occurs, we must first read the relevant page from disk & then access the desired word.

Let 'p' be the probability of a page fault ( $0 \leq p \leq 1$ ), & 'm' is the mem. access time.

We would expect 'p' to be close to zero - i.e., few page faults.

$$\text{Effective Access Time} = \frac{(1-p) \cdot m + p \cdot \text{page fault time}}{\text{Time}}$$

To compute the effective access time, we must know how much time is needed to service a page fault. A page fault causes the following sequence to occur:

- ① Trap to OS.
- ② Save the user registers & process state.
- ③ Determine the interrupt was a page fault.
- ④ Check the page reference was legal & determine the location of the page on disk.
- ⑤ Issue a read from the disk to a free frame:
  - a) Wait in a queue for this device until a read request is serviced.
  - b) Wait for device seek & or latency time.
  - c) Begin the transfer of the page to a free frame.
- ⑥ While waiting, allocate the CPU to some other process/user.