

# Memory Hierarchy

## Virtual Memory, Address Translation



Virtual Machine

Slides contents from:

Hennessy & Patterson, 5ed. Appendix B and Chapter 2.

David Wentzlaff, ELE 475 – Computer Architecture.

MJT, High Performance Computing, NPTEL.

# Process/Program Address Space

Byte Address



# Process/Program Address Space

Byte Address



- Compiler assumes a linear address space
  - Byte 0 to Byte  $2^{32}-1$

# Process/Program Address Space

Byte Address



- Compiler assumes a linear address space
  - Byte 0 to Byte  $2^{32}-1$
- **Virtual Address space**

# Process/Program Address Space

Byte Address



- Compiler assumes a linear address space
  - Byte 0 to Byte  $2^{32}-1$
- **Virtual Address space**
- The entire process data structure may not be present in MM at all times.



# Paged Virtual Memory



# Paged Virtual Memory



MAIN MEMORY

# Paged Virtual Memory



# Paged Virtual Memory





# The Memory Hierarchy



# Address Translation Table



# Address Translation Table



# Address Translation Table



# Address Translation Table



# Virtual Memory



# Virtual Memory



# Virtual Memory

Main Memory



Page Tables

|    |       |
|----|-------|
| P1 | 0 → 0 |
|    | 1 -   |
|    | 2 -   |
|    | 3 -   |

|    |       |
|----|-------|
| P2 | 0 -   |
|    | 1 -   |
|    | 2 → 3 |
|    | 3 -   |

|    |     |
|----|-----|
| Pn | 0 2 |
|    | 1 - |
|    | 2 1 |
|    | 3 - |

Hard Disk



# Address Translation Table



# Address Translation Table



# Address Translation Table



# Implementation of Address Translation

- Process always uses virtual addresses

# Implementation of Address Translation

- Process always uses virtual addresses
- **Memory Management Unit (MMU)**: part of CPU; hardware that does address translation

# Implementation of Address Translation

- Process always uses virtual addresses
- **Memory Management Unit (MMU)**: part of CPU; hardware that does address translation
- The page tables are (at best) present in the MM (OS virtual address space)
  - One main memory reference per address translation!

• load R1, 4(R2) →  
\* all address bits  
\* addr translation → 1 MM access  
\* 1 mm access to get the data

3<sup>6</sup> entries in PageTable  
2 →

# Implementation of Address Translation

- Process always uses virtual addresses
  - **Memory Management Unit (MMU)**: part of CPU; hardware that does address translation
  - To translate a virtual memory address, the MMU has to read the relevant page table entry out of memory
-

# Implementation of Address Translation

- Process always uses virtual addresses
- **Memory Management Unit (MMU)**: part of CPU; hardware that does address translation
- To translate a virtual memory address, the MMU has to read the relevant page table entry out of memory
  - Caches recently used translations in a **Translation Lookaside Buffer (Page Table Cache)**



# Caches and Address Translation



# Caches and Address Translation



# Caches and Address Translation



# Caches and Address Translation



# Which is less preferable?

- **Physical addressed cache**
  -
- **Virtual addressed cache**
  -

# Which is less preferable?

- **Physical addressed cache**
  - ne* – Hit time higher (cache access after translation)
- **Virtual addressed cache**
  -

# Which is less preferable?

- **Physical addressed cache**
  - Hit time higher (cache access after translation)
- **Virtual addressed cache**
  - Data/instruction of different processes with same virtual address in cache at the same time ...
    -

# Which is less preferable?

- **Physical addressed cache**
  - Hit time higher (cache access after translation)
- **Virtual addressed cache**
  - Data/instruction of different processes with same virtual address in cache at the same time ...
    - Flush cache on context switch, or *process state* → page table adder
    - Include Process id as part of each cache directory entry

# Which is less preferable?

- **Physical addressed cache**
  - Hit time higher (cache access after translation)
- **Virtual addressed cache**
  - Data/instruction of different processes with same virtual address in cache at the same time ...
    - Flush cache on context switch, or
    - Include Process id as part of each cache directory entry
  - Synonyms

# Synonyms (Aliases)



**t1: Read x**



# Synonyms (Aliases)



# Synonyms (Aliases)



NPPT cache

# Synonyms (Aliases)



# Synonyms (Aliases)



- L2 uses virtual addresses



**2 copies of one physical page in the cache!**



coherence

# Overlapped Operation



# Overlapped Operation



**Virtually Indexed Physically Tagged Cache (VIPT)**

Other options: PIPT, VIVT

Direct mapped, 32 KB, 32B block,  
32b main memory address

# Recall – Cache Access

Direct mapped, 32 KB, 32B block,  
32b main memory address

32 b address

# Recall – Cache Access

Direct mapped, 32 KB, 32B block,  
32b main memory address



Direct mapped, 32 KB, 32B block,  
32b main memory address



Direct mapped, 32 KB, 32B block,  
32b main memory address



# Recall – Cache Access

Direct mapped, 32 KB, 32B block, 32b main memory address



# Recall – Cache Access

Direct mapped, 32 KB, 32B block, 32b main memory address



Direct mapped, 32 KB, 32B block,  
32b main memory address



Direct mapped, 32 KB, 32B block,  
32b main memory address



# Recall – Cache Access

Direct mapped, 32 KB, 32B block, 32b main memory address



64 b VM address

# VM Example



# VM Example

Page  
16KB

# VM Example



# VM Example



① go to decide

A blue circle containing the number "1" with an arrow pointing to the "Index (8)" box. Another arrow points from the "Index (8)" box to a cache block diagram.



# VM Example



# VM Example



# VM Example



Index comes from the Virtual Address  
(Virtually Indexed)

VIRI

Tag comes from the Physical Address  
(Physically tagged)

# VM Example



Index comes from the Virtual Address  
(Virtually Indexed)

Tag comes from the Physical Address  
(Physically tagged)

# VM Example



Index comes from the Virtual Address  
(Virtually Indexed)

Tag comes from the Physical Address  
(Physically tagged)

# VM Example



Index comes from the Virtual Address  
(Virtually Indexed)

Tag comes from the Physical Address  
(Physically tagged)

# VM Example



Index comes from the Virtual Address  
(Virtually Indexed)

Tag comes from the Physical Address  
(Physically tagged)

# VM Example



# Translation Lookaside Buffer

- Cache of page table mappings
- 32 – 4096 entries long
  - SA, FA, or DM
- Dirty flag – use during page write back
- Ref – used for LRU

Diagram illustrating the structure of a Translation Lookaside Buffer (TLB) entry:

The diagram shows a table with 6 columns, each containing two rows of data. The columns are labeled: VPN (tag), PPN (data), Valid, Ref, Dirty, and Access Rights.

Annotations in blue:

- A checkmark is placed above the first row of the VPN (tag) column.
- A checkmark is placed above the first row of the PPN (data) column.
- An arrow points from the Ref column to the "applying LRU" text.
- A checkmark is placed above the first row of the Dirty column.

Handwritten text "applying LRU" is written diagonally across the Ref and Dirty columns.

| VPN<br>(tag) | PPN<br>(data) | Valid | Ref | Dirty | Access Rights |
|--------------|---------------|-------|-----|-------|---------------|
|              |               |       |     |       |               |
|              |               |       |     |       |               |

# Page Fault

- Virtual address generated by processor is not available in main memory

# Page Fault

- Virtual address generated by processor is not available in main memory
- Detected on attempt to translate address
  - Page Table entry is invalid

# Page Fault

- Virtual address generated by processor is not available in main memory
- Detected on attempt to translate address
  - Page Table entry is invalid
- Must be 'handled' by operating system
  - Identify slot in main memory to be used
  - Get page contents from disk
  - Update page table entry
- 



# Page Fault

- Virtual address generated by processor is not available in main memory
- Detected on attempt to translate address
  - Page Table entry is invalid
- Must be ‘handled’ by operating system
  - Identify slot in main memory to be used
  - Get page contents from disk
  - Update page table entry
- Provide data to the processor

# Abstraction: Virtual vs. Physical Memory

- Programmer sees virtual memory
  - Can assume the memory is “infinite”

  



# Abstraction: Virtual vs. Physical Memory

- Programmer sees virtual memory
  - Can assume the memory is “infinite”
- Reality: Physical memory size is much smaller than what the programmer assumes

# Abstraction: Virtual vs. Physical Memory

- Programmer sees virtual memory
    - Can assume the memory is “infinite”
  - Reality: Physical memory size is much smaller than what the programmer assumes
  - The system (system software + hardware, cooperatively) maps virtual memory addresses are to physical memory
    - The system automatically manages the physical memory space transparently to the programmer
-

# Abstraction: Virtual vs. Physical Memory

- + Programmer does not need to know the physical size of memory nor manage it → A small physical memory can appear as a huge one to the programmer → Life is easier for the programmer
- More complex system software and architecture

---

A classic example of the programmer/(micro)architect tradeoff



# Virtual Memory

- What is the size of the Page Table?
  - Where is it stored?
  - What factors decide the size of a page?  
    - What are its side effects?
  - Page size is constant/variable?
  -

