

VIETNAM NATIONAL UNIVERSITY HO CHI MINH CITY  
HO CHI MINH CITY UNIVERSITY OF TECHNOLOGY  
FACULTY OF COMPUTER SCIENCE AND ENGINEERING



## REPORT ASSIGNMENT

### Operating System

Class: CC01 - HK251  
INSTRUCTOR: NGUYEN PHUONG DUY

TOPIC: DESIGN SIMPLE OPERATING SYSTEM

| student's name   | Student's ID |
|------------------|--------------|
| Le Phuc Khang    | 2352471      |
| Tran Quoc Thang  | 2353125      |
| Tran Xuan Hao    | 2252191      |
| Nguyen Tuan Ngoc | 2352815      |
| Phan Tuan Kiet   | 2352654      |

## GROUP WORK RESULTS REPORT

| Number | Full name        | Student ID | Contribution |
|--------|------------------|------------|--------------|
| 1      | Le Phuc Khang    | 2352471    | 100%         |
| 2      | Tran Quoc Thang  | 2353125    | 100%         |
| 3      | Nguyen Tuan Ngoc | 2352815    | 100%         |
| 4      | Phan Tuan Kiet   | 2352654    | 100%         |
| 5      | Tran Xuan Hao    | 2252191    | 100%         |

## Contents

|                                                                                       |           |
|---------------------------------------------------------------------------------------|-----------|
| <b>1 Theoretical Background</b>                                                       | <b>3</b>  |
| 1.1 Scheduler . . . . .                                                               | 3         |
| 1.2 Memory Management . . . . .                                                       | 4         |
| 1.2.1 Memory management . . . . .                                                     | 4         |
| 1.2.2 Multi-level paging . . . . .                                                    | 5         |
| 1.2.3 Comparative Analysis of Page Table Architectures . . . . .                      | 6         |
| <b>2 Implementation</b>                                                               | <b>13</b> |
| 2.1 Scheduler . . . . .                                                               | 13        |
| 2.1.1 Queue Operations (Implementation of Round-Robin) . . . . .                      | 13        |
| 2.1.2 MLQ Scheduler Implementation (sched.c) . . . . .                                | 14        |
| 2.1.3 Scheduler Algorithm Conclusion . . . . .                                        | 17        |
| 2.2 Memory Management . . . . .                                                       | 19        |
| 2.2.1 Memory management . . . . .                                                     | 19        |
| 2.2.2 Multi-level paging . . . . .                                                    | 23        |
| <b>3 Interprets the results of running tests</b>                                      | <b>28</b> |
| 3.1 Scheduling . . . . .                                                              | 28        |
| 3.1.1 The result input file of scheduler (sched) . . . . .                            | 28        |
| 3.1.2 Scheduling: Gantt chart . . . . .                                               | 30        |
| 3.2 Memory . . . . .                                                                  | 30        |
| 3.2.1 Result of the input file of Memory (os_0_mlq_paging) . . . . .                  | 30        |
| 3.2.2 Explain the status of the memory allocation in heap and data segments . . . . . | 33        |
| 3.2.3 Chronological Analysis of Memory Allocation . . . . .                           | 34        |
| 3.2.4 Summary of Result Flow . . . . .                                                | 35        |
| <b>4 Answer question</b>                                                              | <b>36</b> |
| 4.1 Question 1 . . . . .                                                              | 36        |
| 4.2 Question 2 . . . . .                                                              | 36        |
| 4.3 Question 3 . . . . .                                                              | 36        |
| 4.4 Question 4 . . . . .                                                              | 41        |
| 4.5 Question 5 . . . . .                                                              | 42        |
| 4.6 Question 6 . . . . .                                                              | 43        |
| 4.7 Question 7 . . . . .                                                              | 44        |
| <b>5 Overall</b>                                                                      | <b>53</b> |
| <b>6 Source Code</b>                                                                  | <b>53</b> |

# 1 Theoretical Background

## 1.1 Scheduler

The scheduler is the central component of the kernel responsible for deciding which process runs at any given time. Our implementation is grounded in the theoretical framework of CPU Scheduling described in Chapter 5 of *Operating System Concepts*. Specifically, we construct a **Symmetric Multiprocessing (SMP)** system utilizing a **Preemptive Multilevel Queue (MLQ)** algorithm with a **Time-Slicing allocation strategy** to address starvation.

### 1. Multilevel Queue Scheduling (Theory of Partitioning):

- **Theoretical Basis:** Processes in a system often have different response-time requirements (e.g., interactive/foreground vs. batch/background). A Multilevel Queue (MLQ) algorithm partitions the Ready Queue into several separate queues, each assigned to a different class of processes.
- **System Design:** In our assignment, we define `MAX_PRIO` distinct queues. Processes are permanently assigned to a specific queue based on their `priority` attribute. Unlike Multilevel Feedback Queues (MLFQ), which allow processes to move between queues to define their behavior dynamically, our standard MLQ implementation reduces runtime overhead ( $\mathcal{O}(1)$  queue selection) by maintaining fixed process classification.

### 2. Intra-Queue Scheduling: Round-Robin (RR) and Preemption:

- **Theoretical Basis:** To support time-sharing, Round-Robin (RR) scheduling is theoretically defined as FCFS scheduling with **Preemption**. A small unit of time, called a *Time Quantum*, is defined. If a process's CPU burst exceeds this quantum, the system timer generates an interrupt, causing the OS to preempt the process and move it to the tail of the ready queue.
- **System Design:** We implement this by assigning a `time_slot` to the system. The `queue.c` module strictly enforces FIFO behavior (`dequeue` from head, `enqueue` to tail), which, combined with the timer interrupt in `os.c`, realizes the Round-Robin logic. This ensures **fairness** among processes of the same priority and minimizes the average response time.

### 3. Inter-Queue Scheduling: Solving Starvation via Time-Slicing:

- **The Starvation Problem:** A fundamental flaw of fixed-priority scheduling is **Indefinite Blocking** (or Starvation), where low-priority tasks may never execute if a continuous stream of high-priority tasks arrives.
- **Theoretical Solution:** The textbook proposes two solutions: *Aging* or *Time Slicing among queues*. In the Time Slicing approach, each queue gets a certain portion of the CPU time (e.g., 80% for foreground, 20% for background).
- **System Design (Slot Mechanism):** We implement the **Time Slicing** solution using a deterministic slot mechanism. Each queue  $i$  is allocated a specific budget:  $slot[i] = MAX\_PRIO - i$ . The scheduler iterates through queues; if a high-priority queue consumes its allocated slots, the scheduler is theoretically forced to "yield" to lower-priority queues, even if the high-priority queue is non-empty.

This converts the algorithm from "Absolute Priority" to "Proportional Share," mathematically guaranteeing that even the lowest priority queue eventually receives CPU cycles.

#### 4. Symmetric Multiprocessing (SMP) and Concurrency Control:

- **Theoretical Basis:** According to the theory of *Multiple-Processor Scheduling*, modern systems predominantly use Symmetric Multiprocessing (SMP), where each processor is self-scheduling. The literature outlines two primary strategies for organizing threads: (1) Per-processor private queues, and (2) A single **Common Ready Queue**. While private queues offer cache efficiency, the Common Ready Queue approach provides a unified pool of work but introduces specific synchronization challenges.
- **System Design (Load Balancing):** We adopt the **Common Ready Queue** architecture. As noted in the theoretical framework, this design naturally handles the distribution of workloads. Since all processors examine the same global queue to select a thread, the system inherently avoids the workload disparity often found in private-queue architectures. No complex balancing algorithms are needed because no processor remains idle while tasks exist in the global pool.
- **Synchronization Theory:** The textbook explicitly warns that a shared ready queue creates a potential **Race Condition**, where two separate processors might attempt to schedule the same thread simultaneously or corrupt the queue structure. To mitigate this, as mandated by the principles of concurrency control, we implement a locking mechanism (`pthread_mutex_t`). This lock protects the critical section (the shared queue), ensuring that enqueue and dequeue operations are strictly serialized, albeit at the cost of potential lock contention.

## 1.2 Memory Management

### 1.2.1 Memory management

The fundamental task of an operating system is to manage the memory hierarchy and provide an abstraction mechanism to separate the user process from the physical details of the hardware.

**a. Logical vs. Physical Address Space** The central concept of memory management is the separation of logical address space from physical address space:

- **Logical Address:** An address generated by the CPU during program execution. To a process, memory appears as a flat, continuous space ranging from address 0 to  $Max_{Virtual}$ . The set of all valid logical addresses for a process is defined as the *Logical Address Space*.
- **Physical Address:** The actual address seen and accessed by the memory unit on the physical RAM.
- **Memory Management Unit (MMU):** A hardware device responsible for the run-time mapping from virtual to physical addresses. Under this scheme, the user program never interacts with the real physical addresses.

**b. Process Memory Layout** Although the virtual address space is linear, logically a process is typically organized into distinct sections to separate access rights and usage purposes:

- **Text section:** contains the executable code of the program (usually read-only).
- **Data section:** stores global and static variables.
- **Heap:** region for dynamic allocations that conceptually grows upward.
- **Stack:** holds stack frames and local variables, growing downward.

In full-featured UNIX-like systems, each of the above logical sections is usually represented by one or more Virtual Memory Areas (VMAs) maintained in a linked list inside the process's `mm_struct`.

In our simple OS implementation, we adopt a slightly different but simpler model. Each process owns a single user `vm_area_struct` that covers its whole logical user space (`vmaid = 0`). Inside this VMA, we represent individual logical "segments" by region descriptors `vm_rg_struct`, whose start and end addresses (`rg_start`, `rg_end`) are recorded in the symbol region table `symrgtbl[]`. Free holes between regions are maintained in the VMA's `vm_freerg_list`.

Therefore, in this project:

- segment  $\approx$  `vm_rg_struct` region inside one VMA,
- while the VMA itself simply defines the outer bounds of the user address space for the process.

### 1.2.2 Multi-level paging

To implement memory mapping in modern architectures, **Paging** is widely used to eliminate external fragmentation.

**a. The Page Table Problem in 64-bit Systems** In traditional 32-bit architectures, a two-level paging scheme is often sufficient. However, with a 64-bit architecture ( $2^{64}$  bytes of address space), the size of the page table becomes prohibitively large if stored contiguously.

Hierarchical Paging is the optimal solution is to divide the page table into smaller pieces. This allows the operating system to avoid allocating the entire page table structure in contiguous memory. Only the necessary sub-tables are allocated (on-demand allocation).

**b. 5-Level Paging Architecture** The system is designed based on an extension of the Intel x86-64 architecture, utilizing a **5-level paging scheme** to support a significantly larger virtual address space (up to 57-bit linear address).

The address translation process (also known as the **Page Walk**) occurs sequentially through 5 tables:

1. **PGD (Page Global Directory):** The top-level table (Level 5).
2. **P4D (Page Level 4 Directory):** The Level 4 table.
3. **PUD (Page Upper Directory):** The Level 3 table.
4. **PMD (Page Middle Directory):** The Level 2 table.

5. **PT (Page Table):** The final level (Level 1), containing the mapping to the physical frame.

Each entry in the upper-level tables points to the base address of the next lower-level table. The entry in the PT (Page Table Entry - PTE) contains the actual **Physical Frame Number (PFN)**.

### 1.2.3 Comparative Analysis of Page Table Architectures

#### a. Hierarchical Paging

When applying Single-level paging on a 64-bit system with a 4KB ( $2^{12}$ ) page size, the virtual address space contains approximately  $2^{52}$  pages. If a single page table were used, the Operating System would be required to allocate a massive, contiguous block of physical memory to store all Page Table Entries for each process. This is practically impossible due to the sheer size exceeding RAM limits and the difficulty of finding contiguous memory caused by fragmentation.

Hierarchical Paging resolves this by dividing the logical address into multiple parts to manage the page table as a tree structure (typically 4-level or 5-level paging in modern systems). This method offers two key advantages:

- **Memory Efficiency:** The system does not need to store the entire page table structure in RAM. Due to the sparse nature of address spaces, Inner Tables are only initialized for the virtual memory regions that the process actually uses.
- **Non-contiguous Allocation:** Only the highest-level table (Outer Page Table)—which is usually very small—needs to be contiguous. The Inner Tables can be scattered anywhere in physical memory, reducing the burden on the OS's memory management.

However, this method comes with significant disadvantages regarding performance and complexity:

- **Access Latency:** The number of memory accesses increases with the number of paging levels. For instance, with 5-level paging, the CPU requires 6 memory accesses to retrieve a single byte of data (5 for the page table walk and 1 for the actual data), significantly slowing down processing speed.
- **Design Complexity:** Both the hardware (specifically the MMU) and the Operating System must be designed with greater complexity to handle the page walk mechanism accurately.



Figure 1: 5-Level Paging Scheme (Address Translation for 57-bit Virtual Address)

### b. Huge Pages

As analyzed in the **Hierarchical Paging** section, while that method effectively manages memory allocation, it compromises performance because the CPU must perform multiple memory accesses to walk the page table tree. Furthermore, relying on the standard **4KB** page size in systems with massive RAM capacities results in bloated page tables and places significant pressure on the TLB.

To mitigate these overheads **directly within the hierarchical framework**, the **Huge Pages** technique is employed. Instead of rigidly dividing memory into 4KB blocks, the system aggregates them into larger chunks (typically **2MB** or **1GB**). The primary goal is to reduce the depth of the page table walk and increase the TLB reach.

Huge Pages do not eliminate the hierarchical paging structure but operate by “short-circuiting” the tree traversal process. Considering the Intel x86-64 4-level paging structure:

- **Standard Page (4KB):** The CPU must traverse all 5 levels: PGD → P4D → PUD → PMD → PT → 4KB Page.
- **Huge Page (2MB):** The CPU only traverses 4 levels. At the level 1 (Page Directory - PT), a special bit (PS bit - Page Size) is set. The entry in the PT points directly to a **2MB physical frame** instead of a Level-5 Page Table.
- **Huge Page (1GB):** The CPU only traverses 3 levels. At the level 2 (Page Directory Pointer - PMD), the PS bit is set, pointing directly to a **1GB physical frame**.

### Advantages

- **Performance:** Minimizes TLB Misses. With a high TLB Hit rate, physical address translation is almost instantaneous.
- **Smaller Page Tables:** Since one large page replaces 512 small pages (in the case of 2MB), the number of page table entries is reduced by a factor of 512. The system does not need to allocate memory for the last-level Page Tables, saving RAM for the OS.

- **Fewer Memory Accesses:** The page walk process is shorter (requiring only 2–3 memory accesses instead of 4–5).

### Disadvantages

- **Internal Fragmentation:** This is the most significant drawback. If a process requires only 10KB of memory but is allocated a 2MB Huge Page, nearly 2MB is wasted.
- **Allocation Latency:** Finding a contiguous physical memory block of 2MB or 1GB is much more difficult than finding a 4KB block, especially when RAM is fragmented after long periods of operation.
- **Swapping Issues:** Swapping out a 1GB page to the disk is much more cumbersome and expensive than handling standard 4KB pages.



Figure 2: Huge Pages

### c. Hashed Page Tables

This method utilizes a **Hash Table** to manage page table entries. Instead of using linear storage or a hierarchical tree structure, the Virtual Page Number (VPN) is passed through a hash function to determine its storage location in the table. This approach is particularly effective for systems with a “sparse address space,” where the majority of virtual addresses are not in use. Each entry in the hash table typically serves as the head of a linked list (to handle hash collisions). Each element in the list consists of three main fields:

- **Virtual Page Number (VPN):** Used to identify the page.
- **Physical Frame Number (PFN):** The corresponding mapped physical frame value.
- **Pointer:** A reference to the next element in the linked list.

### Advantages

- **Optimal Memory Efficiency:** The size of the hashed page table is proportional to the number of *physical pages actually in use*, rather than the size of the entire virtual address space. This results in significant memory savings in massive 64-bit systems.
- **Flexible Structure:** Since it does not require storing information for unused memory regions, this method eliminates memory waste associated with page tables for unallocated address ranges.

### Disadvantages

- **Variable Memory Access Cost:** Performance depends heavily on the quality of the hash function. High collision rates result in longer linked lists, forcing the CPU to perform multiple memory accesses to traverse the list, thereby reducing processing speed.
- **Hardware Design Complexity:** Traversing a linked list is more difficult to implement in hardware (MMU) compared to walking a tree structure in hierarchical paging.
- **Poor Locality of Reference:** Due to the random nature of hashing, pages that are contiguous in virtual memory may be scattered across the hash table, reducing the efficiency of the CPU Cache.



Figure 3: Hashed Page Tables

### d. Clustered Page Tables

Clustered Page Tables are an optimized variation of the **Hashed Page Tables** method. As previously discussed, traditional hashed page tables store individual  $\langle \text{VPN}, \text{PFN} \rangle$  mappings. This can lead to significant memory overhead for pointers and result in long linked list chains if hash collisions occur frequently.

To address this, **Clustered Page Tables** modify the storage unit: instead of each entry containing information for a single page, it stores mapping information for a group of **contiguous virtual pages** (e.g., 16 consecutive pages).

**Entry Structure:** Each entry in the hash table's linked list typically consists of:

- **Virtual Page Number (VPN):** The identifier tag for the first page in the cluster (Base VPN).
- **An Array of PFNs:** Stores physical frame numbers for multiple consecutive pages (e.g.,  $PFN_0, PFN_1, \dots, PFN_{15}$ ).
- **Pointer:** Points to the next entry in case of a hash collision.

**Mechanism:** When the CPU needs to translate a virtual page  $p$ :

- The system hashes  $p$  to locate the bucket in the table.
- It searches the linked list for an entry where the **Base VPN** matches  $p$ .
- Instead of retrieving just one address, the system can access the address for  $p$  and its neighbors ( $p+1, p+2, \dots$ ) located within the same cluster.

### Advantages

- **Reduced Pointer Overhead:** Since one entry stores multiple pages (e.g., 16 pages), the number of nodes in the linked list is reduced by a factor of 16, saving memory space used for ‘next’ pointers.
- **Leveraging Spatial Locality:** Programs often access memory regions that are close to each other. Grouping these pages into a single entry improves performance when processing contiguous data blocks.
- **Sparse Address Space Support:** Retains the benefit of Hashed Page Tables by not consuming memory for unused virtual address regions.

### Disadvantages

- **Implementation Complexity:** Managing clusters requires more complex software/hardware logic compared to simple 1-to-1 mapping.
- **Internal Fragmentation within Entry:** If a cluster is designed to hold 16 pages, but a process only uses 1 page within that cluster, the remaining 15 PFN storage slots in that entry remain empty, resulting in wasted page table memory.



Figure 4: Clustered Page Tables

### e. Inverted Page Tables

In both Hierarchical Paging and Hashed Page Tables methods, each process maintains its own separate page table. Each table contains an entry for every virtual page the process is using. This results in the system managing millions of entries, consuming a significant amount of physical memory just to store page table structures to track memory usage.

To solve this problem, the **Inverted Page Table** method was developed. Instead of indexing based on the virtual address space of each process, this method indexes based on the frames of the actual **physical memory**.

#### Structure and Mechanism

- **Global Nature:** There is only one Global Page Table for the entire system, regardless of the number of running processes.
- **Fixed Size:** The number of entries in this table corresponds exactly to the number of **Physical Frames** in RAM. For example, if RAM has 1 million frames, the table will have 1 million entries.
- **Entry Structure:** The  $i$ -th entry in the table contains information about the virtual page stored in the  $i$ -th physical frame. Each entry typically consists of:
  1. **Virtual Page Address:** To identify the logical page.
  2. **Process ID (PID):** Information about the process that owns the page (essential because different processes may generate identical virtual addresses).

#### Advantages

- **Maximum Memory Efficiency:** This is the most significant advantage. The size of the page table is fixed and depends only on the size of physical RAM, making it independent of the number of processes or the size of the virtual address space.

#### Disadvantages

- **High Lookup Latency:** Since the table is sorted by physical address (while the CPU queries based on virtual address), the system cannot use direct indexing ( $O(1)$ ) as in traditional methods. Instead, it must perform a search operation. Although often assisted by a Hash Table, handling collisions and traversing the table is inherently slower than direct array access.
- **Difficulty in Shared Memory:** This is the major drawback. In traditional paging, memory sharing is achieved by having entries in different process page tables point to the same physical frame. However, in an Inverted Page Table, each physical frame corresponds to exactly **one single entry**. Therefore, it can only store one  $\langle \text{PID}, \text{VPN} \rangle$  pair, making the implementation of shared memory complex and often requiring auxiliary data structures.
- **Incomplete Logical Address Space Information:** Since the Inverted Page Table only stores mappings for pages currently resident in physical memory (RAM), it completely lacks information regarding pages located in secondary storage (Disk/Swap). Consequently, when a Page Fault occurs, the system cannot rely on this table to resolve the data location. As a result, the operating system is compelled to maintain an additional external page table for each process to track the entire logical address space.

**f. Reasons for Choosing Hierarchical Page Tables** In this simple operating system simulation project, we selected Hierarchical Page Tables (5-level Paging Scheme) for the following three reasons:

- **Dynamic Allocation:**

The hierarchical structure enables “on-demand” memory allocation. There is no need to initialize the entire massive data structure upfront. When a process requires access to a memory region, we simply invoke `alloc_aligned_table()` for that specific branch. This approach effectively prevents the wastage of management memory within a 64-bit environment.

- **Simplified Implementation of Sharing and Protection:**

The Address-Translation model allows for the assignment of access permissions (Read/Write/Execute) at each level of the tree (from top-level directories down to individual pages). This facilitates the implementation of Memory Protection, allowing for easy setup and granular control over specific memory regions.

- **Ease of Implementation:**

Compared to Inverted Page Tables or Hashed Page Tables, the hierarchical structure is easier to implement within a C-based simulation environment. We primarily rely on bitwise shifts and array indexing to decompose virtual addresses into their respective indices .

## 2 Implementation

### 2.1 Scheduler

#### 2.1.1 Queue Operations (Implementation of Round-Robin)

To support the Round-Robin scheduling within each priority level, the queue operations in `queue.c` must strictly enforce First-In-First-Out (FIFO) behavior. I have implemented three core functions as follows:

**a. Enqueue Operation (Arrival/Preemption)** The `enqueue` function handles the arrival of a new process or the return of a preempted process.

- **Mechanism:** It places the process at index `q->size`. This is critical for Round-Robin: a process finishing its time slice is moved to the back of the line. We check for buffer overflow (`MAX_QUEUE_SIZE`) before assignment to ensure memory safety.

```

1 void enqueue(struct queue_t *q, struct pcb_t *proc) {
2     /* TODO: put a new process to queue [q] */
3     if (q->size < MAX_QUEUE_SIZE) {
4         q->proc[q->size] = proc;
5         q->size++;
6     }
7 }
```

Listing 1: Enqueue Implementation

**b. Dequeue Operation (Dispatching)** The `dequeue` function selects the next process for execution.

- **Mechanism:** It selects the process at index 0 (the Head), which corresponds to the process that has been waiting the longest. Since we use a static array, removing index 0 leaves a "hole". The function includes a loop to shift all subsequent elements ( $i + 1$ ) to position  $i$ . This ensures the array remains contiguous and the next candidate is always at index 0.

```

1 struct pcb_t *dequeue(struct queue_t *q) {
2     /* TODO: return a pcb whose priority is the highest
3      * in the queue [q] and remember to remove it from q */
4     if (q == NULL || q->size == 0) {
5         return NULL;
6     }
7     struct pcb_t *first_proc = q->proc[0];
8
9     int i;
10    for (i = 0; i < q->size - 1; i++) {
11        q->proc[i] = q->proc[i + 1];
12    }
13    q->size--;
14 }
```

```

14     return first_proc;
15 }
```

Listing 2: Dequeue Implementation

### 2.1.2 MLQ Scheduler Implementation (sched.c)

The `sched.c` module implements the core logic of the Multi-Level Queue scheduler. This implementation is designed to handle process retrieval and placement within a Symmetric Multiprocessing (SMP) environment, strictly following the Slot-based policy defined in the assignment specifications.

**a. Scheduler Initialization (`init_scheduler`)** This function sets up the system state before any process execution begins.

- **Slot Pre-calculation:** Instead of calculating the slot quota at runtime, we initialize the `slot` array immediately using the formula  $slot[i] = MAX\_PRIO - i$ .
- **Reasoning:** This optimization reduces CPU overhead during the critical context-switching phase. By pre-filling the quotas, the scheduler simply needs to decrement or reset values during execution.
- **Mutex Initialization:** The `pthread_mutex_init` call prepares the lock required for thread safety in the SMP architecture.

```

1 void init_scheduler(void) {
2     int i;
3     for (i = 0; i < MAX_PRIO; i++) {
4         mlq_ready_queue[i].size = 0;
5         slot[i] = MAX_PRIO - i; // Pre-calculate quotas
6     }
7     // ...
8     pthread_mutex_init(&queue_lock, NULL);
9 }
```

Listing 3: Initialization Logic

**b. Process Retrieval Logic (`get_mlq_proc`):** The `get_mlq_proc` function serves as the core of the scheduler, responsible for selecting the next process for CPU allocation from the set of priority queues. The algorithm is designed based on the Multi-Level Queue (MLQ) model combined with a *Slot-based Allocation* mechanism to address the issue of resource starvation.

#### 1. General Operating Principle

The function iterates sequentially through the queues from the highest priority (`prio = 0`) to the lowest (`prio = MAX_PRIO - 1`). The decision to select a process is based on two prerequisites:

- **Readiness:** The queue at that priority level must have waiting processes (`!empty`).
- **Resource Quota (Slot):** The queue must have remaining CPU usage quota (`slot[i] > 0`).

## 2. Algorithm Flow

The process selection procedure occurs through the following steps:

- **Step 1: Critical Section Protection.** Since the system simulates a multi-processor environment, the function begins by locking the Mutex (`pthread_mutex_lock(&queue_lock)`). This ensures the data integrity of the `mlq_ready_queue` when multiple CPUs access it simultaneously.
- **Step 2: Priority Traversal.** A loop iterates from `i = 0` (highest priority) to `MAX_PRIO`. At each priority level `i`:
  - *Check Empty:* If `mlq_ready_queue[i]` is empty, the system skips it and checks the next priority level.
  - *Check Slot (Core Logic):*
    - \* **Case A (Slot Remaining - `slot[i] > 0`):** The queue is allowed to run. The system retrieves the first process (`dequeue`) and decrements the remaining slots by 1 (`slot[i]--`).  
**Decision:** Break the loop immediately to return this process to the CPU.
    - \* **Case B (Slot Exhausted - `slot[i] == 0`):** The queue has used up its quota for the current cycle. The system performs a Slot Refill using the formula: `slot[i] = MAX_PRIO - i`.  
**Decision:** Crucially, the algorithm DOES NOT select a process from this queue but continues the loop (`continue`) to check the next lower priority queue.
- **Step 3: System State Update.** If a process is found (`proc != NULL`), it is added to the running list (`running_list`) for tracking.
- **Step 4: Finish.** Unlock the Mutex (`pthread_mutex_unlock`) and return the `pcb_t` pointer of the selected process (or `NULL` if no viable process exists).

## 3. Slot-based Mechanism Analysis

This implements a mechanism of "**Conditional Yielding**":

- **Purpose:** To prevent high-priority processes (e.g., Priority 0) from monopolizing the CPU indefinitely.
- **Operation:** When a high-priority queue consumes its allocated slots (e.g., 140 slots for Prio 0), it is forced to reset its slots and "yield" the checking turn to lower-priority queues within the current function call. This ensures relative fairness, allowing lower-priority processes (such as Prio 139) a chance to execute (CPU time) even under heavy load conditions.

## 4. Complexity Analysis

In the worst-case scenario, the algorithm must traverse through all `MAX_PRIO` queues.

**Complexity:**  $O(K + N)$

- $K$ : The number of priority levels (`MAX_PRIO`).
- $N$ : The cost of the `dequeue` operation.

```
1 struct pcb_t * get_mlq_proc(void) {
2     struct pcb_t * proc = NULL;
3
4     pthread_mutex_lock(&queue_lock);
5
6     for (int i = 0; i < MAX_PRIO; i++) {
7         if (!empty(&mlq_ready_queue[i])) {
8             if (slot[i] > 0) {
9                 proc = dequeue(&mlq_ready_queue[i]);
10                slot[i]--;
11                break;
12            } else {
13                slot[i] = MAX_PRIO - i;
14            }
15        }
16    }
17
18    if (proc != NULL)
19        enqueue(&running_list, proc);
20
21    pthread_mutex_unlock(&queue_lock);
22
23    return proc;
24}
25}
```

Listing 4: Slot-based Decision Logic

c. **Process Placement (put\_mlq\_proc / add\_mlq\_proc)** These functions handle putting a process back into the ready queue (e.g., after a time slice expires or upon creation).

- **Mechanism:** They map the process's priority (`proc->prio`) to the correct index in the `mlq_ready_queue` array and invoke the FIFO `enqueue` operation.
- **Synchronization:** These operations are wrapped in mutex locks to ensure the queue structure is not corrupted by concurrent access from the loader or other CPUs.

d. **Synchronization Mechanism (Mutex Protection)** A critical aspect of this implementation is the handling of Shared Resources in a multi-core environment.

In this simulation, the `mlq_ready_queue` is a **Global Shared Resource**.

- *Scenario without Mutex:* CPU 0 checks queue 0 and sees 1 process. Simultaneously, CPU 1 checks queue 0 and sees the same process. Both CPUs try to `dequeue` the same pointer. This creates a **Race Condition**, leading to memory corruption or one process being executed twice.

We use `pthread_mutex_lock(&queue_lock)` to define a **Critical Section**.

- **Scope:** The lock covers the entire duration of reading the state (checking `empty` and `slot`) and modifying the state (`dequeue`, `enqueue`, `slot --`).
- **Atomicity:** This ensures that the complex operation of "Check Slot → Dequeue → Decrement Slot" appears as a single atomic instruction to other CPUs. No other thread can interrupt or modify the queue during this sequence.

```
1 pthread_mutex_lock(&queue_lock); // BEGIN Critical Section
2 // ...
3 // Safe access to shared mlq_ready_queue and slot arrays
4 // ...
5 pthread_mutex_unlock(&queue_lock); // END Critical Section
```

Listing 5: Critical Section Protection

### 2.1.3 Scheduler Algorithm Conclusion

This section describes the implementation of a **Slot-based Multi-Level Queue (MLQ)** scheduler for an SMP environment. By integrating a quota mechanism ( $slot = MAX\_PRIO - prio$ ) into the priority traversal logic, the design **aims to balance** two conflicting objectives: prioritizing critical tasks while mitigating the starvation of lower-priority processes. Additionally, the application of **Mutex locks is designed to maintain** thread safety, **addressing** potential race conditions as multiple CPUs concurrently access the shared ready queue structure.



Figure 5: Taskflow of `get_mrq_proc()` in MLQ scheduling

## 2.2 Memory Management

### 2.2.1 Memory management

**a. Overview** This module implements logical memory management per process in a simplified way. Each process owns an `mm_struct` that contains:

- One user `vm_area_struct` (`vmaid = 0`) acting as the container for the whole user address space,
- A fixed-size symbol region table `symrgtbl[]` that stores allocated regions by `rgid`,
- And the root pointers to the multi-level page tables handled by `mm64.c`.

Instead of multiple VMAs for text/data/heap/stack, our design keeps a single user VMA and represents logical “segments” as virtual memory regions (`vm_rg_struct`, VMR) inside that VMA. Each VMR is a contiguous range `[rg_start, rg_end]`, recorded in `symrgtbl[rgid]`, while free gaps are chained in `vm_freerg_list` for reuse.

The core design principles are:

- **VMA / region management:** All user regions live inside one user `vm_area_struct`; logical segments are `vm_rg_struct` descriptors, and free holes are kept in `vm_freerg_list`.
- **User / Kernel separation:** User code never touches kernel structures (`pcb_t, mm_struct`); it uses system calls and PID, and the kernel side looks up the correct PCB/MM before operating.
- **Allocation strategy:** Always try to reuse holes from `vm_freerg_list` first; if none fits, expand the VMA (`sbrk / vm_end`) via the `SYSMEM_INC_OP` syscall, then let the paging module map the new pages.

**b. Implementation Details** The following code snippets illustrate the core logic handling VMA organization and the secure User-Kernel interface.

#### 1. VMA Data Structure (`os_mm.h`)

The `vm_area_struct` represents a contiguous range of logical addresses `[vm_start, vm_end]`.

```

1 struct vm_area_struct {
2     unsigned long vm_id;      // Region ID (e.g., DATA, HEAP)
3     unsigned long vm_start;   // Logical Start Address
4     unsigned long vm_end;    // Logical End Address
5     struct vm_rg_struct *vm_freerg_list; // List of free holes for reuse
6     struct vm_area_struct *vm_next;       // Pointer to next VMA
7 };

```

Listing 6: Virtual Memory Area Structure

#### 2. Secure User/Kernel Interface (`libstd.c`)

This wrapper demonstrates the protection mechanism. The user space only passes value arguments and the `pid`, never the raw PCB pointer, ensuring isolation.

```

1 int libsyscall(struct pcb_t *caller, uint32_t syscall_idx,
2                 arg_t a1, arg_t a2, arg_t a3)
3 {

```

```
4     struct sc_regs regs;
5     regs.a1 = a1; regs.a2 = a2; regs.a3 = a3;
6
7     /* CRITICAL: Only caller->pid is passed to the kernel.
8      The Kernel must resolve the PCB internally. */
9     return syscall(caller->krnl, caller->pid, syscall_idx, &regs);
10 }
```

Listing 7: User Space System Call Wrapper

### 3. Memory Allocation Algorithm - \_\_alloc (libmem.c)

This function executes in **Kernel Mode** after the PID has been validated. It implements the logic to search for available memory space within the VMA.

```
1 int __alloc(struct pcb_t *caller, int vmaid, int rgid, addr_t size, addr_t *
2     alloc_addr)
3 {
4     pthread_mutex_lock(&mmvm_lock);
5     struct vm_rg_struct rgnode;
6
7     // Check if we can reuse a free region
8     if (get_free_vmrg_area(caller, vmaid, size, &rgnode) == 0)
9     {
10         caller->mm->symrgtbl[rgid].rg_start = rgnode.rg_start;
11         caller->mm->symrgtbl[rgid].rg_end = rgnode.rg_end;
12         *alloc_addr = rgnode.rg_start;
13         pthread_mutex_unlock(&mmvm_lock);
14         return 0;
15     }
16
17     /* If get_free_vmrg_area FAILED, we must expand the heap */
18     struct vm_area_struct *cur_vma = get_vma_by_num(caller->mm, vmaid);
19     if (!cur_vma)
20     {
21         pthread_mutex_unlock(&mmvm_lock);
22         return -1;
23     }
24
25     addr_t inc_sz = PAGING_PAGE_ALIGNNSZ(size);
26     // Usually aligning to page size is good practice, or use exact size if your
27     // inc_vma handles it.
28     // The provided skeleton had some ifdef logic, let's stick to standard alignment.
29
30     addr_t old_sbrk = cur_vma->sbrk;
31
32     /* SYSCALL to increase limit */
```

```

31     struct sc_regs regs;
32
33     regs.a1 = SYSMEM_INC_OP;
34     regs.a2 = vmaid;
35     regs.a3 = inc_sz; // Request expansion
36
37     // Note: In real syscall, we don't pass PCB, but here we simulate it via wrapper
38     // The wrapper 'syscall' takes (krnl, pid, nr, regs)
39     if (syscall(caller->krnl, caller->pid, 17, &regs) < 0)
40     {
41         pthread_mutex_unlock(&mmvm_lock);
42         return -1; // Failed to expand
43     }
44
45     /* Successful increase limit */
46     caller->mm->symrgtbl[rgid].rg_start = old_sbrk;
47     caller->mm->symrgtbl[rgid].rg_end = old_sbrk + size;
48     *alloc_addr = old_sbrk;
49
50     // The remaining space (inc_sz - size) should technically be added to free list
51     // to avoid internal fragmentation, but for this simple assignment we might skip
52     // it
53     // or add logic:
54     if (inc_sz > size)
55     {
56         struct vm_rg_struct *fragment = malloc(sizeof(struct vm_rg_struct));
57         fragment->rg_start = old_sbrk + size;
58         fragment->rg_end = old_sbrk + inc_sz;
59         fragment->rg_next = NULL;
60         enlist_vm_freerg_list(caller->mm, fragment);
61     }
62
63     pthread_mutex_unlock(&mmvm_lock);
64     return 0;
65 }
```

Listing 8: Kernel Space Allocation Routine

#### 4. Memory Deallocation Algorithm - `enlist_vm_freerg_list (libmem.c)`

This function inserts a recently freed memory region back into the reuse list (free list), marking it available for future allocations.

```

1 int enlist_vm_freerg_list(struct mm_struct *mm, struct vm_rg_struct *rg_elmt)
2 {
3     struct vm_rg_struct *rg_node = mm->mmap->vm_freerg_list;
4
5     if (rg_elmt->rg_start >= rg_elmt->rg_end)
```

```
6     return -1;
7
8     if (rg_node != NULL)
9         rg_elmt->rg_next = rg_node;
10
11    /* Enlist the new region */
12    mm->mmap->vm_freerg_list = rg_elmt;
13
14    return 0;
15}
```

Listing 9: Deallocation Logic

c. **Algorithm Description Step 1: User-Kernel Transition (Protection Mechanism)** To ensure memory safety (referencing *Silberschatz Chapter 2*), the allocation process begins with a System Call:

- **User Request:** The user program calls a library function (e.g., `malloc`), which triggers the wrapper `libsyscall`.
- **Trap to OS:** `libsyscall` passes the Process ID (PID) and an operation code (such as `SYSMEM_INC_OP` in our allocation path, or `SYSMEM_MAP_OP` in other mapping scenarios) to the Kernel via a software interrupt/trap.
- **Authentication:** Upon entering Kernel Mode, the kernel uses the provided PID to traverse the process table and retrieve the correct `pcb_t` pointer. This prevents any user process from manipulating the memory of another process.

#### Step 2: VMA Management (`get_vma_by_num`)

Once in Kernel Mode, the system must identify which memory region the process is requesting.

- **Activity:** The function `get_vma_by_num` traverses the linked list `mm->mmap` to find the VMA whose `vm_id` equals the requested `vmaid`.
- **Purpose:** In our implementation, there is exactly one user VMA per process (`vmaid = 0`), which acts as the container for all user regions. The logical separation between different user “segments” (variables/objects) is handled by `vm_rg_struct` entries stored in `mm->symrgtbl[]`, rather than by multiple VMAs.

#### Step 3: Memory Allocation (`_alloc`)

The allocation logic prioritizes filling "holes" in the memory space before expanding it, ensuring resource efficiency.

1. **Reuse Strategy:** The system inspects the `vm_freerg_list` of the current VMA. This list contains memory regions that were previously freed. The algorithm applies a **First-Fit** strategy: it traverses the list, and if a node satisfying  $rg\_end - rg\_start \geq size$  is found, the system either splits the node or claims the entire node for the new allocation.

2. **Expansion Strategy:** If the free list is empty or contains no suitable regions, the system performs a linear expansion. The new allocation address starts at the current `vm_end`, and the boundary is updated:  $vm\_end = vm\_end + size$ . This behavior is analogous to the Unix `sbrk()` system call.
3. **Registration:** Finally, the system updates the `symrgtbl` (Symbol Region Table) with the `rgid`. This maps a User-managed Region ID to the actual Kernel-managed logical address.

#### Step 4: Memory Deallocation (`enlist_vm_freerg_list`)

When a process frees memory, the region is not immediately physically erased but transitioned to an "Available" state.

- **Logic:** The function receives a memory region structure (`rg_elmt`).
- **Operation:** It inserts this node into the head of the singly linked list `vm_freerg_list`.
- **Significance:** This turns used memory into a new "hole". The next allocation request (via `_alloc`) will scan and potentially reuse this region, thereby minimizing external fragmentation of the logical address space.

#### Conclusion:

The Memory Management module at the Logical Level **implements** a memory management mechanism **predicated on** strict User/Kernel separation and an allocation strategy **utilizing** Free List reuse. This architecture **functions as** the initial abstraction layer, establishing the logical address space before these addresses are translated into physical frames by the 5-Level Paging mechanism detailed in the subsequent section.

### 2.2.2 Multi-level paging

a. **Implementation Details** The following core components constitute the paging mechanism.

**1. Bit Masking and Address Decoding (`mm64.h`)** The system uses macros to "peel off" each layer of the virtual address.

```

1  /* PGD Index: Extract 9 bits from position 48 to 56 */
2  #define PAGING64_ADDR_PGD_MASK    GENMASK64(56, 48)
3  #define PAGING64_ADDR_PGD_LOBIT   48
4  #define PAGING64_ADDR_PGD(addr)   ((addr & PAGING64_ADDR_PGD_MASK) >>
5                                PAGING64_ADDR_PGD_LOBIT)
6
6  /* P4D -> PUD -> PMD: Shift the bit window downwards */
7  #define PAGING64_ADDR_P4D(addr)   ((addr & GENMASK64(47, 39)) >> 39)
8  #define PAGING64_ADDR_PUD(addr)   ((addr & GENMASK64(38, 30)) >> 30)
9  #define PAGING64_ADDR_PMD(addr)   ((addr & GENMASK64(29, 21)) >> 21)
10
11 /* PT Index (Leaf Level): Bits 20-12 */
12 #define PAGING64_ADDR_PT_MASK    GENMASK64(20, 12)
13 #define PAGING64_ADDR_PT(addr)   ((addr & PAGING64_ADDR_PT_MASK) >> 12)
14
15 /* Offset: Last 12 bits */

```

```
16 #define PAGING64_ADDR_OFFSET(addr) (addr & GENMASK64(11, 0))
```

Listing 10: Bit Manipulation Macros in mm64.h

**Page Table Entry - PTE** (mm64.h & mm64.c) The PTE stores the page state. Manipulating the PTE requires bitwise operations (|, &, ~).

```

1  /* mm64.h - PTE Bit Definitions */
2  #define PAGING_PTE_PRESENT_MASK BITULL(0) // Bit 0: Page is in RAM
3  #define PAGING_PTE_DIRTY_MASK BITULL(6) // Bit 6: Page was written to
4  #define PAGING_PTE_SWAPPED_MASK BITULL(9) // Bit 9: Page is swapped out
5  #define PAGING_PTE_FPN_MASK GENMASK64(51, 12) // Bits 12-51: Frame Number
6
7  /* mm64.c - PTE Initialization Function */
8  int init_pte(addr_t *pte, int pre, addr_t fpn, int drt, int swp, int swptyp, addr_t
   swpoff) {
9    if (pre != 0) { // If page exists (Present)
10      if (swp == 0) { // Case 1: Page is in RAM
11        SETBIT(*pte, PAGING_PTE_PRESENT_MASK);
12        CLRBIT(*pte, PAGING_PTE_SWAPPED_MASK);
13        /* Write FPN to bits 12-51 */
14        SETVAL(*pte, fpn, PAGING_PTE_FPN_MASK, PAGING_PTE_FPN_LOBIT);
15      } else { // Case 2: Page is Swapped
16        SETBIT(*pte, PAGING_PTE_SWAPPED_MASK);
17        CLRBIT(*pte, PAGING_PTE_PRESENT_MASK);
18        /* Write Swap Offset */
19        SETVAL(*pte, swpoff, PAGING_PTE_SWPOFF_MASK, PAGING_PTE_SWPOFF_LOBIT);
20      }
21    }
22    return 0;
23 }
```

Listing 11: PTE Definition and Initialization

**3. Page Walk Mechanism** (mm64.c) The `get_page_table_entry` function is the core of translation, traversing 5 table levels.

```

1  uint64_t *get_page_table_entry(struct mm_struct *mm, addr_t addr, int alloc)
2  {
3    // Get Indices
4    int pgd_idx = PAGING64_ADDR_PGD(addr);
5    int p4d_idx = PAGING64_ADDR_P4D(addr);
6    int pud_idx = PAGING64_ADDR_PUD(addr);
7    int pmd_idx = PAGING64_ADDR_PMD(addr);
8    int pt_idx = PAGING64_ADDR_PT(addr);
9
10   // Check Root (PGD)
```

```

11     if (mm->pgd == NULL)
12     {
13         if (!alloc)
14             return NULL;
15         mm->pgd = alloc_aligned_table();
16
17         if (!mm->pgd)
18             return NULL;
19         memset(mm->pgd, 0, sizeof(struct pgd_t));
20     }
21
22 // Walk PGD -> P4D
23 struct p4d_t *p4d_table = get_next_level((uint64_t *)&mm->pgd->entries[pgd_idx],
24                                         alloc);
25 if (!p4d_table)
26     return NULL;
27
28 // Walk P4D -> PUD
29 struct pud_t *pud_table = get_next_level((uint64_t *)&p4d_table->entries[p4d_idx],
30                                         alloc);
31 if (!pud_table)
32     return NULL;
33
34 // Walk PUD -> PMD
35 struct pmd_t *pmd_table = get_next_level((uint64_t *)&pud_table->entries[pud_idx],
36                                         alloc);
37 if (!pmd_table)
38     return NULL;
39
40 // Walk PMD -> PT
41 struct pt_t *pt_table = get_next_level((uint64_t *)&pmd_table->entries[pmd_idx],
42                                         alloc);
43 if (!pt_table)
44     return NULL;
45
46 // Return pointer to the specific Page Table Entry (Leaf)
47 return (uint64_t *)&pt_table->entries[pt_idx];
48 }
```

Listing 12: Page Walk Logic (Simplified)

#### 4. Hardware Simulation (`mm-memphy.c`) RAM is simulated not as a physical chip but as a BYTE array.

```

1 /* mm-memphy.c */
2 int MEMPHY_read(struct memphy_struct *mp, addr_t addr, BYTE *value) {
3     if (mp == NULL) return -1;
```

```
4  /* Direct access to storage array at index addr */
5  *value = mp->storage[addr];
6  return 0;
7 }
8
9 int MEMPHY_write(struct memphy_struct *mp, addr_t addr, BYTE data) {
10 if (mp == NULL) return -1;
11 mp->storage[addr] = data;
12 return 0;
13 }
```

Listing 13: Physical Memory Access

**b. Algorithm Description** The system executes address translation via a strict procedure, simulating the hardware MMU (Memory Management Unit).

**Step 1: Address Decoding** The computer views the virtual address not as a large integer, but as a collection of indices.

- **Input:** 64-bit Virtual Address (VA).
- **Algorithm:** Use Bitwise AND (&) with a Mask to isolate specific bits, then Bitwise Right Shift (>>) to normalize the value into an integer index.
- **Example:** To get the PGD Index, the system extracts the highest 9 bits (56-48). This value determines the position in the PGD table.

**Step 2: Page Walk (Tree Traversal)** This is the core algorithm. Conceptually, the page table is a 5-level tree.

1. **Root:** Start from `mm->pgd` (stored in the CR3 register on real CPUs).

2. **Traversal:**

- Use PGD Index to select an entry in PGD. This entry contains the physical address of the P4D table.
- Use P4D Index to select an entry in P4D. This points to the PUD table.
- Continue similarly through PUD and PMD.

3. **Leaf:** At the final level (PT), use the PT Index to retrieve the **PTE**.

**Logic Flow:** CR3 → [PGD] → [P4D] → [PUD] → [PMD] → [PT] → Frame.

**Step 3: PTE Handling & State Check** Once the PTE is retrieved, the system checks the status bits:

- **Case A: Bit Present = 1 (Hit):** The page is in RAM. The system extracts the 40-bit FPN (Frame Page Number) from the PTE.
- **Case B: Bit Present = 0 & Swapped = 1 (Page Fault):** The data is not in RAM but in Swap. The system triggers `__mm_swap_page`: Find a free frame → Read data from Swap → Update PTE (Present=1, Swapped=0).

- **Case C: Bit Present = 0 & Swapped = 0 (Invalid):** Accessing unallocated memory → Segmentation Fault.

**Step 4: Physical Address Composition** After obtaining the frame number (FPN), the final physical address is calculated:

$$\text{PhysicalAddress} = (\text{FPN} \times \text{PageSize}) + \text{Offset}$$

In code, this is optimized using bitwise operations:

- `FPN << 12`: Left shift 12 bits (equivalent to multiplying by 4096).
- `| Offset`: Bitwise OR with Offset (lower 12 bits of VA) to combine them.

**Step 5: Physical Access** Finally, the Physical Address (PA) is passed to `mm-memphy.c`.

- The module treats RAM as a massive array `storage[]`.
- It accesses `storage[PA]` to read or write the actual data byte.

**Conclusion** The successful implementation of the 5-Level Paging mechanism demonstrates memory management at the finest granularity. The Page Walk algorithm, combined with precise Bitwise operations, ensures the system can translate addresses within the vast 64-bit space while transparently handling complex scenarios like Page Faults and Swapping.

### 3 Interprets the results of running tests

#### 3.1 Scheduling

##### 3.1.1 The result input file of scheduler (sched)

The following output describes the behavior of the scheduler over multiple time slots, showing how processes are loaded, dispatched, and processed by the CPU.

```
1 Time slot      0
2 ld_routine
3     Loaded a process at input/proc/p2s, PID: 1 PRIO: 0
4 Time slot      1
5     CPU 1: Dispatched process  1
6     Loaded a process at input/proc/p1s, PID: 2 PRIO: 1
7 Time slot      2
8     CPU 0: Dispatched process  2
9 Time slot      3
10    Loaded a process at input/proc/p2s, PID: 3 PRIO: 1
11 Time slot      4
12    Loaded a process at input/proc/p3s, PID: 4 PRIO: 0
13 Time slot      5
14    CPU 1: Put process  1 to run queue
15    CPU 1: Dispatched process  4
16 Time slot      6
17    CPU 0: Put process  2 to run queue
18    CPU 0: Dispatched process  1
19 Time slot      7
20 Time slot      8
21 Time slot      9
22    CPU 1: Put process  4 to run queue
23    CPU 1: Dispatched process  4
24 Time slot      10
25    CPU 0: Put process  1 to run queue
26    CPU 0: Dispatched process  1
27 Time slot      11
28 Time slot      12
29 Time slot      13
30    CPU 1: Put process  4 to run queue
31    CPU 1: Dispatched process  4
32 Time slot      14
33    CPU 0: Processed  1 has finished
34    CPU 0: Dispatched process  3
35 Time slot      15
36 Time slot      16
37    CPU 1: Processed  4 has finished
```

```
38      CPU 1: Dispatched process  2
39 Time slot    17
40 Time slot    18
41      CPU 0: Put process  3 to run queue
42      CPU 0: Dispatched process  3
43 Time slot    19
44 Time slot    20
45      CPU 1: Put process  2 to run queue
46      CPU 1: Dispatched process  2
47 Time slot    21
48 Time slot    22
49      CPU 0: Put process  3 to run queue
50      CPU 0: Dispatched process  3
51      CPU 1: Processed  2 has finished
52      CPU 1 stopped
53 Time slot    23
54 Time slot    24
55 Time slot    25
56 Time slot    26
57      CPU 0: Processed  3 has finished
58      CPU 0 stopped
```

### Explanation of Output

The output demonstrates the behavior of the scheduler across multiple time slots and the corresponding actions performed by the CPU. Below is an analysis of key steps:

- **Time Slot 0 - 4 (Loading and Initial Dispatching):**

- *Loading*: Processes are loaded sequentially. Process 1 (Prio 0) loads at slot 0; Process 2 (Prio 1) at slot 1; Process 3 (Prio 1) at slot 3; and Process 4 (Prio 0) at slot 4.
- *Dispatching*: CPU 1 picks up Process 1 at slot 1. CPU 0 picks up Process 2 at slot 2. This shows the workload being distributed across both CPUs immediately as processes arrive.

- **Time Slot 5 - 6 (Preemption and Context Switching):**

- *Preemption*: At slot 5, Process 1 (on CPU 1) exhausts its time slice and is put back into the run queue. CPU 1 immediately dispatches the newly loaded Process 4 (Prio 0).
- *Process Migration*: At slot 6, Process 2 (on CPU 0) is preempted. CPU 0 then picks up Process 1 (which was previously on CPU 1). This clearly demonstrates *process migration* in the SMP environment.

- **Time Slot 9 - 13 (Re-dispatching):**

- *Round-Robin Behavior*: Processes 4 and 1 continue to execute. At slot 9 and 13, CPU 1 puts Process 4 back to the queue but immediately re-dispatches it (likely because it has the highest priority/quota at that moment). Similarly, CPU 0 re-dispatches Process 1 at slot 10.

- **Time Slot 14 - 16 (Process Completion):**

- *Completion*: Process 1 finishes execution on CPU 0 at slot 14. CPU 0 then switches to Process 3 (Prio 1).
- *Completion*: Process 4 finishes execution on CPU 1 at slot 16. CPU 1 then switches to Process 2 (Prio 1), which had been waiting since slot 6.

- **Time Slot 22 - 26 (Finalization):**

- *Termination*: Process 2 finishes at slot 22, and CPU 1 stops as there are no more processes for it. Process 3 continues on CPU 0 until slot 26, where it finishes, and CPU 0 finally stops.

### 3.1.2 Scheduling: Gantt chart



Figure 6: Gantt Chart for CPU 0



Figure 7: Gantt Chart for CPU 1

## 3.2 Memory

### 3.2.1 Result of the input file of Memory (os\_0\_mlq\_paging)

```

1 Time slot    0
2 ld_routine

```

```
3     Loaded a process at input/proc/p0s, PID: 1 PRIO: 0
4 Time slot    1
5     CPU 1: Dispatched process  1
6     Loaded a process at input/proc/p1s, PID: 2 PRIO: 15
7 Time slot    2
8     CPU 0: Dispatched process  2
9 liballoc:152
10 --- Page Table Dump ---
11 print_pgtbl:
12 PGD=0x739974001000 P4D=0x739974003000 PUD=0x739974005000 PMD=0x739974007000 PT=0
13             x739974009000
14 0000000000000000: 0000000000000001
15 Time slot    3
16     Loaded a process at input/proc/p1s, PID: 3 PRIO: 0
17 liballoc:152
18 --- Page Table Dump ---
19 print_pgtbl:
20 PGD=0x739974001000 P4D=0x739974003000 PUD=0x739974005000 PMD=0x739974007000 PT=0
21             x739974009000
22 0000000000000000: 0000000000000001
23 Time slot    4
24 libfree:168
25 --- Page Table Dump ---
26 print_pgtbl:
27 PGD=0x739974001000 P4D=0x739974003000 PUD=0x739974005000 PMD=0x739974007000 PT=0
28             x739974009000
29 0000000000000000: 0000000000000001
30 Time slot    5
31 liballoc:152
32 --- Page Table Dump ---
33 print_pgtbl:
34 PGD=0x739974001000 P4D=0x739974003000 PUD=0x739974005000 PMD=0x739974007000 PT=0
35             x739974009000
36 0000000000000000: 0000000000000001
37 Time slot    6
38     Loaded a process at input/proc/p0s, PID: 4 PRIO: 0
39 libwrite:343
40 --- Page Table Dump ---
41 print_pgtbl:
42 PGD=0x739974001000 P4D=0x739974003000 PUD=0x739974005000 PMD=0x739974007000 PT=0
43             x739974009000
44 0000000000000000: 0000000000000001
45 Time slot    7
46     CPU 1: Put process  1 to run queue
```

```
42     CPU 1: Dispatched process 3
43 Time slot 8
44     CPU 0: Put process 2 to run queue
45     CPU 0: Dispatched process 4
46 Time slot 9
47 liballoc:152
48 --- Page Table Dump ---
49 print_pgtbl:
50 PGD=0x73996c001000 P4D=0x73996c003000 PUD=0x73996c005000 PMD=0x73996c007000 PT=0
      x73996c009000
51 0000000000000000: 0000000000001001
52 Time slot 10
53 liballoc:152
54 --- Page Table Dump ---
55 print_pgtbl:
56 PGD=0x73996c001000 P4D=0x73996c003000 PUD=0x73996c005000 PMD=0x73996c007000 PT=0
      x73996c009000
57 0000000000000000: 0000000000001001
58 Time slot 11
59 libfree:168
60 --- Page Table Dump ---
61 print_pgtbl:
62 PGD=0x73996c001000 P4D=0x73996c003000 PUD=0x73996c005000 PMD=0x73996c007000 PT=0
      x73996c009000
63 0000000000000000: 0000000000001001
64 Time slot 12
65 liballoc:152
66 --- Page Table Dump ---
67 print_pgtbl:
68 PGD=0x73996c001000 P4D=0x73996c003000 PUD=0x73996c005000 PMD=0x73996c007000 PT=0
      x73996c009000
69 0000000000000000: 0000000000001001
70 Time slot 13
71 libwrite:343
72 --- Page Table Dump ---
73 print_pgtbl:
74 PGD=0x73996c001000 P4D=0x73996c003000 PUD=0x73996c005000 PMD=0x73996c007000 PT=0
      x73996c009000
75 0000000000000000: 0000000000001001
76     CPU 1: Put process 3 to run queue
77     CPU 1: Dispatched process 1
78 libread:304
79 Time slot 14
80     CPU 0: Put process 4 to run queue
```

```
81      CPU 0: Dispatched process  3
82 Time slot  15
83 Time slot  16
84 Time slot  17
85 Time slot  18
86      CPU 0: Processed  3 has finished
87      CPU 0: Dispatched process  4
88 libread:304
89 Time slot  19
90      CPU 1: Put process  1 to run queue
91      CPU 1: Dispatched process  1
92 libfree:168
93 --- Page Table Dump ---
94 print_pgtbl:
95 PGD=0x739974001000 P4D=0x739974003000 PUD=0x739974005000 PMD=0x739974007000 PT=0
96           x739974009000
97 0000000000000000: 0000000000000001
98 Time slot  20
99 Time slot  21
100     CPU 1: Processed  1 has finished
101     CPU 1: Dispatched process  2
102 Time slot  22
103 Time slot  23
104 Time slot  24
105     CPU 0: Put process  4 to run queue
106     CPU 0: Dispatched process  4
107 libfree:168
108 --- Page Table Dump ---
109 print_pgtbl:
110 PGD=0x73996c001000 P4D=0x73996c003000 PUD=0x73996c005000 PMD=0x73996c007000 PT=0
111           x73996c009000
112 0000000000000000: 0000000000001001
113 Time slot  25
114     CPU 1: Processed  2 has finished
115     CPU 1 stopped
116 Time slot  26
117     CPU 0: Processed  4 has finished
118     CPU 0 stopped
```

### 3.2.2 Explain the status of the memory allocation in heap and data segments

- **Heap Segment:** It is the memory area for initialization (Dynamic Allocation) while the program is running. It has no fixed size and can be expanded.

- **Data Segment:** In this simplified simulation model, the "Data Segment" refers to the actual byte values written into the allocated Heap memory frames. While the Heap defines the "container" (virtual addresses and size), the Data represents the "content" stored in the corresponding Physical Frames (RAM).

### 3.2.3 Chronological Analysis of Memory Allocation

The following analysis tracks the state of memory allocation across specific time slots where memory intervention occurs. Two distinct processes are identified by their Page Global Directory (PGD) addresses:

- **Process A (p0s):** PGD Address 0x739974001000
- **Process B (p1s):** PGD Address 0x73996c001000

Start Explain the status of the memory allocation in heap and data segments.

- **Time Slot 2: Initial Allocation (Process A)**

- **Event:** liballoc:152 triggered by Process A.
- **Action:** The process requests memory allocation. The OS expands the Heap and creates a new virtual memory region starting at Virtual Address 0.
- **Page Table Dump:** 00...00: 00...01
- **Analysis:** The Page Table Entry (PTE) value 0x1 indicates:
  - \* **Present Bit (Bit 0):** 1 (Page is in RAM).
  - \* **Frame Page Number (FPN):** 0 (Derived from the upper bits).

Process A is assigned **Physical Frame 0**.

- **Time Slot 4: Logical Deallocation (Process A)**

- **Event:** libfree:168.
- **Action:** Process A calls `free`.
- **Heap Status:** The memory region is logically marked as "free" and added to the `vm_freerg_list` for future reuse.

- **Time Slot 5: Re-allocation (Process A)**

- **Event:** liballoc:152.
- **Action:** Process A requests allocation again.
- **Heap Status:** The OS checks the `vm_freerg_list`, finds the region freed in Time Slot 4, and reuses it.
- **Mapping:** The mapping remains 00...01 (Physical Frame 0).

- **Time Slot 6: Data Write (Process A)**

- **Event:** libwrite:343.
- **Action:** Data is written to the allocated address.

- **Data Status:** The actual value is stored in **Physical Frame 0**.
- **Time Slot 9: Allocation for New Process (Process B)**
  - **Event:** liballoc:152.
  - **Context Switch:** The PGD changes to 0x73996c001000, indicating a context switch to Process B.
  - **Page Table Dump:** 00...00: 00...1001
  - **Analysis:** The PTE value 0x1001 indicates:
    - \* **Present Bit:** 1.
    - \* **FPN:** 1 (The value 0x1000 at bit offset 12 corresponds to index 1).

Since Frame 0 is occupied by Process A, Process B is assigned the next available resource, **Physical Frame 1**.
- **Time Slot 11 & 13: Free and Write (Process B)**
  - **Time Slot 11 (Free):** Process B frees its memory region. The region enters Process B's private free list.
  - **Time Slot 13 (Write):** Process B writes data. This data is stored in **Physical Frame 1**.
- **Time Slot 19 & 24: Final Deallocation** Both processes perform their final `libfree` operations (Slot 19 for Process A, Slot 24 for Process B), releasing their respective logical memory regions.

#### 3.2.4 Summary of Result Flow

The table below summarizes the correlation between Time Slots, Processes, and Physical Memory status.

| Slot | PGD (Process)    | Event | Physical Frame | Logical Heap Status          |
|------|------------------|-------|----------------|------------------------------|
| 2    | ...7400 (Proc A) | Alloc | Frame 0        | New Region created           |
| 4    | ...7400 (Proc A) | Free  | Frame 0        | Region moved to Free List    |
| 5    | ...7400 (Proc A) | Alloc | Frame 0        | Region reused from Free List |
| 6    | ...7400 (Proc A) | Write | Frame 0        | Data written to Frame 0      |
| 9    | ...6c00 (Proc B) | Alloc | <b>Frame 1</b> | New Region created           |
| 11   | ...6c00 (Proc B) | Free  | Frame 1        | Region moved to Free List    |
| 13   | ...6c00 (Proc B) | Write | Frame 1        | Data written to Frame 1      |

Table 1: Memory Allocation Flow Summary

## 4 Answer question

### 4.1 Question 1

**Question:** What is the mechanism to pass a complex argument to a system call using the limited registers?

**Answer:**

CPU registers are very limited in number and size, they cannot hold large or complex data directly. Instead, the user-space program places the data in its own memory and passes only the memory address (pointer) of that data in a register. When the system call is invoked, the kernel receives the pointer from the register. But this method has a problem that the passed address may be wrong or bad address (pointing to kernel memory or unmapped memory) leading to the kernel crashing. To handle this problem, the kernel checks if the address provided is actually within the User space memory area. Then the kernel copies the data from User space into Kernel Space buffer, the Kernel performs its own safe copy of the data.

### 4.2 Question 2

**Question:** Which mechanisms does the operating system use to manage system calls that become unresponsive?

**Answer:**

Operating systems use a multi-layered approach to manage system calls that become unresponsive. The mechanisms used to manage this fall into three categories:

- Interruption Mechanisms: when the system call blocks, the operating system provides ways for the user or the application to forcefully break. In Linux most of blocking system calls place the process in a task interruptible state, when the process receives a signal like Ctrl + C, the kernel prematurely wakes the process.
- Timeout/Watchdogs: The kernel monitors itself to ensure that a single stuck system call does not freeze the entire computer. Many blocking system calls accept a timeout parameter. The kernel sets an internal timer; if the condition is not satisfied before the timer expires, the call returns with an error. In addition, the operating system runs a background kernel thread that scans all processes in every constant time (120 seconds in Linux), if it finds a process stuck in Uninterruptible Sleep (D state) for more than this duration, it logs a "Hung Task" error to the kernel log with a stack trace.
- Asynchronous Avoidance: Modern operating systems encourage developers to avoid "blocking" system calls entirely to prevent unresponsiveness. Applications can set file descriptors to non-blocking mode (O\_NONBLOCK). This means that if a system call would freeze, it returns immediately with an error instead of waiting. In addition, there are some API (Application Programming Interface) like io\_uring, epoll,... in Linux that allow applications to submit system calls to a queue and retrieve results later, ensuring the main application thread never actually blocks.

### 4.3 Question 3

**Question:** Evaluate the comparative advantages of the scheduling algorithm implemented in this assignment in relation to other scheduling algorithms you have learned. What is the complexity of this algorithm?

**Answer:**

The scheduling algorithm implemented in this assignment is a **Non-Feedback Multilevel Queue (MLQ)** augmented with a **Slot-based Round Robin** mechanism for inter-queue dispatching. The design employs a static array of priority queues, where each priority level is assigned a specific execution quota (slot). Based on the theoretical framework provided in *Operating System Concepts (10th Edition)* and the specific logic in `sched.c`, we present a detailed comparative evaluation and complexity analysis below.

## 1. Comparative Evaluation against Standard Algorithms

### a. Comparison with First-Come, First-Served (FCFS)

- **Theoretical Background:** FCFS is the most basic non-preemptive scheduling algorithm. The CPU is allocated to the process at the head of the ready queue. The critical drawback identified in operating system theory is the "*Convoy Effect*". This phenomenon occurs when a CPU-intensive process occupies the processor for an extended period, causing all subsequent I/O-bound processes to wait in the ready queue. This leads to extremely low I/O device utilization and a high Average Waiting Time.
- **Assignment Implementation Analysis:** The implemented algorithm in the assignment is inherently **Preemptive**.
  - *Mechanism:* The simulation framework in `os.c` enforces a `time_slot`. Even if a process has not completed its CPU burst, it is forced to yield the CPU and is placed back into the `mlq_ready_queue` via `put_mlq_proc()`.
  - *Code Evidence:* In `sched.c`, the scheduler manages `slot[i]`. When a process consumes its time slice, the scheduler is invoked again to pick a new candidate.
- **Comparative Advantage:** Our Slot-based MLQ algorithm completely eliminates the Convoy Effect. By forcing time-sharing (Round Robin within queues), short processes and I/O-bound processes are guaranteed frequent CPU access. This significantly improves the system's **Responsiveness**, making it suitable for interactive environments, whereas FCFS is strictly limited to Batch Systems.

### b. Comparison with Shortest-Job-First (SJF)

- **Theoretical Background (Optimality vs. Prediction):** SJF (or strictly, *Shortest-Next-CPU-Burst*) is theoretically proven to yield the minimum average waiting time. However, it suffers from a fundamental implementation flaw: the OS cannot know the length of the next CPU burst ( $\tau_{n+1}$ ) in advance.

To implement SJF, the system relies on **Exponential Averaging** to predict the future based on past behavior using the formula:

$$\tau_{n+1} = \alpha t_n + (1 - \alpha)\tau_n \quad (1)$$

Where  $t_n$  is the most recent burst,  $\tau_n$  is the past prediction, and  $\alpha$  ( $0 \leq \alpha \leq 1$ ) is the weighting factor.

- **Mechanism and Properties:** Functionally, this mechanism acts as a **Low-Pass Filter**, smoothing out short-term fluctuations (noise) to reveal the underlying trend of the process.

- *Exponential Decay:* Expanding the formula ( $\tau_{n+1} = \alpha t_n + (1 - \alpha)\alpha t_{n-1} + \dots$ ) reveals that the weight of past bursts decays exponentially. This ensures recent history dominates the prediction while distant history fades away.

- *Weighting Control:* The parameter  $\alpha$  controls stability. Typically  $\alpha = 0.5$  is used to balance recent and past history.

- **Scenario Analysis (When is it Good vs. Bad?):**

- **Effective Scenario (Stationary Behavior):** The algorithm excels when processes exhibit **Locality of Reference** (e.g., a process staying in a consistent CPU-bound phase). The formula quickly converges to the true average, and the smoothing effect prevents the scheduler from overreacting to minor random variations in burst length.

- **Degraded Scenario (Instability and Seasonality):**

- \* *Prediction Lag:* If a process behavior shifts suddenly (e.g., I/O-bound  $\rightarrow$  CPU-bound), the prediction lags behind due to the inertia of history, leading to suboptimal scheduling during the transition.
- \* *Seasonality Failure:* Simple exponential smoothing fails to capture cyclic patterns (e.g., 3 short bursts followed by 1 long burst). The algorithm will constantly "chase" the data—predicting high after a peak and low after a valley—always being one step behind the true cycle.
- \* *Outlier Sensitivity:* If  $\alpha \approx 1$ , the system lacks stability. A single random long burst (outlier) causes the prediction to spike, unfairly penalizing the process in the next cycle.

- **Assignment Implementation Analysis (Static Determinism):** Our implemented algorithm prioritizes **Determinism** and **Feasibility** over theoretical optimality.

- *Mechanism:* Calculating exponential averages requires floating-point arithmetic or complex fixed-point math inside the kernel, which incurs per-process overhead. Instead of guessing  $\tau_{n+1}$ , our system relies on the explicit `proc->prio` value defined in the PCB.
- *Code Evidence:* The selection logic in `sched.c` (`get_mfq_proc`) utilizes simple integer comparisons and the `slot` mechanism. This avoids the runtime cost of prediction and ensures predictable behavior ( $O(1)$  decision complexity).

- **Comparative Evaluation:** Unlike SJF, which can degrade if  $\alpha$  is poorly tuned or if processes exhibit erratic/seasonal burst patterns, our Slot-based MLQ guarantees that high-priority processes always run first. Simultaneously, the slot mechanism prevents the starvation that strict priority scheduling usually suffers from, providing a balanced trade-off between responsiveness and fairness without the computational cost of prediction.

**c. Comparison with Shortest-Remaining-Time-First (SRTF)**

- **Theoretical Background:** SRTF is the **preemptive** version of SJF. When a new process arrives, if its next CPU burst is shorter than the remaining time of the currently executing process, the scheduler preempts the current process. Theoretically, SRTF provides an even lower average waiting time than non-preemptive SJF. However, it shares the same implementation hurdles (prediction difficulty) and introduces a new risk: frequent context switching if many short jobs arrive.

- **Assignment Implementation Analysis:**

- *Preemption Logic:* Our algorithm is also preemptive, but the preemption condition is different. SRTF preempts based on *Remaining Time* (requiring prediction), whereas our algorithm preempts based on *Time Slice Expiration* (Round Robin logic) or *Slot Exhaustion*.
- *Responsiveness:* Both algorithms are responsive to new processes. In SRTF, a new short job runs immediately. In our MLQ, a new high-priority job is enqueued at the tail of its priority queue and waits for the current time slice to finish (or runs immediately if the current process yields).

- **Comparative Advantage:** The critical advantage of our design over SRTF is the prevention of **Starvation**. In SRTF, a continuous stream of short processes can indefinitely block long processes (CPU-bound), causing them to never execute. In contrast, our Slot-based MLQ imposes a finite "execution budget" (slots) on every priority level. Even if the highest-priority queue (often containing interactive/short tasks) remains populated, it cannot monopolize the CPU. Once its slots are exhausted ( $\text{slot}[i] == 0$ ), the scheduler is forced to yield to lower-priority queues, guaranteeing fairness for long-running tasks.

**d. Comparison with Pure Priority Scheduling**

- **Theoretical Background:** In standard Priority Scheduling, the CPU is always allocated to the highest priority process. The fatal flaw of this design is "*Indefinite Blocking*" or "*Starvation*". If a steady stream of high-priority processes arrives, low-priority processes may never execute. The standard solution is "*Aging*" (gradually increasing the priority of waiting processes), which adds complexity to the system.
- **Assignment Implementation Analysis (The Slot Mechanism):** This is the most innovative aspect of the assignment's design. It solves starvation **without** changing process priorities (*Aging*).
  - *Code Logic:* See implementation in `sched.c` (lines 65–77).
  - *Operational Theory:* The system assigns a "CPU Budget" to each priority level based on the formula  $Budget = MAX\_PRIO - Prio$ .
  - *Example:* Priority 0 gets 140 slots. Priority 1 gets 139 slots. Even if Priority 0 has infinite tasks, after 140 dispatches, the `slot[0]` becomes 0. The scheduler is mathematically forced to skip Priority 0 and serve Priority 1.
- **Comparative Advantage:** The implementation achieves **Starvation Freedom** through a "Share-based" approach. It guarantees that every priority level, no matter how low, eventually receives a turn when higher levels exhaust their quotas. This is a robust and fairness-oriented improvement over standard Priority Scheduling.

**e. Comparison with Standard Multilevel Queue (Partitioned)**

- **Theoretical Background:** Standard MLQ partitions the ready queue into distinct groups, typically "Foreground" (Interactive) and "Background" (Batch). The scheduling between queues is often Fixed-Priority (leading to starvation) or Time-Slicing with coarse granularity (e.g., 80% CPU for Foreground, 20% for Background).

- **Assignment Implementation Analysis:** Our implementation offers a **Fine-Grained Proportional Share**.
  - *Granularity:* Instead of just 2 or 3 partitions, the system supports `MAX_PRIO` (140) distinct levels.
  - *Distribution:* The slot allocation decreases linearly (140, 139, ..., 1).
- **Comparative Advantage:** This linear distribution creates a smoother degradation of service. It allows the OS to differentiate processes with high precision (e.g., a process with priority 10 is slightly favored over 11), providing more flexibility than the rigid Foreground/Background binary of standard MLQ.

#### f. Comparison with Multilevel Feedback Queue (MLFQ)

- **Theoretical Background:** MLFQ is considered the most general and powerful scheduling algorithm. Its key feature is *Feedback*: processes move between queues. A process using too much CPU moves down; a process waiting too long moves up (Aging). This allows the OS to learn the nature of the process (I/O-bound vs CPU-bound) dynamically.
- **Assignment Implementation Limitation:** The assignment specification explicitly states a non-feedback design.
  - *Code Evidence:* In `put_mlfq_proc()`, a process is always enqueued back to `mlq_ready_queue[proc->prio]`. The `prio` field is never modified by the scheduler.
- **Disadvantage/Trade-off:** The lack of feedback makes the system less adaptive. If a high-priority process becomes CPU-intensive, it continues to enjoy a large slot quota (140 slots), potentially degrading overall system throughput compared to an MLFQ which would demote such a process. However, this simplifies the implementation significantly and reduces the overhead of priority recalculation.

## 2. Algorithmic Complexity Analysis

The performance of the scheduler is determined by the operations on the `queue_t` structure (array-based) and the search logic in `sched.c`. Let  $N$  be the number of processes in a specific queue, and  $K$  be the number of priority levels ( $K = \text{MAX\_PRIO} = 140$ ).

- **Enqueue Operation ( $O(1)$ ):**
  - *Code Reference:* `queue.c: enqueue()`.
  - *Analysis:* The function inserts a process at the tail of the array: `q->proc[q->size] = proc`. This requires direct index access and incrementing the size counter.
  - *Complexity:* Constant time,  $O(1)$ .
- **Dequeue Operation ( $O(N)$ ):**
  - *Code Reference:* `queue.c: dequeue()`.
  - *Analysis:* The scheduler retrieves the process at the head (index 0). Because the queue is implemented as a contiguous memory block (array), removing the first element creates a "gap". The loop `for (i = 0; i < q->size - 1; i++)` is required to shift all remaining  $N-1$  elements to the left to preserve data contiguity.

- *Complexity:* Linear time relative to queue size,  $O(N)$ .
- **Scheduler Selection Logic ( $O(K + N)$ ):**
  - *Code Reference:* `sched.c: get_mlq_proc()`.
  - *Analysis:* The routine employs a linear search across priority levels: `for (int i = 0; i < MAX_PRIO; i++)`. In the worst-case scenario (e.g., when the system is idle or all high-priority slots are exhausted), the loop iterates  $K$  times. Upon finding a valid queue, it calls `dequeue()`.
  - *Calculation:*  $Cost = Search\_Cost + Retrieval\_Cost = O(K) + O(N)$ .
  - **Total Complexity:**  $O(K + N)$ .

#### 4.4 Question 4

**Question:** Discuss the architectural and operational implications of implementing multiple memory segments in this simple OS.

**Answer:**

Implementing multiple memory segments fundamentally structures how the OS manages the virtual address space on top of the physical paging system. The implications can be divided into architectural changes and operational behaviors.

##### 1. Architectural Implications

From an architectural perspective, this adds a logical organization layer above the raw paging mechanism:

- a. **Data Structure Transformation:** In the design, a segment is represented by a `vm_area_struct` inside the per-process `mm_struct`. Implementing multiple segments means that instead of a single VMA (current `vm_id = 0` for "data/heap"), `mm->mmap` becomes a linked list of VMAs. For example:
  - `vm_id = 0`: Code / Read-only data
  - `vm_id = 1`: Heap
  - `vm_id = 2`: Stack
  - `vm_id = 3`: Shared regions, etc.
- b. **Logical Abstraction Layer:** Architecturally, this creates a logical layer on top of the 5-level page tables:
  - The **5-level paging** (PGD-P4D-PUD-PMD-PT) remains unchanged and is solely responsible for translating virtual page numbers to physical frames.
  - The **Segmentation layer** decides *which* virtual ranges exist and *what* they are used for. Each VMA tracks its virtual interval `[vm_start, vm_end]`, current growth point `sbrk`, and a free-region list for local holes.
- c. **Structured Extensibility:** This makes the address space structured and extensible. Different regions can later be assigned different protection policies (e.g., code as read-only, stack as non-executable, or shared segments mapped into multiple processes) while still sharing the same page table root (`mm->pgd`).

## 2. Operational Implications

On the operational side, supporting multiple segments changes the workflow of memory operations:

- a. **Segment-Aware Operations:** All core memory functions (`__alloc`, `__free`, `__read`, `__write`) become segment-aware. They take a `vma_id` parameter to select the correct VMA via `get_vma_by_num(mm, vma_id)` before performing any work. The allocator only searches and updates the free-region list of that specific segment, and `inc_vma_limit` grows only that targeted VMA when more virtual space is needed.
- b. **Bookkeeping vs. Isolation:** There is a trade-off involving increased bookkeeping for better isolation. Every time the kernel extends a segment, it must check for overlaps with other VMAs (`validate_overlap_vm_area`) before calling `vm_map_ram` to map new pages. While this adds overhead (walking the VMA list, extra validation), it guarantees that distinct segments (e.g., Heap vs. Stack) do not collide and grow independently.
- c. **Contextual Paging and Swapping:** While low-level page-table operations in `mm64.c` (allocating frames, filling PTEs, swapping victim pages) remain structurally the same, the context of each page becomes clearer. The OS knows whether a page belongs to code, heap, or stack based on the VMA it falls into. This facilitates segment-specific policies (e.g., preventing code pages from being swapped out or prioritizing stack pages) without altering the core paging algorithms.

In summary, while multiple segments slightly increase complexity in the memory manager, the OS gains a much more organized virtual address space and a natural architectural hook to implement protection and per-segment policies on top of the underlying 5-level paging mechanism.

### 4.5 Question 5

**Question:** What will happen if we divide the address to more than 2 levels in the paging memory management system?

**Answer:**

When the address is divided to more than 2 levels in the paging memory management system, there will be some key consequences:

#### 1. Advantages

- a. **Reduced Page Table Size:** By dividing the address space into more levels, the system can allocate page table entries more efficiently, only creating page table pages for regions that are actually in use. This reduces the overall memory footprint of page tables compared to a flat or two-level structure that might require allocating entries for unused address ranges.
- b. **Fine-grained Control:** Additional levels provide more granular control over memory management, allowing the operating system to manage different regions of the address space with higher utilization and more efficient allocation.

#### 2. Challenges

- a. **Increased Translation Overhead:** Each additional level requires an extra memory access during address translation, which increases the time needed to translate virtual addresses to physical addresses. This overhead can impact system performance, especially for memory-intensive applications that frequently access different pages.
- b. **Implementation Complexity:** Managing multiple levels of page tables requires more complex algorithms for page table creation, modification, and cleanup, making the memory management subsystem harder to design, implement, and maintain. The complexity also increases the potential for bugs and makes debugging more difficult.
- c. **Potential Fragmentation:** With more levels, the page table structure itself can become fragmented across physical memory (due to partially filled tables), leading to inefficient memory usage. This fragmentation can also complicate memory allocation strategies for the page tables themselves.

## 4.6 Question 6

**Question:** What are the advantages and disadvantages of segmentation with paging?

**Answer:**

Segmentation with paging is a memory management scheme that combines the benefits of both segmentation and paging. It organizes memory into logical segments, each further divided into fixed-size pages, offering both flexibility and efficient use of physical memory.

### 1. Advantages

- a. **Eliminates External Fragmentation:** Paging ensures that physical memory is divided into fixed-size frames, and since segments are broken into pages, there is no external fragmentation in physical memory. This eliminates the need for contiguous physical memory allocation that plagues pure segmentation schemes.
- b. **Flexible Memory Management:** The system can efficiently handle variable-sized logical divisions while maintaining fixed-size physical allocations.
- c. **Flexible Protection and Sharing:** Protection can be applied per segment (e.g., code = read-only, stack = read/write), enabling fine-grained control over memory access and enforcing security boundaries. Segments can be shared between processes (e.g., shared libraries) while paging avoids fragmentation, minimizing duplication and improving memory efficiency.
- d. **Allows Large Logical Address Spaces:** Each segment can be large, even larger than physical memory, because paging supports virtual memory. This enables processes to work with address spaces that exceed available physical RAM through demand paging and swapping mechanisms.

### 2. Disadvantages

- a. **Increased Complexity in Hardware/OS:** Managing both segments and pages requires a segment table and a page table for each segment, with two-level translation increasing system complexity. This requires additional hardware support and more complex software algorithms, increasing the risk of errors and maintenance overhead.

- b. **Higher Overhead in Address Translation:** Address translation involves multiple steps: logical address → segment → page → frame → physical address, which is more complex than pure paging or pure segmentation. The system may need Translation Lookaside Buffers (TLBs) for both segment and page entries, introducing additional memory access latency and overhead in systems with limited resources.
- c. **More Memory Needed for Table Storage:** The system must store a segment table and multiple page tables (one per segment), which can be costly for programs with many segments. This increases the memory overhead for management structures compared to simpler memory management schemes.
- d. **Internal Fragmentation Still Exists:** Each page is fixed size, so the last page of a segment may have unused space, leading to internal fragmentation within pages. While this is typically small compared to external fragmentation, it still represents wasted memory.

In summary, segmentation with paging effectively balances flexibility, protection, and memory efficiency by eliminating external fragmentation in physical memory and supporting logical program structures. However, this comes at the cost of added complexity in hardware and software design, increased memory overhead for table storage, higher address translation overhead, and the persistence of internal fragmentation. The hybrid system requires maintaining both segment tables and page tables, with operations spanning both abstraction levels, making the implementation more complex than pure paging or pure segmentation systems.

#### 4.7 Question 7

**Question:** What happens if the synchronization is not handled in your Simple OS? Illustrate the problem of your simple OS (assignment outputs) by example if you have any.

**Answer:**

The Simple OS uses a multi-threaded architecture where multiple CPU threads (from `cpu_routine()` in `os.c`) and a loader thread (from `ld_routine()`) execute concurrently, sharing critical data structures. Without proper synchronization, severe system failures would occur.

##### 1. Synchronization Mechanisms in Simple OS

The OS implements three main synchronization mechanisms:

- **Timer Synchronization (`timer.c`):** Uses mutexes (`event_lock`, `timer_lock`) and condition variables (`event_cond`, `timer_cond`) to coordinate time slot advancement. The `timer_routine()` waits for all devices to complete before incrementing `_time`, and `next_slot()` synchronizes CPU/loader threads with the timer.
- **Scheduler Synchronization (`sched.c`):** Uses `queue_lock` mutex to protect shared scheduler state including `mlq_ready_queue[]`, `slot_remaining[]`, `current_queue_index`, and `running_list`. Functions like `get_mlq_proc()`, `put_mlq_proc()`, and `add_mlq_proc()` are all protected by this mutex.
- **Memory Management Synchronization (`libmem.c`):** Uses `mmvm_lock` mutex to protect VM area structures and page table operations. Functions like `__alloc()`, `__write()`, and `__read()` acquire this lock before accessing memory management structures.

## 2. Consequences of Missing Synchronization

**a. Ready Queue Corruption:** Without `queue_lock` protection, multiple CPU threads could simultaneously call `get_m1q_proc()` and `put_m1q_proc()`, leading to race conditions on `m1q_ready_queue[]`. This could result in:

- **Lost Processes:** A process dequeued by one CPU thread might be simultaneously dequeued by another, causing one CPU to receive NULL or an invalid process pointer.
- **Duplicate Process Execution:** The same process could be dispatched to multiple CPUs simultaneously, leading to inconsistent state and data corruption.
- **Queue Structure Corruption:** Concurrent enqueue/dequeue operations could corrupt the queue's internal structure (size, pointers), causing segmentation faults or infinite loops.

**b. Scheduler State Inconsistency:** The scheduler maintains state variables `slot[]` and `m1q_ready_queue` that must be updated atomically. Without synchronization:

- **Incorrect Priority Scheduling:** Multiple threads reading and updating `m1q_ready_queue` simultaneously could cause the scheduler to skip priority levels or process them out of order, violating the MLQ scheduling policy.
- **Slot Counter Corruption:** The `slot[MAX_PRIO]` counter could be incorrectly decremented by multiple threads, causing processes to receive incorrect time slices or be prematurely moved to lower priority queues.

**c. Timer Desynchronization:** The timer mechanism in `timer_routine()` coordinates all threads by waiting for each device's `done` flag before advancing time slots. Without proper mutex protection:

- **Time Slot Skips:** If multiple threads call `next_slot()` simultaneously, the timer might advance before all threads complete their work, causing some threads to miss time slots or execute out of order.
- **Deadlock in Timer:** The condition variable waits in `timer_routine()` and `next_slot()` (lines 61-63 below) could deadlock if mutexes are not properly acquired, freezing the entire system.
- **Inconsistent Time Perception:** Different CPU threads might observe different values of `current_time()` if the timer advances while threads are reading it, leading to incorrect process scheduling decisions.

**d. Memory Management Corruption:** Without `mmvm_lock` protection, concurrent memory operations could corrupt critical structures:

- **VM Area List Corruption:** Multiple threads accessing `mm_struct->mmap` (the VM area linked list) simultaneously could corrupt list pointers, causing processes to access wrong memory regions or crash when traversing the list.
- **Race Conditions:** Concurrent calls to `pte_set_fpn()` and `pte_get_entry()` could result in reading partially updated page table entries, causing incorrect address translations and memory access violations.

- **Symbol Table Corruption:** The `symrgtbl[]` array could be corrupted if multiple threads allocate memory regions simultaneously, leading to processes accessing incorrect memory addresses or overwriting each other's data.

### 3. System-Wide Failures

Without synchronization, the Simple OS would experience cascading failures:

- **Non-Deterministic Behavior:** The same input could produce different outputs on each run due to race conditions, making debugging and testing extremely difficult.
- **System Crashes:** Segmentation faults from corrupted pointers, null pointer dereferences, or invalid memory accesses would cause the entire OS simulation to crash.
- **Incorrect Execution Results:** Even if the system doesn't crash, processes might execute incorrectly due to corrupted state, producing wrong outputs that are difficult to detect.

### 4. Experimental Demonstration: Removing Timer Synchronization

To demonstrate the consequences of missing synchronization, we removed all mutex protection from `timer_routine()` (`timer.c`) and `next_slot()` functions. This eliminates the critical synchronization mechanism that coordinates time slot advancement across all CPU threads and the loader thread.

#### a. Experimental Setup

The mutexes `event_lock` and `timer_lock` (defined per device in `struct timer_id_t`), along with their associated condition variables (`event_cond` and `timer_cond`), were removed from the timer synchronization code. Specifically:

- In `timer_routine()`: Removed mutex locks at lines 32, 44, 53, 56 that protect the `done` flag checks and updates
- In `next_slot()`: Removed mutex locks at lines 69, 72, 75, 82 that coordinate between threads and the timer

This allows multiple threads (CPU threads from `cpu_routine()` in `os.c` and the loader thread from `ld_routine()`) to access and modify the timer state (`_time` of `timer.c`) and device `done` flags concurrently without any protection.

#### b. Observed Output

The following output was captured from a test run with timer synchronization removed:

```
1 Time slot 0
2 ld_routine
3 Loaded a process at input/proc/p0s, PID: 1 PRIO: 0
4
5 Time slot 1
6 CPU 0: Dispatched process 1
7 Loaded a process at input/proc/p1s, PID: 2 PRIO: 15
8 CPU 1: Dispatched process 2
9
10 Time slot 2
```

```
11 liballoc:152
12 --- Page Table Dump ---
13 print_pgtbl:
14 PGD=0x7fbab8001000 P4D=0x7fbab8003000 PUD=0x7fbab8005000 PMD=0x7fbab8007000
15 PT=0x7fbab8009000
16 0000000000000000: 0000000000000001
17
18 Time slot 3
19 liballoc:152
20 --- Page Table Dump ---
21 print_pgtbl:
22 PGD=0x7fbab8001000 P4D=0x7fbab8003000 PUD=0x7fbab8005000 PMD=0x7fbab8007000
23 PT=0x7fbab8009000
24 0000000000000000: 0000000000000001
25 Loaded a process at input/proc/p1s, PID: 3 PRIO: 0
26
27 Time slot 4
28 libfree:168
29 --- Page Table Dump ---
30 print_pgtbl:
31 PGD=0x7fbab8001000 P4D=0x7fbab8003000 PUD=0x7fbab8005000 PMD=0x7fbab8007000
32 PT=0x7fbab8009000
33 0000000000000000: 0000000000000001
34
35 Time slot 5
36 liballoc:152
37 --- Page Table Dump ---
38 print_pgtbl:
39 PGD=0x7fbab8001000 P4D=0x7fbab8003000 PUD=0x7fbab8005000 PMD=0x7fbab8007000
40 PT=0x7fbab8009000
41 0000000000000000: 0000000000000001
42
43 Time slot 6
44 libwrite:343
45 --- Page Table Dump ---
46 print_pgtbl:
47 PGD=0x7fbab8001000 P4D=0x7fbab8003000 PUD=0x7fbab8005000 PMD=0x7fbab8007000
48 PT=0x7fbab8009000
49 0000000000000000: 0000000000000001
50
51 Loaded a process at input/proc/p0s, PID: 4 PRIO: 0
52 CPU 0: Put process 1 to run queue
53 CPU 1: Put process 2 to run queue
54 CPU 1: Dispatched process 3
55 CPU 0: Dispatched process 4
```

```
50
51 Time slot    7
52
53 Time slot    8
54 liballoc:152
55 --- Page Table Dump ---
56 print_pgtbl:
57 PGD=0x7fbab800b000 P4D=0x7fbab800d000 PUD=0x7fbab800f000 PMD=0x7fbab8011000
      PT=0x7fbab8013000
58 0000000000000000: 0000000000001001
59
60 Time slot    9
61 liballoc:152
62 --- Page Table Dump ---
63 print_pgtbl:
64 PGD=0x7fbab800b000 P4D=0x7fbab800d000 PUD=0x7fbab800f000 PMD=0x7fbab8011000
      PT=0x7fbab8013000
65 0000000000000000: 0000000000001001
66
67 Time slot    10
68 libfree:168
69 --- Page Table Dump ---
70 print_pgtbl:
71 PGD=0x7fbab800b000 P4D=0x7fbab800d000 PUD=0x7fbab800f000 PMD=0x7fbab8011000
      PT=0x7fbab8013000
72 0000000000000000: 0000000000001001
73
74 Time slot    11
75 liballoc:152
76 --- Page Table Dump ---
77 print_pgtbl:
78 PGD=0x7fbab800b000 P4D=0x7fbab800d000 PUD=0x7fbab800f000 PMD=0x7fbab8013000
      0000000000000000: 0000000000001001
79
80 Time slot    12
81 libwrite:343
82 --- Page Table Dump ---
83 print_pgtbl:
84 PGD=0x7fbab800b000 P4D=0x7fbab800d000 PUD=0x7fbab800f000 PMD=0x7fbab8011000
      PT=0x7fbab8013000
85 0000000000000000: 0000000000001001
86
87 Time slot    13
88 CPU 1: Put process 3 to run queue
```

```
90 CPU 1: Dispatched process 1
91 libread:304
92 CPU 0: Put process 4 to run queue
93 CPU 0: Dispatched process 3
94
95 Time slot 14
96
97 Time slot 15
98
99 Time slot 16
100
101 Time slot 17
102 CPU 0: Processed 3 has finished
103 CPU 0: Dispatched process 4
104 libread:304
105
106 Time slot 18
107
108 Time slot 19
109 CPU 1: Put process 1 to run queue
110 CPU 1: Dispatched process 1
111 libfree:168
112 --- Page Table Dump ---
113 print_pgtbl:
114 PGD=0x7fbab8001000 P4D=0x7fbab8003000 PUD=0x7fbab8005000 PMD=0x7fbab8007000
          PT=0x7fbab8009000
115 0000000000000000: 0000000000000001
116
117 Time slot 20
118
119 Time slot 21
120 CPU 1: Processed 1 has finished
121 CPU 1: Dispatched process 2
122
123 Time slot 22
124
125 Time slot 23
126 CPU 0: Put process 4 to run queue
127 CPU 0: Dispatched process 4
128 libfree:168
129 --- Page Table Dump ---
130 print_pgtbl:
131 PGD=0x7fbab800b000 P4D=0x7fbab800d000 PUD=0x7fbab800f000 PMD=0x7fbab8011000
          PT=0x7fbab8013000
```

```
132 0000000000000000: 0000000000001001
133
134 Time slot 24
135
136 Time slot 25
137 CPU 0: Processed 4 has finished
138 CPU 1: Processed 2 has finished
139 CPU 0 stopped
140 CPU 1 stopped
```

### c. Analysis of the Results

While this particular run completed with no errors, several critical observations highlight the dangers of missing synchronization:

- **Concurrent Operations in Same Time Slot:** At time slot 6, we observe multiple events occurring simultaneously within the same time slot: `libwrite:343`, process loading (PID 4), CPU 0 putting process 1 to run queue, CPU 1 putting process 2 to run queue, and both CPUs dispatching new processes (3 and 4). Without timer synchronization, the loader thread, CPU 0, and CPU 1 are all executing concurrently. Without mutexes, threads can read and modify the `done` flag and `_time` variable concurrently, causing operations from different logical time slots to interleave.
- **Potential Page Table Race Conditions:** While the page table dumps appear consistent in this run, without proper synchronization, concurrent access to page table structures could lead to race conditions. Multiple threads could simultaneously call `pte_set_fpn()` or `pte_get_entry()` (in `mm64.c`), which traverse the 5-level page table hierarchy via `get_page_table_entry()`. Without mutex protection, one thread could be updating a page table entry (via `get_next_level()` allocating new directory levels) while another thread reads it, potentially observing partially updated pointers or inconsistent states. The fact that this run shows consistent page tables does not guarantee correctness—race conditions are non-deterministic and may manifest differently in subsequent runs.
- **Non-Deterministic Execution Order:** The output shows that operations from different threads can appear in any order within a time slot. For example, at time slot 6, the process loading, CPU queue operations, and dispatches all occur in an unpredictable sequence. This non-determinism makes it impossible to reproduce bugs consistently, as the exact interleaving depends on thread scheduling by the operating system.
- **Time Slot Coordination Failure:** The `timer_routine()` function (lines 20-64 of `timer.c`) is designed to:
  - i. Wait for all devices to set their `done` flag (lines 30-45)
  - ii. Increment `_time` (line 48)
  - iii. Signal all devices to continue (lines 51-57)

Without mutex protection, multiple threads can call `next_slot()` simultaneously, and the timer thread can advance `_time` while other threads are still reading it or setting their `done` flags. This breaks the atomicity of time slot transitions, causing threads to operate on inconsistent time values.

- **Ambiguous Results:** The fact that this run completed with no errors demonstrates a critical characteristic of race conditions: they are **non-deterministic**. The absence of visible errors in one run does not guarantee correctness. While this particular run shows consistent page table structures and no obvious data corruption, the concurrent operations observed (especially at time slots 6 and 13) indicate that threads are executing without proper coordination. Subsequent runs with the same input might:
  - Crash due to corrupted pointers or invalid memory accesses
  - Deadlock if threads wait on condition variables without proper mutex protection (the condition variable waits in `timer_routine()` and `next_slot()` require mutexes to function correctly)
  - Exhibit different execution orders, making debugging and testing impossible

#### d. Why This Demonstrates the Problem

The removal of timer synchronization mutexes creates a **time-of-check to time-of-use (TOC-TOU)** vulnerability and breaks the atomicity of time slot transitions. The synchronization protocol in `timer.c` works as follows:

- i. Each device thread calls `next_slot()` which sets `done = 1` and signals the timer (lines 69-72)
- ii. The device then waits for the timer to advance (lines 75-82)
- iii. The timer waits for all devices to set `done = 1` (lines 30-45)
- iv. The timer increments `_time` and resets all `done` flags (lines 48, 54-56)
- v. The timer signals all devices to continue (line 55)

Without mutex protection, this protocol breaks:

- Thread A might read `_time = 5` and set `done = 1`
- Thread B might read `_time = 5` and also set `done = 1`
- The timer thread might increment `_time` to 6 while Thread A is still in the middle of its operation
- Thread A completes its operation thinking it's in time slot 5, but the timer has already advanced
- Multiple threads can observe different values of `_time` simultaneously

The concurrent operations observed at time slots 6 and 13 provide concrete evidence of this race condition: multiple threads are executing operations simultaneously without proper coordination, breaking the atomicity of time slot transitions.

In summary, synchronization is essential for the Simple OS's correctness. The mutex-protected critical sections in `timer.c` (protecting `_time` and `done` flags), `sched.c` (protecting queue operations), and `libmem.c` (protecting memory management structures) ensure that shared data structures are accessed atomically, preventing race conditions, data corruption, and system failures. The experimental removal of timer synchronization



demonstrates that even when a run appears successful, the underlying race conditions create non-deterministic behavior. The concurrent operations observed in the output show that threads are not properly coordinated, which can lead to subtle bugs, data corruption, and unpredictable system behavior in subsequent runs.

## 5 Overall

In this assignment, we undertook the design and implementation of the core components for a simple Operating System simulation, focusing on a Symmetric Multiprocessing (SMP) environment. The project integrated three critical modules: a Slot-based Scheduler, a Hybrid Memory Management system, and a Synchronization mechanism.

For process management, we implemented a Slot-based Multi-Level Queue (MLQ) scheduling algorithm. Unlike standard priority scheduling which is prone to starvation, our design introduces a dynamic slot mechanism ( $slot = MAX\_PRIO - prio$ ) combined with Round-Robin execution within each queue. This approach is designed to balance honoring process priority with maintaining fairness, aiming to allow lower-priority tasks to eventually receive CPU resources even under heavy load. The implementation addresses the logic of context switching and load balancing across multiple simulated CPUs.

Regarding memory management, we constructed a hybrid architecture that combines Segmentation with 5-Level Paging. A focal point of this implementation is the simulation of the hardware page walk mechanism. We programmed the traversal logic that iteratively resolves a 64-bit virtual address through the five hierarchical levels: PGD → P4D → PUD → PMD → PT. By utilizing bit-masking and shifting operations to extract directory indices at each level, our system replicates the translation process of modern MMUs. This mechanism is intended to enable the OS to handle sparse memory allocation within a massive 128 PiB address space while establishing isolation between user and kernel modes.

Furthermore, we addressed the challenge of concurrency in a multi-threaded architecture. We identified potential race conditions that could lead to data corruption or system instability when multiple CPUs and the loader access shared resources simultaneously. By applying synchronization mechanisms, specifically Mutex locks on the ready queue, memory structures, and the global timer, we sought to preserve the atomicity of critical operations and promote the deterministic behavior of the system.

In conclusion, this assignment has provided a comprehensive hands-on experience in building OS kernel components. It bridged the gap between theoretical concepts, such as virtual memory translation, process scheduling, thread safety and practical system programming. The insights gained from handling the complexity of the 5-level page walking algorithm and SMP synchronization serve as a foundation for understanding the architecture of modern, high-performance operating systems.

## 6 Source Code

1. **Scheduler:** [Link GitHub](#)
2. **Memory Management:** [Link GitHub](#)