Previously, two assumptions were made:
- each process occupies a contiguous memory region ❌
- physical memory is large enough to contain one or more processes with complete memory space ❌
If both assumptions are dropped, a few issues arise:
- what if the logical memory space of process is more than physical memory?
- what if the same program is executed on a computer with less physical memory?
General Idea
Effectively, virtual memory is needed when the logical memory space exceeds the available physical memory.
Generally, the logical address space is split into small chunks: some chunks reside in physical memory, while the others are stored on secondary storage.
Pros
- completely separates logical memory addresses from physical memory
- allows for more efficient use of physical memory
- allows more processes to reside in memory, improving CPU utilisation
Extended Paging Scheme
Two page types are added:
- memory resident (page is in physical memory)
- non-memory resident (page is in secondary storage)
A bit is stored in each page table entry to discern whether the page is in physical memory.
Page fault
A page fault occurs when the CPU tries to access a non-memory-resident page.
The CPU can only access memory-resident pages, so a non-memory-resident page has to be brought into physical memory before it can be accessed.
Steps to access page
1. Check page table
2. If the page is memory resident, access the physical memory location
3. Otherwise, page fault: trap to OS
	1. Locate the page in secondary storage
	2. Load the page into physical memory
	3. Update the page table
	4. Go back to step 1 to retry the access
```mermaid
flowchart TD
    step1(Check page table) --> step2{Is page memory resident}
    step2 -->|yes| finish(Access physical memory location)
    step2 -->|no| step3(Page fault: trap to OS)
    step3 --> step4(Locate page in secondary storage)
    step4 --> step5(Load page into physical memory)
    step5 --> step6(Update page table)
    step6 --> step1
```
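The access steps above can be sketched as a small simulation. This is a hedged Python sketch, not from the notes: the page-table layout, the backing-store dict, and the assumption that a free frame always exists are all illustrative.

```python
# Minimal sketch of page access with a resident bit in each page table entry.
SECONDARY_STORAGE = {0: "page0-data", 1: "page1-data"}  # hypothetical backing store

page_table = {0: {"resident": False, "frame": None},
              1: {"resident": False, "frame": None}}
physical_memory = {}   # frame number -> page data
next_free_frame = 0

def access(page_num):
    """Return the data for page_num, handling a page fault if needed."""
    global next_free_frame
    entry = page_table[page_num]
    if not entry["resident"]:                 # page fault: trap to OS
        data = SECONDARY_STORAGE[page_num]    # locate page in secondary storage
        frame = next_free_frame               # assumes a free frame exists
        next_free_frame += 1
        physical_memory[frame] = data         # load page into physical memory
        entry["resident"] = True              # update page table
        entry["frame"] = frame
    return physical_memory[entry["frame"]]    # access physical memory location

print(access(0))  # first access faults, then loads the page
print(access(0))  # second access hits: page is now memory resident
```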
Thrashing
Thrashing occurs when most memory accesses result in page faults.
Locality Principles
How do we know that thrashing is unlikely to happen?
How can we confirm that after a page is loaded, it is likely to be used for future accesses?
Most programs exhibit locality behaviour: most of the time is spent in a relatively small part of the code, and within a time period, accesses are made to only a relatively small part of the data.
Temporal locality
Memory address likely to be used again
Thus, to exploit this principle, we keep a page loaded after it has been used, because it is likely to be accessed again in the near future.
Spatial locality
Memory addresses close to a used address are likely to be used.
Thus, to exploit this principle, we can also load the pages contiguous to a used page, as they are likely to be accessed in the near future.
Demand Paging
When a new process is launched, should all text (instruction), data (global vars) pages be allocated in RAM?
The considerations are as follows:
- there is a large startup cost if there are a large number of pages to allocate and initialise
- we need a way to find out which pages to load if we don’t load all the pages
- footprint of process in physical memory should be minimised to allow more processes to be memory resident
General Idea
The page is only loaded when there is a page fault, and the process starts with no memory resident page.
Pros
- fast startup time for new process
- small memory footprint
Cons
- process can appear sluggish at start, due to multiple page faults
- cascading effect of page faults on other processes
Page Table Structure
Page table information is kept with the process information and takes up physical memory space. Modern computer systems provide a huge logical address space, which requires a huge number of pages. Each page requires a page table entry, which in turn results in a huge page table.
Problems with huge page table
- high overhead
- fragmented page table (as the page table is so big, it will have to occupy multiple memory pages)
Direct Paging
Keep all entries in single table
Given $2^P$ pages in the logical memory space:
- $P$ bits are needed to specify one unique page
- $2^P$ page table entries are needed, each with a physical frame number and additional information bits
32-bit system
If the page size is 4KiB, which is $2^{12}$ bytes, the page number takes $32 - 12 = 20$ bits:
- every page table entry is 4 bytes
- the page table size is $2^{20} \times 4$ bytes = 4MiB
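The arithmetic above can be checked with a short computation (numbers are from the 32-bit example; the variable names are mine):

```python
# Direct paging overhead on a 32-bit system with 4 KiB pages.
address_bits = 32
page_offset_bits = 12                # 4 KiB page = 2^12 bytes
pte_size = 4                         # bytes per page table entry

page_number_bits = address_bits - page_offset_bits   # 20 bits
num_entries = 2 ** page_number_bits                  # 2^20 entries
table_size = num_entries * pte_size                  # bytes

print(page_number_bits)              # 20
print(table_size // (1024 * 1024))   # 4 (MiB)
```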
2-Level Paging
Note that the process may not use the entire virtual memory space, and thus a full page table is a waste of space.
Split full page table into regions
In 2-Level paging:
- the full page table is split into regions (similar to splitting logical memory space into pages)
- only a few regions are used, and new regions are allocated dynamically
- a directory is used to keep track of the regions
Thus, it is implemented by splitting the page table into smaller page tables, each identified by a page table number.
Given a page table with $2^P$ entries:
- splitting into $2^M$ smaller page tables, $M$ bits are needed to identify one smaller page table
- each smaller page table contains $2^{P-M}$ entries
- a single page directory is needed to keep track of the smaller page tables, with $2^M$ indices
Pros
- there can be empty entries in the page directory
- corresponding pages need not be allocated
- saves space
- less overhead
- overhead is just the page directory, plus the allocated page tables
32-bit system
With the direct paging scheme, where the page size is 4KiB ($2^{12}$ bytes):
- every page table entry is 4 bytes
- the page table size is $2^{20} \times 4$ bytes = 4MiB
With the 2-level paging scheme:
- with the page directory having $2^9$ entries,
	- the page directory has an overhead of $2^9 \times 4$ bytes = 2KiB
- with the page tables thus having $2^{11}$ entries each, and only 3 page tables in use,
	- the page tables have a total overhead of $3 \times 2^{11} \times 4$ bytes = 24KiB
- the total overhead is 26KiB, significantly less.
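The overhead comparison can be reproduced with a quick computation. This assumes the 20 page-number bits are split 9 (directory) + 11 (page table), which is consistent with the 2KiB and 24KiB figures in the example:

```python
# 2-level paging overhead for a 32-bit system with 4 KiB pages,
# assuming a 9-bit directory index and an 11-bit page table index.
pte_size = 4                        # bytes per entry
directory_entries = 2 ** 9
page_table_entries = 2 ** 11
tables_in_use = 3                   # example: only 3 regions allocated

directory_overhead = directory_entries * pte_size                 # 2 KiB
tables_overhead = tables_in_use * page_table_entries * pte_size   # 24 KiB
total = directory_overhead + tables_overhead

print(directory_overhead // 1024)   # 2
print(tables_overhead // 1024)      # 24
print(total // 1024)                # 26
```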
Inverted Page Tables
As there is a page table for each process, the total page table overhead grows with the number of processes.
Keep a single mapping of physical frame to <pid, page #>
Instead of keeping a page table per process, a single table maps each physical frame to the process using it - each physical frame is uniquely identified by a process ID and page number pair.
In an inverted page table, the entries are ordered by frame number rather than page number, so looking up a page requires searching the whole table.
Pros
- one table for all processes (saves overhead)
Cons
- slow translation (due to searching the whole table)
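The lookup can be sketched as a linear search over frame entries (a hedged Python sketch; the table contents and function name are illustrative):

```python
# Sketch of an inverted page table: one entry per physical frame,
# each holding the (pid, page#) pair currently occupying that frame.
inverted_table = [
    (1, 7),   # frame 0 holds page 7 of process 1
    (2, 0),   # frame 1 holds page 0 of process 2
    (1, 3),   # frame 2 holds page 3 of process 1
]

def translate(pid, page_num):
    """Linear search for the frame holding (pid, page_num); None if not resident."""
    for frame, owner in enumerate(inverted_table):
        if owner == (pid, page_num):
            return frame
    return None   # not resident: page fault

print(translate(1, 3))   # 2
print(translate(2, 9))   # None -> page fault
```

The whole-table search is exactly why translation is slow compared to a per-process page table indexed directly by page number.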
Page Replacement Algorithms
When there is no more free physical memory frame during a page fault, a memory page needs to be freed before the requested page can be accessed and added to the page table.
When a page is evicted (freed), there are two possibilities:
- if the page is clean: it is not modified, and thus, there is no need to write back
- if the page is dirty: it is modified, and thus, there is a need to write back
For page replacement algorithms, only the page number is important; memory references are modeled as memory reference strings (sequences of page numbers).
Evaluation
$T_{access} = (1 - p) \times T_{mem} + p \times T_{fault}$
where:
- $p$ is the probability of a page fault
- $T_{mem}$ is the access time for a memory-resident page
- $T_{fault}$ is the access time if a page fault occurs
As $T_{fault} \gg T_{mem}$, $p$ has to be very low for the average access time to stay acceptable.
Thus, to evaluate an algorithm, we want one that reduces the total number of page faults.
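A quick computation shows how a tiny fault probability dominates the average access time. The timing numbers below are my own illustrative assumptions (memory access in the ~100ns range, a disk-backed fault in the ~10ms range), not figures from the notes:

```python
# Effective access time: T = (1 - p) * T_mem + p * T_fault.
T_mem = 100e-9     # 100 ns for a memory-resident access (assumed)
T_fault = 10e-3    # 10 ms if a page fault occurs (assumed, disk-backed)

def effective_access_time(p):
    """Average access time given page fault probability p."""
    return (1 - p) * T_mem + p * T_fault

print(effective_access_time(0.0))    # ~100 ns: pure memory speed
print(effective_access_time(0.001))  # one fault per thousand accesses already
                                     # pushes the average to ~10 microseconds
```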
Optimal Page Replacement
Replace page that will not be used again for the longest period of time
This guarantees the minimum number of page faults.
As this needs future knowledge of memory references, this is NOT realizable.
However, this is still useful as base of comparison for other algorithms. The closer the result is to one given by OPT, the better the algorithm.
FIFO Page Replacement
Evict oldest memory page
Implementation
- OS maintains a queue of resident page numbers
- Remove the first page in queue if replacement is needed
- Update queue during page fault trap
Pros
Simple to implement
Belady's anomaly
Given more physical frames, the number of page faults increases instead of decreasing.
Cons
- Belady’s anomaly
- does not exploit temporal locality
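Belady's anomaly can be demonstrated with a small FIFO simulator. The reference string below is the standard textbook example; the code is a sketch, not from the notes:

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults when running FIFO replacement over a reference string."""
    frames = deque()    # oldest page at the left
    resident = set()
    faults = 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(frames) == num_frames:          # no free frame: evict oldest
                resident.discard(frames.popleft())
            frames.append(page)
            resident.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: more frames, yet more faults
```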
Least Recently Used
Make use of temporal locality, by replacing the page that has not been used in the longest time.
This algorithm expects that if a page is not reused in a short time window, it most likely will not be used again. (This approximates the OPT algorithm)
Pros
- does not suffer from Belady’s anomaly
Cons
- implementation needs substantial hardware support
- need to keep track of last access time
Possible implementation approaches
Approach A: Counter
Use a logical time counter, incremented on every memory reference. The page table entry contains a time-of-use field, which is updated every time the page is referenced. When a page needs to be freed, the page with the smallest time-of-use is replaced.
Cons
- Need to search through all pages
- Time-of-use constantly increasing, with a possible overflow
Approach B: Stack
A stack of page numbers is maintained, and when a page is referenced,
- the page is removed from the stack (if existing)
- page is pushed to top of stack
When a page needs to be freed,
- replace page at bottom of stack
Cons
- Entries can be removed from anywhere in the stack, so it is not a pure stack.
- Hard to implement in hardware
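The stack approach maps naturally onto an ordered dictionary in software (a sketch; `OrderedDict.move_to_end` plays the role of moving a referenced page to the top of the stack):

```python
from collections import OrderedDict

def lru_faults(refs, num_frames):
    """Count page faults under LRU; the OrderedDict acts as the 'stack'
    (most recently used page at the end, least recently used at the front)."""
    stack = OrderedDict()
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)        # re-reference: move to top of stack
        else:
            faults += 1
            if len(stack) == num_frames:
                stack.popitem(last=False)  # evict least recently used (bottom)
            stack[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 3))   # 10
print(lru_faults(refs, 4))   # 8: more frames, fewer faults (no Belady's anomaly)
```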
CLOCK
Modified FIFO to give a second chance to pages that are accessed
Each page table entry now maintains a reference bit, recording whether the page has been accessed since the bit was last cleared.
Algorithm
- Select oldest FIFO page
- If reference bit is 0, page is replaced
- Otherwise, page is given a 2nd chance, and reference bit is cleared to 0.
- Repeat Step 1, with the next oldest FIFO page
```mermaid
flowchart TD
    s1(select oldest FIFO page) --> s2{check reference bit}
    s2 -->|0| s2a(replace page)
    s2 -->|1| s3(clear reference bit of page to 0)
    s3 --> s4(select next oldest FIFO page)
    s4 --> s2
```
Cons
- degenerates into FIFO algorithm when reference bit is 1 for all pages
Frame Allocation
When there are $N$ physical memory frames and $M$ processes, what is the best way to distribute the frames?
Simple Approaches
Equal allocation
Each process gets $N / M$ frames.
Proportional allocation
Given that processes differ in size, each process $p$ gets $\frac{size_p}{\sum_i size_i} \times N$ frames.
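Both allocation schemes are simple to compute (a sketch; the frame count and process sizes below are made-up numbers, and the integer division ignores rounding of leftover frames):

```python
def equal_allocation(n_frames, n_procs):
    """Each process gets an equal share of the frames."""
    return [n_frames // n_procs] * n_procs

def proportional_allocation(n_frames, sizes):
    """Each process gets frames in proportion to its size."""
    total = sum(sizes)
    return [size * n_frames // total for size in sizes]

print(equal_allocation(64, 4))                     # [16, 16, 16, 16]
print(proportional_allocation(64, [10, 30, 120]))  # [4, 12, 48]
```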
Implicit Assumptions
The frame allocation approach implicitly determines the scope of page replacement: whether a victim frame is selected locally (from the faulting process only) or globally (from any process).
Local replacement
Victim page selected among pages of process causing page fault
If an algorithm follows local replacement:
Pros
- the number of frames allocated to a process remains constant, giving stable performance
Cons
- if frame allocation is not enough, process is hindered.
Thrashing is limited to one process, but single process can hog I/O and degrade performance of other processes.
Global replacement
Victim page can be selected among any page.
If an algorithm follows global replacement:
Pros
- allows self-adjustment between processes
Cons
- badly behaved processes can affect others
- the number of frames allocated to a process differs from run to run
Thrashing can cause other processes to thrash by stealing pages from other processes (cascading thrashing)
Working Set Model
Observation
The set of pages referenced by a process is relatively constant in a period of time (locality).
Thus, we can consider that:
- on entering a new locality, a process will page fault for the set of pages in that locality
- once that set of pages is in frames, there will be minimal page faults until the process transits to a new locality
The model defines a working set window $\Delta$ (an interval of time); enough frames are allocated to hold the pages referenced within the window, reducing the possibility of page faults.
The Working Set Window
The accuracy of the model is directly affected by the choice of interval
- if it is too small, it misses pages in the current locality
- if it is too big, it contains pages from different locality
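The working set at a point in the reference string can be computed directly (a sketch; here the window `delta` is measured in number of references rather than wall-clock time, and the reference string is a made-up example with a locality transition in the middle):

```python
def working_set(refs, t, delta):
    """Set of pages referenced in the window of the last `delta` references ending at t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])

refs = [1, 2, 1, 2, 1, 7, 7, 7, 5, 6, 5, 6]
print(working_set(refs, 4, 5))    # {1, 2}: the process's first locality
print(working_set(refs, 11, 5))   # {5, 6, 7}: pages after the transition
```

With a larger `delta`, the second call would also pick up pages 1 and 2 from the earlier locality, illustrating the "too big" case above; a `delta` of 1 or 2 would miss pages of the current locality, illustrating the "too small" case.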