Previously, two assumptions were made:
- each process occupies a contiguous memory region ❌
- physical memory is large enough to contain one or more processes with complete memory space ❌
If both assumptions are dropped, a few issues arise:
- what if the logical memory space of process is more than physical memory?
- what if the same program is executed on a computer with less physical memory?
General Idea
Effectively, virtual memory is needed when the logical memory space exceeds the available physical memory.
Generally, the logical address space is split into small chunks: some chunks reside in physical memory, while the others are stored on secondary storage.
Pros
- completely separates logical memory addresses from physical memory
- allows for more efficient use of physical memory
- allows more processes to reside in memory, improving CPU utilisation
Extended Paging Scheme
Two page types are added:
- memory resident (page is in physical memory)
- non-memory resident (page is in secondary storage)
A bit is stored in each page table entry to discern whether the page is in physical memory.
Page fault
A page fault occurs when the CPU tries to access a non-memory-resident page.
The CPU can only access memory-resident pages, so a non-memory-resident page has to be brought into physical memory before it can be accessed.
Steps to access page
1. Check page table
2. If the page is memory resident, access the physical memory location
3. Otherwise, page fault: trap to OS
	1. Locate the page in secondary storage
	2. Load the page into physical memory
	3. Update the page table
	4. Go back to step 1 to retry the access
```mermaid
flowchart TD
    step1(Check page table) --> step2{Is page memory resident}
    step2 -->|yes| finish(Access physical memory location)
    step2 -->|no| step3(Page fault: trap to OS)
    step3 --> step4(Locate page in secondary storage)
    step4 --> step5(Load page into physical memory)
    step5 --> step6(Update page table)
    step6 --> step1
```
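The access steps above can be sketched as a small simulation. This is a hedged Python sketch, not from the notes: the page-table layout, the backing-store dict, and the assumption that a free frame always exists are all illustrative.

```python
# Minimal sketch of page access with a resident bit in each page table entry.
SECONDARY_STORAGE = {0: "page0-data", 1: "page1-data"}  # hypothetical backing store

page_table = {0: {"resident": False, "frame": None},
              1: {"resident": False, "frame": None}}
physical_memory = {}   # frame number -> page data
next_free_frame = 0

def access(page_num):
    """Return the data for page_num, handling a page fault if needed."""
    global next_free_frame
    entry = page_table[page_num]
    if not entry["resident"]:                 # page fault: trap to OS
        data = SECONDARY_STORAGE[page_num]    # locate page in secondary storage
        frame = next_free_frame               # assumes a free frame exists
        next_free_frame += 1
        physical_memory[frame] = data         # load page into physical memory
        entry["resident"] = True              # update page table
        entry["frame"] = frame
    return physical_memory[entry["frame"]]    # access physical memory location

print(access(0))  # first access faults, then loads the page
print(access(0))  # second access hits: page is now memory resident
```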
Thrashing
Thrashing occurs when most memory accesses result in page faults.
Locality Principles
How do we know that thrashing is unlikely to happen?
How can we confirm that after a page is loaded, it is likely to be used for future accesses?
Most programs exhibit locality behaviour: most of the time is spent in a relatively small part of the code, and within a time period, accesses are made to only a relatively small part of the data.
Temporal locality
Memory address likely to be used again
Thus, to exploit this principle, we keep a page loaded after it has been used, because it is likely to be accessed again in the near future.
Spatial locality
Memory addresses close to a used address are likely to be used.
Thus, to exploit this principle, we can also load the pages contiguous to a used page, as they are likely to be accessed in the near future.
Demand Paging
When a new process is launched, should all text (instruction), data (global vars) pages be allocated in RAM?
The considerations are as follows:
- there is a large startup cost if there are a large number of pages to allocate and initialise
- we need a way to find out which pages to load if we don’t load all the pages
- footprint of process in physical memory should be minimised to allow more processes to be memory resident
General Idea
The page is only loaded when there is a page fault, and the process starts with no memory resident page.
Pros
- fast startup time for new process
- small memory footprint
Cons
- process can appear sluggish at start, due to multiple page faults
- cascading effect of page faults on other processes
Page Table Structure
Page table information is kept with the process information and takes up physical memory space. Modern computer systems provide a huge logical address space, which requires a huge number of pages. Each page requires a page table entry, which in turn results in a huge page table.
Problems with huge page table
- high overhead
- fragmented page table (as the page table is so big, it will have to occupy multiple memory pages)
Direct Paging
Keep all entries in single table
Given $2^P$ pages in the logical memory space:
- $P$ bits are needed to specify one unique page
- $2^P$ page table entries are needed, each with a physical frame number and additional information bits
32-bit system
If the page size is 4KiB, which is $2^{12}$ bytes, the page number takes $32 - 12 = 20$ bits:
- every page table entry is 4 bytes
- the page table size is $2^{20} \times 4$ bytes = 4MiB
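The arithmetic above can be checked with a short computation (numbers are from the 32-bit example; the variable names are mine):

```python
# Direct paging overhead on a 32-bit system with 4 KiB pages.
address_bits = 32
page_offset_bits = 12                # 4 KiB page = 2^12 bytes
pte_size = 4                         # bytes per page table entry

page_number_bits = address_bits - page_offset_bits   # 20 bits
num_entries = 2 ** page_number_bits                  # 2^20 entries
table_size = num_entries * pte_size                  # bytes

print(page_number_bits)              # 20
print(table_size // (1024 * 1024))   # 4 (MiB)
```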
2-Level Paging
Note that the process may not use the entire virtual memory space, and thus a full page table is a waste of space.
Split full page table into regions
In 2-Level paging:
- the full page table is split into regions (similar to splitting logical memory space into pages)
- only a few regions are used, and new regions are allocated dynamically
- a directory is used to keep track of the regions
Thus, it is implemented by splitting the page table into smaller page tables, each identified by a page table number.
Given a page table with $2^P$ entries:
- splitting into $2^M$ smaller page tables, $M$ bits are needed to identify one smaller page table
- each smaller page table contains $2^{P-M}$ entries
- a single page directory is needed to keep track of the smaller page tables, with $2^M$ indices
Pros
- there can be empty entries in the page directory
- corresponding pages need not be allocated
- saves space
- less overhead
- overhead is just the page directory, plus the allocated page tables
32-bit system
With the direct paging scheme, where the page size is 4KiB ($2^{12}$ bytes):
- every page table entry is 4 bytes
- the page table size is $2^{20} \times 4$ bytes = 4MiB
With the 2-level paging scheme:
- with the page directory having $2^9$ entries,
	- the page directory has an overhead of $2^9 \times 4$ bytes = 2KiB
- with the page tables thus having $2^{11}$ entries each, and only 3 page tables in use,
	- the page tables have a total overhead of $3 \times 2^{11} \times 4$ bytes = 24KiB
- the total overhead is 26KiB, significantly less.
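The overhead comparison can be reproduced with a quick computation. This assumes the 20 page-number bits are split 9 (directory) + 11 (page table), which is consistent with the 2KiB and 24KiB figures in the example:

```python
# 2-level paging overhead for a 32-bit system with 4 KiB pages,
# assuming a 9-bit directory index and an 11-bit page table index.
pte_size = 4                        # bytes per entry
directory_entries = 2 ** 9
page_table_entries = 2 ** 11
tables_in_use = 3                   # example: only 3 regions allocated

directory_overhead = directory_entries * pte_size                 # 2 KiB
tables_overhead = tables_in_use * page_table_entries * pte_size   # 24 KiB
total = directory_overhead + tables_overhead

print(directory_overhead // 1024)   # 2
print(tables_overhead // 1024)      # 24
print(total // 1024)                # 26
```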
Inverted Page Tables
As there is a page table for each process, the total page table overhead grows with the number of processes.
Keep a single mapping of physical frame to <pid, page #>
Instead of keeping a page table per process, a single table maps each physical frame to the process using it - each physical frame is uniquely identified by a process ID and page number pair.
In an inverted page table, the entries are ordered by frame number rather than page number, so looking up a page requires searching the whole table.
Pros
- one table for all processes (saves overhead)
Cons
- slow translation (due to searching the whole table)
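The lookup can be sketched as a linear search over frame entries (a hedged Python sketch; the table contents and function name are illustrative):

```python
# Sketch of an inverted page table: one entry per physical frame,
# each holding the (pid, page#) pair currently occupying that frame.
inverted_table = [
    (1, 7),   # frame 0 holds page 7 of process 1
    (2, 0),   # frame 1 holds page 0 of process 2
    (1, 3),   # frame 2 holds page 3 of process 1
]

def translate(pid, page_num):
    """Linear search for the frame holding (pid, page_num); None if not resident."""
    for frame, owner in enumerate(inverted_table):
        if owner == (pid, page_num):
            return frame
    return None   # not resident: page fault

print(translate(1, 3))   # 2
print(translate(2, 9))   # None -> page fault
```

The whole-table search is exactly why translation is slow compared to a per-process page table indexed directly by page number.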
Page Replacement Algorithms
When there is no more free physical memory frame during a page fault, a memory page needs to be freed before the requested page can be accessed and added to the page table.
When a page is evicted (freed), there are two possibilities:
- if the page is clean: it is not modified, and thus, there is no need to write back
- if the page is dirty: it is modified, and thus, there is a need to write back
For page replacement algorithms, only the page number is important; memory references are modeled as memory reference strings (sequences of page numbers).
Evaluation
$T_{access} = (1 - p) \times T_{mem} + p \times T_{fault}$
where:
- $p$ is the probability of a page fault
- $T_{mem}$ is the access time for a memory-resident page
- $T_{fault}$ is the access time if a page fault occurs
As $T_{fault} \gg T_{mem}$, $p$ has to be very low for the average access time to stay acceptable.
Thus, to evaluate an algorithm, we want one that reduces the total number of page faults.
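A quick computation shows how a tiny fault probability dominates the average access time. The timing numbers below are my own illustrative assumptions (memory access in the ~100ns range, a disk-backed fault in the ~10ms range), not figures from the notes:

```python
# Effective access time: T = (1 - p) * T_mem + p * T_fault.
T_mem = 100e-9     # 100 ns for a memory-resident access (assumed)
T_fault = 10e-3    # 10 ms if a page fault occurs (assumed, disk-backed)

def effective_access_time(p):
    """Average access time given page fault probability p."""
    return (1 - p) * T_mem + p * T_fault

print(effective_access_time(0.0))    # ~100 ns: pure memory speed
print(effective_access_time(0.001))  # one fault per thousand accesses already
                                     # pushes the average to ~10 microseconds
```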
Optimal Page Replacement
Replace page that will not be used again for the longest period of time
This guarantees the minimum number of page faults.
As this needs future knowledge of memory references, this is NOT realizable.
However, this is still useful as base of comparison for other algorithms. The closer the result is to one given by OPT, the better the algorithm.
FIFO Page Replacement
Evict oldest memory page
Implementation
- OS maintains a queue of resident page numbers
- Remove the first page in queue if replacement is needed
- Update queue during page fault trap
Pros
Simple to implement
Belady's anomaly
Given more physical frames, the number of page faults increases instead of decreasing.
Cons
- Belady’s anomaly
- does not exploit temporal locality
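Belady's anomaly can be demonstrated with a small FIFO simulator. The reference string below is the standard textbook example; the code is a sketch, not from the notes:

```python
from collections import deque

def fifo_faults(refs, num_frames):
    """Count page faults when running FIFO replacement over a reference string."""
    frames = deque()    # oldest page at the left
    resident = set()
    faults = 0
    for page in refs:
        if page not in resident:
            faults += 1
            if len(frames) == num_frames:          # no free frame: evict oldest
                resident.discard(frames.popleft())
            frames.append(page)
            resident.add(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))   # 9 faults
print(fifo_faults(refs, 4))   # 10 faults: more frames, yet more faults
```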
Least Recently Used
Make use of temporal locality, by replacing the page that has not been used in the longest time.
This algorithm expects that if a page is not reused in a short time window, it most likely will not be used again. (This approximates the OPT algorithm)
Pros
- does not suffer from Belady’s anomaly
Cons
- implementation needs substantial hardware support
- need to keep track of last access time
Possible implementation approaches
Approach A: Counter
Use a logical time counter, incremented on every memory reference. The page table entry contains a time-of-use field, which is updated every time the page is referenced. When a page needs to be freed, the page with the smallest time-of-use is replaced.
Cons
- Need to search through all pages
- Time-of-use constantly increasing, with a possible overflow
Approach B: Stack
A stack of page numbers is maintained, and when a page is referenced,
- the page is removed from the stack (if existing)
- page is pushed to top of stack
When a page needs to be freed,
- replace page at bottom of stack
Cons
- Entries can be removed from anywhere in the stack, so it is not a pure stack.
- Hard to implement in hardware
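The stack approach maps naturally onto an ordered dictionary in software (a sketch; `OrderedDict.move_to_end` plays the role of moving a referenced page to the top of the stack):

```python
from collections import OrderedDict

def lru_faults(refs, num_frames):
    """Count page faults under LRU; the OrderedDict acts as the 'stack'
    (most recently used page at the end, least recently used at the front)."""
    stack = OrderedDict()
    faults = 0
    for page in refs:
        if page in stack:
            stack.move_to_end(page)        # re-reference: move to top of stack
        else:
            faults += 1
            if len(stack) == num_frames:
                stack.popitem(last=False)  # evict least recently used (bottom)
            stack[page] = True
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(refs, 3))   # 10
print(lru_faults(refs, 4))   # 8: more frames, fewer faults (no Belady's anomaly)
```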
CLOCK
Modified FIFO to give a second chance to pages that are accessed
Each page table entry now maintains a reference bit, recording whether the page has been accessed since the bit was last cleared.
Algorithm
- Select oldest FIFO page
- If reference bit is 0, page is replaced
- Otherwise, page is given a 2nd chance, and reference bit is cleared to 0.
- Repeat Step 1, with the next oldest FIFO page
```mermaid
flowchart TD
    s1(select oldest FIFO page) --> s2{check reference bit}
    s2 -->|0| s2a(replace page)
    s2 -->|1| s3(clear reference bit of page to 0)
    s3 --> s4(select next oldest FIFO page)
    s4 --> s2
```
Cons
- degenerates into FIFO algorithm when reference bit is 1 for all pages
Frame Allocation
When there are $N$ physical memory frames and $M$ processes, what is the best way to distribute the frames?
Simple Approaches
Equal allocation
Each process gets $N / M$ frames.
Proportional allocation
Given that processes differ in size, each process $p$ gets $\frac{size_p}{\sum_i size_i} \times N$ frames.
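Both allocation schemes are simple to compute (a sketch; the frame count and process sizes below are made-up numbers, and the integer division ignores rounding of leftover frames):

```python
def equal_allocation(n_frames, n_procs):
    """Each process gets an equal share of the frames."""
    return [n_frames // n_procs] * n_procs

def proportional_allocation(n_frames, sizes):
    """Each process gets frames in proportion to its size."""
    total = sum(sizes)
    return [size * n_frames // total for size in sizes]

print(equal_allocation(64, 4))                     # [16, 16, 16, 16]
print(proportional_allocation(64, [10, 30, 120]))  # [4, 12, 48]
```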
Implicit Assumptions
The frame allocation approach implicitly determines the scope of page replacement: whether a victim frame is selected locally (from the faulting process only) or globally (from any process).
Local replacement
Victim page selected among pages of process causing page fault
If an algorithm follows local replacement:
Pros
- the number of frames allocated to a process remains constant, giving stable performance
Cons
- if frame allocation is not enough, process is hindered.
Thrashing is limited to one process, but single process can hog I/O and degrade performance of other processes.
Global replacement
Victim page can be selected among any page.
If an algorithm follows global replacement:
Pros
- allows self-adjustment between processes
Cons
- badly behaved processes can affect others
- the number of frames allocated to a process differs from run to run
Thrashing can cause other processes to thrash by stealing pages from other processes (cascading thrashing)
Working Set Model
Observation
The set of pages referenced by a process is relatively constant in a period of time (locality).
Thus, we can consider that:
- on entering a new locality, a process will page fault for the set of pages in that locality
- once that set of pages is in frames, there will be minimal page faults until the process transits to a new locality
The model defines a working set window $\Delta$ (an interval of time); enough frames are allocated to hold the pages referenced within the window, reducing the possibility of page faults.
The Working Set Window
The accuracy of the model is directly affected by the choice of interval
- if it is too small, it misses pages in the current locality
- if it is too big, it contains pages from different locality
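The working set at a point in the reference string can be computed directly (a sketch; here the window `delta` is measured in number of references rather than wall-clock time, and the reference string is a made-up example with a locality transition in the middle):

```python
def working_set(refs, t, delta):
    """Set of pages referenced in the window of the last `delta` references ending at t."""
    start = max(0, t - delta + 1)
    return set(refs[start:t + 1])

refs = [1, 2, 1, 2, 1, 7, 7, 7, 5, 6, 5, 6]
print(working_set(refs, 4, 5))    # {1, 2}: the process's first locality
print(working_set(refs, 11, 5))   # {5, 6, 7}: pages after the transition
```

With a larger `delta`, the second call would also pick up pages 1 and 2 from the earlier locality, illustrating the "too big" case above; a `delta` of 1 or 2 would miss pages of the current locality, illustrating the "too small" case.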