Motivation
Physical memory is volatile, so persistent information must be kept on external storage.
Using that storage directly would mean dealing with the details of the physical media. The file system acts as an abstraction over it.
A file system provides
- abstraction on top of physical media
- high level resource management
- protection between processes and users
- sharing between processes and users
Criteria
Self-contained
The information stored on a medium is enough to describe its entire organisation.
Effectively, the file system should be ‘plug-and-play’: it should work automagically.
Persistent
The file system should persist beyond the lifetimes of the OS and of processes.
Efficient
The file system should manage free and used space well, with minimal bookkeeping overhead.
Memory Management vs File System
Aspect | Memory Management | File System Management |
---|---|---|
Underlying storage | RAM | Disk |
Access Speed | Constant | Variable disk I/O time |
Unit of Addressing | Physical memory address | Disk sector |
Usage | Address space for process (Implicit access) | Non-volatile data (Explicit access) |
Organisation | Paging/segmentation | Many different systems (ext*, FAT*, HFS) |
File System Abstraction
File
Logical unit of information
A file acts as an Abstract Data Type: an abstraction over data that allows a set of common operations, with various possible implementations.
A file contains:
- data
- metadata
File Metadata
Field | Description |
---|---|
Name | Human readable reference |
Identifier | Unique ID used internally by FS |
Type | Indicates the type of file |
Size | Current size |
Protection | Access permissions |
Time, date and owner information | Creation time, last modification time, owner |
Table of contents | Information on how to access the file's data |
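Most of these fields can be inspected from a program. The following is a minimal sketch, assuming a Unix-like system where the internal identifier corresponds to the inode number; the helper names `file_kind` and `describe` and the file `notes.txt` are made up for illustration.

```python
import os
import stat
import tempfile
import time


def file_kind(mode):
    """Map the mode bits to one of the broad file types."""
    if stat.S_ISREG(mode):
        return "regular"
    if stat.S_ISDIR(mode):
        return "directory"
    return "special"


def describe(path):
    """Collect a few metadata fields for `path` using os.stat."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),              # human-readable reference
        "identifier": info.st_ino,                   # inode number on Unix-like systems
        "type": file_kind(info.st_mode),
        "size": info.st_size,                        # current size in bytes
        "protection": stat.filemode(info.st_mode),   # e.g. "-rw-r--r--"
        "modified": time.ctime(info.st_mtime),       # last modification time
        "owner": info.st_uid,                        # numeric user id of the owner
    }


with tempfile.TemporaryDirectory() as tmp:
    example_path = os.path.join(tmp, "notes.txt")
    with open(example_path, "w") as f:
        f.write("hello")
    metadata = describe(example_path)
    for field, value in metadata.items():
        print(f"{field}: {value}")
```

The "table of contents" field has no direct counterpart here, since how the data blocks are located is internal to the file system.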
File Name
Different file systems have different naming rules that determine which file names are valid.
Windows
In Windows, the extension in the file name indicates what kind of file it is (and which application opens it).
Unix
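In Unix, the file type is determined from information embedded in the file itself (for example, a magic number in the header) rather than from the name.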
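Identifying a file from its leading bytes can be sketched as follows. This is a simplified illustration, not how any particular tool is implemented; the `SIGNATURES` table lists only a few well-known signatures, and `guess_type` is a hypothetical helper.

```python
import os
import tempfile

# A few well-known leading byte signatures ("magic numbers").
SIGNATURES = {
    b"\x7fELF": "ELF executable",
    b"%PDF": "PDF document",
    b"\x89PNG": "PNG image",
    b"#!": "script with interpreter line",
}


def guess_type(path):
    """Guess the file type from its first bytes, ignoring the file name."""
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, label in SIGNATURES.items():
        if head.startswith(magic):
            return label
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    # The extension is deliberately misleading: the content is a PNG header.
    misleading = os.path.join(tmp, "picture.txt")
    with open(misleading, "wb") as f:
        f.write(b"\x89PNG\r\n\x1a\n" + b"\x00" * 16)
    guessed = guess_type(misleading)
    print(guessed)
```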
File Type
Operating systems support a number of file types, such as
- regular files
- directories
- special files
Each file type has an associated set of operations.
Regular files could be either
- ASCII files (text files) that can be displayed or printed as is
- Binary files that have a predefined/internal structure
File Protection
Types of access:
- Read (retrieve info)
- Write (write/rewrite)
- Execute (load file into memory and run it)
- Append (add new information)
- Delete (remove file from FS)
- List (read metadata)
A common scheme for this is Role-Based Access Control (RBAC).
In Unix, users are classified into three classes (Owner, Group, Universe), and permissions are defined for each of these classes.
The access control list (ACL) of a file can be seen using getfacl, which shows further information on the file.
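The three Unix permission classes can be read back from a file's mode bits. Below is a minimal sketch, assuming a Unix-like system; `class_permissions` and the file name are hypothetical, and only the basic permission bits are shown, not extended ACL entries.

```python
import os
import stat
import tempfile


def class_permissions(mode):
    """Split the mode bits into rwx strings for the three classes."""
    text = stat.filemode(mode)  # e.g. "-rw-r-----"
    return {"owner": text[1:4], "group": text[4:7], "universe": text[7:10]}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")
    with open(path, "w") as f:
        f.write("confidential")
    # Owner may read and write, group may read, everyone else has no access.
    os.chmod(path, 0o640)
    perms = class_permissions(os.stat(path).st_mode)
    print(perms)
```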
Operations on File Metadata
Operations include
- renaming
- changing attributes
- reading attributes
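These three metadata operations map onto standard library calls. The following is a small sketch with made-up file names and an arbitrary timestamp.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    old_path = os.path.join(tmp, "draft.txt")
    new_path = os.path.join(tmp, "final.txt")
    with open(old_path, "w") as f:
        f.write("content")

    # Renaming: only the name changes, the data is untouched.
    os.rename(old_path, new_path)
    # Changing attributes: set access and modification times (seconds since epoch).
    os.utime(new_path, (1_000_000_000, 1_000_000_000))
    # Reading attributes.
    attrs = os.stat(new_path)

    renamed = os.path.exists(new_path) and not os.path.exists(old_path)
    mtime = int(attrs.st_mtime)
    size = attrs.st_size
    print(renamed, mtime, size)
```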
File Data
A file's data can be structured as
- an array of bytes (each byte with a unique offset)
- fixed-length records (any record can be located easily from its index)
- variable-length records (flexible, but records are harder to locate)
The file itself can be accessed
- sequentially (read in order from the beginning, with no skipping)
- randomly (read in any order)
- directly (random access to any record)
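Direct access to fixed-length records works because the offset of record i is simply i times the record size. Below is a minimal sketch, assuming a made-up record layout of a 4-byte id followed by a 16-byte name; `RECORD`, `write_records` and `read_record` are illustrative names.

```python
import os
import struct
import tempfile

# Hypothetical record layout: 4-byte unsigned id + 16-byte name (20 bytes total).
RECORD = struct.Struct("<I16s")


def write_records(path, rows):
    """Write (id, name) pairs sequentially as fixed-length records."""
    with open(path, "wb") as f:
        for rec_id, name in rows:
            f.write(RECORD.pack(rec_id, name.encode()))  # name is zero-padded


def read_record(path, index):
    """Jump straight to record `index` without reading the ones before it."""
    with open(path, "rb") as f:
        f.seek(index * RECORD.size)
        rec_id, raw = RECORD.unpack(f.read(RECORD.size))
    return rec_id, raw.rstrip(b"\x00").decode()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "people.dat")
    write_records(path, [(10, "alice"), (20, "bob"), (30, "carol")])
    third = read_record(path, 2)
    print(third)
```

With variable-length records the same jump is not possible without an index or a scan, which is the trade-off noted above.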
Operations on File Data
Operations on file data include
- create
- open
- read
- write
- reposition (seek)
- truncate
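These operations can be tried through Python's `os` module, which provides thin wrappers over the corresponding system calls. This is a short sketch with a made-up file name and contents.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")

    # create + open: returns a file descriptor
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    # write: advances the file pointer by the number of bytes written
    os.write(fd, b"hello world")
    # reposition: move the file pointer to byte offset 6
    os.lseek(fd, 6, os.SEEK_SET)
    # read: returns up to 5 bytes starting at the file pointer
    tail = os.read(fd, 5)
    # truncate: keep only the first 5 bytes
    os.ftruncate(fd, 5)
    os.lseek(fd, 0, os.SEEK_SET)
    remaining = os.read(fd, 100)
    os.close(fd)

    print(tail, remaining)
```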
As System Calls
The OS provides file operations as system calls so that it can enforce protection, support concurrent and efficient access, and maintain the necessary bookkeeping information.
For an open file, the OS keeps track of
- the file pointer: the current location in the file
- the disk location: where the file actually resides on disk
- the open count: how many processes have this file open
Note that:
- several processes can open the same file
- several different files can be opened at any time
How can we organise the open-file information?
Two approaches
- System-wide open-file table: One entry per unique file
- Per-process open-file table: Each entry points to an entry in the system-wide table
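The effect of descriptors pointing into a shared table can be observed from user space. The following is a minimal sketch, assuming a POSIX-like system in which the current offset is kept in the shared entry a descriptor refers to: two separate `os.open` calls get independent offsets, while `os.dup` creates a second descriptor that refers to the same entry. Details differ between operating systems, so treat this as an illustration rather than a specification.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "shared.txt")
    with open(path, "wb") as f:
        f.write(b"abcdefgh")

    fd_a = os.open(path, os.O_RDONLY)
    fd_b = os.open(path, os.O_RDONLY)  # separate open: its own file pointer
    fd_c = os.dup(fd_a)                # duplicate: shares fd_a's file pointer

    first_a = os.read(fd_a, 3)  # reads from offset 0, moves the shared pointer to 3
    first_b = os.read(fd_b, 3)  # unaffected by fd_a, also reads from offset 0
    first_c = os.read(fd_c, 3)  # continues where fd_a stopped

    for fd in (fd_a, fd_b, fd_c):
        os.close(fd)

    print(first_a, first_b, first_c)
```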
What happens when two processes use the same file descriptor?