Section 14: The UNIX File System
Most UNIX machines store their files on magnetic disk drives. A disk
drive is a device that can store information by making electrical
imprints on a magnetic surface. One or more heads skim close to the
spinning magnetic plate, and can detect, or change, the magnetic state
of a given spot on the disk. The drives use disk controllers to
position the head at the correct place at the correct time to read
from, or write to, the magnetic surface of the plate. It is often
possible to partition a single disk drive into more than one logical
storage area. This section describes how the UNIX operating system
deals with a raw storage device like a disk drive, and how it manages
to make organized use of the space.
How the UNIX file system works
Every item in a UNIX file system can be defined as belonging to one of
four possible types:
- Ordinary files
- Ordinary files can contain text, data, or program information. An
ordinary file cannot contain another file, or directory. An ordinary
file can be thought of as a one-dimensional array of bytes.
- Directories
- In a previous section, we described directories as containers that
can hold files, and other directories. A directory is actually
implemented as a file that has one line for each item contained within
the directory. Each line in a directory file contains only the name
of the item, and a numerical reference to the location of the item.
The reference is called an i-number, and is an index to a table
known as the i-list. The i-list is a complete list of all the
storage space available to the file system.
- Special files
- Special files represent input/output (i/o) devices, like a tty
(terminal), a disk drive, or a printer. Because UNIX treats such
devices as files, a degree of compatibility can be achieved between
device i/o, and ordinary file i/o, allowing for the more efficient use
of software. Special files can be either character special
files, that deal with streams of characters, or block special
files, that operate on larger blocks of data. Typical block sizes
are 512 bytes, 1024 bytes, and 2048 bytes.
- Links
- A link is a pointer to another file. Remember that a directory is
nothing more than a list of the names and i-numbers of files. A
directory entry can be a hard link, in which the i-number
points directly to another file. A hard link to a file is
indistinguishable from the file itself. When a hard link is made,
then the i-numbers of two different directory file entries point to
the same i-node. Because an i-number is meaningful only within its own
i-list, hard links cannot span across file systems. A soft link (or
symbolic link) provides an
indirect pointer to a file. A soft link is implemented as a directory
file entry containing a pathname. Soft links are distinguishable from
files, and can span across file systems. Not all versions of UNIX
support soft links.
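The four item types, and the difference between hard and soft links,
can be observed from a program. The following is a minimal sketch in
Python, assuming a system that supports both kinds of link; the file
names original.txt, hard.txt and soft.txt are made up for the
illustration. It reports the type, i-number, and link count of each
directory entry it creates.

```python
import os
import stat
import tempfile


def classify(path):
    """Return the type of `path` without following symbolic links."""
    mode = os.lstat(path).st_mode
    if stat.S_ISLNK(mode):
        return "soft link"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special file"
    if stat.S_ISBLK(mode):
        return "block special file"
    if stat.S_ISREG(mode):
        return "ordinary file"
    return "other"


def link_demo(workdir):
    """Create an ordinary file, a hard link and a soft link in `workdir`.

    Returns a dict mapping each (hypothetical) file name to a tuple of
    (type, i-number, link count).
    """
    original = os.path.join(workdir, "original.txt")
    hard = os.path.join(workdir, "hard.txt")
    soft = os.path.join(workdir, "soft.txt")

    with open(original, "w") as f:
        f.write("hello\n")
    os.link(original, hard)     # second directory entry, same i-number
    os.symlink(original, soft)  # separate entry that stores a pathname

    report = {}
    for path in (original, hard, soft):
        info = os.lstat(path)
        report[os.path.basename(path)] = (
            classify(path), info.st_ino, info.st_nlink)
    return report


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for name, (kind, ino, nlink) in link_demo(d).items():
            print(f"{name:14} {kind:14} i-number={ino} links={nlink}")
        print("/dev/null is a", classify("/dev/null"))
```

The hard link should show the same i-number as the original, and a
link count of 2, while the soft link shows an i-number of its own.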
The I-List
When we speak of a UNIX file system, we are actually referring to an
area of physical storage represented by a single i-list. A UNIX
machine may be connected to several file systems, each with its own
i-list. One of those i-lists points to a special storage area, known
as the root file system. The root file system contains the
files for the operating system itself, and must be available at all
times. Other file systems are removable. Removable file systems can
be attached, or mounted, to the root file system. Typically,
an empty directory is created on the root file system as a mount
point, and a removable file system is attached there. When you issue
a cd command to access the files and directories of a mounted
removable file system, your file operations will be controlled through
the i-list of the removable file system.
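One way to see where one file system ends and another begins is to
watch the device number reported for each directory. The sketch below
is an illustration only: the function name find_mount_point is made up,
and the approach assumes that every mounted file system reports a
distinct device number, which does not hold for every kind of mount.

```python
import os


def find_mount_point(path):
    """Walk upward from `path` until the device number changes.

    Each mounted file system normally has its own device number
    (st_dev), so the highest directory that still shares the starting
    device is taken to be the mount point.
    """
    path = os.path.realpath(path)
    device = os.stat(path).st_dev
    while True:
        parent = os.path.dirname(path)
        if parent == path:
            # Reached the root directory; it cannot go any higher.
            return path
        if os.stat(parent).st_dev != device:
            # The parent directory lives on a different file system.
            return path
        path = parent


if __name__ == "__main__":
    print("Current directory is on the file system mounted at",
          find_mount_point(os.getcwd()))
```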
The purpose of the i-list is to provide the operating system with a
map into the memory of some physical storage device. The map is
continually being revised, as the files are created and removed, and
as they shrink and grow in size. Thus, the mechanism of mapping must
be very flexible to accommodate drastic changes in the number and size
of files. The i-list is stored in a known location, on the same
memory storage device that it maps.
Each entry in an i-list is called an i-node. An i-node is a
complex structure that provides the necessary flexibility to track the
changing file system. The i-nodes contain the information necessary to
get information from the storage device, which typically communicates
in fixed-size disk blocks. In the classic layout described here, an
i-node contains 10 direct pointers, which point to disk blocks on the
storage device (the exact count varies between implementations). In
addition, each i-node also contains one indirect pointer, one
double indirect pointer, and one triple indirect
pointer. The indirect pointer points to a block of direct
pointers. The double indirect pointer points to a block of indirect
pointers, and the triple indirect pointer points to a block of double
indirect pointers. By structuring the pointers in a geometric
fashion, a single i-node can represent a very large file.
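To get a feel for how quickly the geometric structure grows, the
arithmetic can be worked through in a few lines of Python. This is a
sketch under stated assumptions: the 10 direct pointers come from the
description above, while the 1024-byte block size and the 4-byte
pointer size are chosen only for the illustration, and the function
names are made up.

```python
def pointers_per_block(block_size, pointer_size=4):
    """Number of pointers that fit in one disk block (assumed sizes)."""
    return block_size // pointer_size


def max_file_blocks(block_size, direct=10, pointer_size=4):
    """Data blocks reachable from one i-node.

    direct pointers + one indirect + one double indirect + one triple
    indirect pointer.
    """
    n = pointers_per_block(block_size, pointer_size)
    return direct + n + n ** 2 + n ** 3


def pointer_level(block_index, block_size, direct=10, pointer_size=4):
    """Name the pointer that reaches logical block `block_index` (0-based)."""
    n = pointers_per_block(block_size, pointer_size)
    levels = [
        ("direct", direct),
        ("indirect", n),
        ("double indirect", n ** 2),
        ("triple indirect", n ** 3),
    ]
    for name, count in levels:
        if block_index < count:
            return name
        block_index -= count
    raise ValueError("block index is beyond the reach of one i-node")


if __name__ == "__main__":
    blocks = max_file_blocks(1024)
    print(blocks, "blocks")            # 16843018 blocks
    print(blocks * 1024, "bytes")      # 17247250432 bytes, about 16 GiB
    print(pointer_level(9, 1024))      # direct
    print(pointer_level(10, 1024))     # indirect
    print(pointer_level(266, 1024))    # double indirect
```

With these assumed sizes, almost all of the capacity comes from the
triple indirect pointer, while small files never need more than the
direct pointers.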
It now makes a little more sense to view a UNIX directory as a list of
i-numbers, each i-number referencing a specific i-node on a specific
i-list. The operating system traces its way through a file path by
following the i-nodes until it reaches the direct pointers that
contain the actual location of the file on the storage device.
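The name-to-i-number view of a directory can be inspected from Python.
The sketch below is illustrative only: directory_entries and trace_path
are made-up helper names, and the i-numbers are obtained by asking the
operating system about each name rather than by reading the raw
directory file. Keep in mind that an i-number is unique only within
its own file system.

```python
import os


def directory_entries(directory):
    """Return {name: i-number} for every item in `directory`."""
    return {
        name: os.lstat(os.path.join(directory, name)).st_ino
        for name in os.listdir(directory)
    }


def trace_path(path):
    """List (component, i-number) pairs from the root down to `path`."""
    path = os.path.abspath(path)
    steps = [("/", os.lstat("/").st_ino)]
    current = "/"
    for name in path.split("/"):
        if not name:
            continue  # skip the empty pieces around leading slashes
        current = os.path.join(current, name)
        steps.append((name, os.lstat(current).st_ino))
    return steps


if __name__ == "__main__":
    for name, ino in trace_path(os.getcwd()):
        print(f"{ino:>12}  {name}")
```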
The file system table
Each file system that is mounted on a UNIX machine is accessed through
its own block special file. Information about each of these block
special files is kept in a system database called the file system
table, usually located in /etc/fstab. It includes information
about the name of the device, the directory name under which it will
be mounted, and the read and write privileges for the device. It is
possible to mount a file system as "read-only," to prevent users from
changing anything.
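As an illustration of the kind of information the table holds, here is
a small parser sketch. It assumes the widespread whitespace-separated
layout of device, directory, type, and options; the exact format
differs between UNIX versions, and the device names in SAMPLE_TABLE are
invented for the example, so check the fstab manual page on your own
system.

```python
# Hypothetical table contents, invented for this example.
SAMPLE_TABLE = """\
# device          directory  type  options
/dev/dsk/c0t6d0   /          hfs   rw   0 1
/dev/dsk/c0t5d0   /users     hfs   rw   0 2
/dev/dsk/c0t2d0   /cdrom     cdfs  ro   0 0
"""


def parse_fstab(text):
    """Parse file system table text into a list of dicts.

    Assumes the first four fields are device, mount directory, file
    system type, and a comma-separated option list.
    """
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # not enough fields for the layout assumed here
        device, mount_point, fs_type, options = fields[:4]
        option_list = options.split(",")
        entries.append({
            "device": device,
            "mount_point": mount_point,
            "type": fs_type,
            "options": option_list,
            "read_only": "ro" in option_list,
        })
    return entries


if __name__ == "__main__":
    for entry in parse_fstab(SAMPLE_TABLE):
        access = "read-only" if entry["read_only"] else "read-write"
        print(entry["device"], "on", entry["mount_point"], access)
```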
File system quotas
Although not originally part of the UNIX file system, quotas quickly
became a widely used tool. Quotas allow the system administrator to
limit both the amount of space, and the number of files, that a user
can take. The limit can be a soft limit, where only a warning is
generated, or a hard limit, where no further operations that create
files will be allowed.
The command
- quota
will let you know if you're over your soft limit. Adding the -v option
will provide statistics about your disk usage.
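The difference between the two kinds of limit can be modelled in a few
lines. This is a toy model, not how any particular quota system is
implemented: the function name check_quota and the block counts are
made up, and details that real systems add, such as a grace period for
staying over the soft limit, are left out.

```python
def check_quota(used, requested, soft_limit, hard_limit):
    """Decide what happens when a user asks for `requested` more blocks.

    Returns "ok", "warning" (over the soft limit, still allowed), or
    "denied" (the request would exceed the hard limit).
    """
    total = used + requested
    if total > hard_limit:
        return "denied"
    if total > soft_limit:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Hypothetical limits: soft limit 1000 blocks, hard limit 1200 blocks.
    print(check_quota(900, 50, 1000, 1200))    # ok
    print(check_quota(900, 200, 1000, 1200))   # warning
    print(check_quota(900, 400, 1000, 1200))   # denied
```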
File system related commands
Here are some commands related to file system usage, and other topics
discussed in this section:
- bdf
- On HP-UX systems, reports file system usage statistics
- df
- Reports on free disk blocks, and i-nodes
- du
- Summarizes disk usage in a specified directory hierarchy
- ln
- Creates a hard link (default), or a soft link (with -s option)
- mount, umount
- Attaches, or detaches, a file system (super user only)
- mkfs
- Constructs a new file system (super user only)
- fsck
- Evaluates the integrity of a file system (super user only)
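The figures that df and du report can also be obtained from a program.
The following Python sketch is only an approximation of those
utilities, with made-up function names: free_space reads the block and
i-node counts of one file system, and disk_usage adds up file sizes in
bytes, whereas the real du normally counts allocated disk blocks.

```python
import os
import stat


def free_space(path="/"):
    """Report block and i-node counts for the file system holding `path`."""
    vfs = os.statvfs(path)
    return {
        "block_size": vfs.f_frsize,
        "total_blocks": vfs.f_blocks,
        "free_blocks": vfs.f_bfree,
        "total_inodes": vfs.f_files,
        "free_inodes": vfs.f_ffree,
    }


def disk_usage(top):
    """Sum the sizes, in bytes, of the ordinary files below `top`.

    A file reached through several hard links is counted once, by
    remembering the (device, i-number) pairs already seen.
    """
    seen = set()
    total = 0
    for directory, _subdirs, files in os.walk(top):
        for name in files:
            info = os.lstat(os.path.join(directory, name))
            if not stat.S_ISREG(info.st_mode):
                continue  # skip soft links and special files
            key = (info.st_dev, info.st_ino)
            if key in seen:
                continue
            seen.add(key)
            total += info.st_size
    return total


if __name__ == "__main__":
    print(free_space("/"))
    print(disk_usage(os.getcwd()), "bytes in ordinary files here")
```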
A brief tour of the UNIX filesystem
The actual locations and names of certain system configuration files
will differ under different implementations of UNIX. Here are some
examples of important files and directories under version 9 of the
HP-UX operating system:
- /stand/vmunix
- The kernel program
- /dev/
- Where special files are kept
- /usr/bin/
- Executable system utilities, like sh, cp, rm
- /etc/
- System configuration files and databases
- /usr/lib/
- Operating system and programming libraries
- /tmp/
- System scratch files (all users can write here). As a courtesy to others, please remove your files from this directory as soon as you are finished with them.
- /lost+found/
- Where the file system checker puts detached files
- /usr/include/
- Standard system header files
- /usr/local/
- Typically a place where local utilities go
- /usr/man
- The manual pages are kept here
Other places to look for useful stuff
If you get an account on an unfamiliar UNIX system, take a tour of the
directories listed above, and familiarize yourself with their
contents. Another way to find out what is available is to look at the
contents of your PATH environment variable:
- echo $PATH
You can use the ls command to list the contents of
each directory in your path, and the man command to get help on
unfamiliar utilities. A good systems administrator will ensure that
manual pages are provided for the utilities installed on the system.
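The same tour of your path can be automated. The sketch below is one
possible approach, with a made-up function name: it splits the PATH
value into directories and lists the ordinary files in each one that
have an execute permission bit set. Empty PATH entries, which shells
usually treat as the current directory, are skipped here.

```python
import os


def executables_on_path(path_value=None):
    """Return {directory: [executable names]} for each PATH directory."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    found = {}
    for directory in path_value.split(os.pathsep):
        if not directory or not os.path.isdir(directory):
            continue  # skip empty entries and missing directories
        try:
            names = sorted(os.listdir(directory))
        except OSError:
            continue  # directory is not readable by this user
        programs = []
        for name in names:
            full = os.path.join(directory, name)
            # Keep ordinary files with at least one execute bit set.
            if os.path.isfile(full) and os.stat(full).st_mode & 0o111:
                programs.append(name)
        found[directory] = programs
    return found


if __name__ == "__main__":
    for directory, programs in executables_on_path().items():
        print(f"{directory}: {len(programs)} utilities")
```

Any name in the output can then be looked up with the man command.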