NOTE: CentOS Enterprise Linux is built from the Red Hat Enterprise Linux source code. Other than logo and name changes, CentOS Enterprise Linux is compatible with the equivalent Red Hat version. This document applies equally to both Red Hat and CentOS Enterprise Linux.
Once a mass storage device is in place, there is little that it
can be used for. True, data can be written to it and read back from
it, but without any underlying structure data access is only
possible by using sector addresses (either geometrical or logical).
What is needed are methods of making the raw storage a hard
drive provides more easily usable. The following sections explore
some commonly-used techniques for doing just that.
The first thing that often strikes a system administrator is
that the size of a hard drive may be much larger than necessary for
the task at hand. As a result, many operating systems have the
capability of dividing a hard drive's space into various partitions or slices.
Because they are separate from each other, partitions can have
different amounts of space utilized, and that space in no way
impacts the space utilized by other partitions. For example, the
partition holding the files comprising the operating system is not
affected even if the partition holding the users' files becomes
full. The operating system still has free space for its own use.
Although it is somewhat simplistic, you can think of partitions
as being similar to individual disk drives. In fact, some operating
systems actually refer to partitions as "drives". However, this
viewpoint is not entirely accurate; therefore, it is important that
we look at partitions more closely.
Partitions are defined by the following attributes:
Partition geometry
Partition type
Partition type field
These attributes are explored in more detail in the following
sections.
A partition's geometry refers to its physical placement on a
disk drive. The geometry can be specified in terms of starting and
ending cylinders, heads, and sectors, although most often
partitions start and end on cylinder boundaries. A partition's size
is then defined as the amount of storage between the starting and
ending cylinders.
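To make the size definition concrete, here is a minimal sketch in Python. It assumes a partition that starts and ends on cylinder boundaries, with both cylinders counted as part of the partition. The function name and the sample geometry (255 heads, 63 sectors per track, 512-byte sectors) are illustrative assumptions, not values taken from any particular drive.

```python
def partition_size_bytes(start_cylinder, end_cylinder, heads,
                         sectors_per_track, bytes_per_sector=512):
    """Return the size of a partition aligned on cylinder boundaries.

    Both cylinder numbers are inclusive. All parameters are hypothetical
    inputs describing the drive's reported geometry.
    """
    if end_cylinder < start_cylinder:
        raise ValueError("end_cylinder must not be less than start_cylinder")
    cylinders = end_cylinder - start_cylinder + 1
    # Each cylinder holds heads * sectors_per_track sectors.
    return cylinders * heads * sectors_per_track * bytes_per_sector


# Assumed example: cylinders 0 through 12 on a 255-head, 63-sector geometry.
size = partition_size_bytes(0, 12, heads=255, sectors_per_track=63)
print(size, "bytes")
```

Modern drives usually report a translated geometry, so a calculation like this is best read as an illustration of the definition rather than a statement about physical layout.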
The partition type refers to the partition's relationship with
the other partitions on the disk drive. There are three different
partition types:
Primary
Extended
Logical
The following sections describe each partition type.
Primary partitions are partitions that take up one of the four
primary partition slots in the disk drive's partition table.
Extended partitions were developed in response to the need for
more than four partitions per disk drive. An extended partition can
itself contain multiple partitions, greatly extending the number of
partitions possible on a single drive. The introduction of extended
partitions was driven by the ever-increasing capacities of new disk
drives.
Logical partitions are those partitions contained within an
extended partition; in terms of use, they are no different from a
non-extended primary partition.
Each partition has a type field that contains a code indicating
the partition's anticipated usage. The type field may or may not
reflect the computer's operating system. Instead, it may reflect
how data is to be stored within the partition. The following
section contains more information on this important point.
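The four primary slots and the per-partition type field can be seen directly in a partition table. The following Python sketch assumes the classic PC (MBR) layout, which this text does not name explicitly: four 16-byte entries starting at byte 446 of the first sector, followed by the signature bytes 0x55 0xAA. The function name, the short table of type codes, and the sample sector built at the end are illustrative assumptions.

```python
import struct

# A few well-known type codes; the real list is much longer.
KNOWN_TYPES = {
    0x05: "extended",
    0x0F: "extended (LBA)",
    0x82: "Linux swap",
    0x83: "Linux",
}

# One entry: status, 3-byte CHS start, type, 3-byte CHS end,
# starting sector (little-endian u32), sector count (little-endian u32).
ENTRY_FORMAT = "<B3xB3xII"


def read_partition_table(sector):
    """Return the non-empty primary slots found in a 512-byte MBR sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    entries = []
    for slot in range(4):  # only four primary slots exist
        status, type_code, first_lba, count = struct.unpack_from(
            ENTRY_FORMAT, sector, 446 + slot * 16)
        if type_code == 0x00:  # type 0 marks an unused slot
            continue
        entries.append({
            "slot": slot + 1,
            "bootable": status == 0x80,
            "type_code": type_code,
            "type_name": KNOWN_TYPES.get(type_code, "unknown"),
            "first_lba": first_lba,
            "sector_count": count,
        })
    return entries


# Build a made-up sector with one Linux partition and one extended partition.
sector = bytearray(512)
struct.pack_into(ENTRY_FORMAT, sector, 446, 0x80, 0x83, 2048, 204800)
struct.pack_into(ENTRY_FORMAT, sector, 462, 0x00, 0x05, 206848, 409600)
sector[510:512] = b"\x55\xaa"

for entry in read_partition_table(bytes(sector)):
    print(entry)
```

Logical partitions do not appear in this table; they are described by additional records stored inside the extended partition, which this sketch does not follow.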
Even with the proper mass storage device, properly configured,
and appropriately partitioned, we would still be unable to store
and retrieve information easily — we are missing a way of
structuring and organizing that information. What we need is a
file system.
The concept of a file system is so fundamental to the use of
mass storage devices that the average computer user often does not
even make the distinction between the two. However, system
administrators cannot afford to ignore file systems and their
impact on day-to-day work.
A file system is a method of representing data on a mass storage
device. File systems usually include the following features:
File-based data storage
Hierarchical directory (sometimes known as "folder") structure
Tracking of file creation, access, and modification times
Some level of control over the type of access allowed for a specific file
Some concept of file ownership
Accounting of space utilized
Not all file systems possess every one of these features. For
example, a file system constructed for a single-user operating
system could easily use a more simplified method of access control
and could conceivably do away with support for file ownership
altogether.
One point to keep in mind is that the file system used can have
a large impact on the nature of your daily workload. By ensuring
that the file system you use in your organization closely matches
your organization's functional requirements, you can ensure that
not only is the file system up to the task, but that it is more
easily and efficiently maintainable.
With this in mind, the following sections explore these features
in more detail.
While file systems that use the file metaphor for data storage
are so nearly universal as to be considered a given, there are
still some aspects that should be considered here.
The first is to be aware of any restrictions on file names. For
instance, what characters are permitted in a file name? What is the
maximum file name length? These questions are important, as the
answers dictate which file names can be used and which cannot.
Older operating systems with more primitive file systems often
allowed only alphanumeric characters (and only uppercase at that),
and only traditional 8.3 file names
(meaning an eight-character file name, followed by a
three-character file extension).
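As a small illustration of such a restriction, the sketch below checks a name against the strict rule described above: uppercase letters and digits only, up to eight characters, optionally followed by a dot and up to three more. The function name is hypothetical, and the rule is deliberately the simplified one from this text; actual systems of that era also accepted some punctuation characters.

```python
import re

# Up to 8 uppercase alphanumerics, then an optional extension of up to 3.
_EIGHT_DOT_THREE = re.compile(r"[A-Z0-9]{1,8}(\.[A-Z0-9]{1,3})?")


def is_valid_8_3_name(name):
    """Return True if name fits the simplified 8.3 rule."""
    return _EIGHT_DOT_THREE.fullmatch(name) is not None


for candidate in ("REPORT.TXT", "report.txt", "LONGFILENAME.TXT", "README"):
    print(candidate, is_valid_8_3_name(candidate))
```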
While the file systems used in some very old operating systems
did not include the concept of directories, all commonly-used file
systems today include this feature. Directories are themselves
usually implemented as files, meaning that no special utilities are
required to maintain them.
Furthermore, because directories are themselves files, and
directories contain files, directories can therefore contain other
directories, making a multi-level directory hierarchy possible.
This is a powerful concept with which all system administrators
should be thoroughly familiar. Using multi-level directory
hierarchies can make file management much easier for you and for
your users.
Most file systems keep track of the time at which a file was
created; some also track modification and access times. Over and
above the convenience of being able to determine when a given file
was created, accessed, or modified, these dates are vital for the
proper operation of incremental backups.
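The way an incremental backup depends on these times can be sketched in a few lines of Python. This is a simplified illustration, assuming that "changed since the last backup" is decided purely by comparing each file's modification time with the time of the previous backup; the function name and parameters are hypothetical, and real backup tools typically consider more than this.

```python
import os


def files_modified_since(root, last_backup_time):
    """Return paths under root modified after last_backup_time.

    last_backup_time is a timestamp in seconds since the epoch,
    such as the value returned by time.time() at the previous backup.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            # st_mtime is the file's last modification time.
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Calling `files_modified_since("/home", time.time() - 86400)`, for example, would list files under `/home` modified during the last day (the path is only an example, and `time` would need to be imported).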
More information on how backups make use of these file system
features can be found in Section
Access control is one area where file systems differ
dramatically. Some file systems have no clear-cut access control
model, while others are much more sophisticated. In general terms,
most modern-day file systems combine two components into a cohesive
access control methodology:
User identification
Permitted action list
User identification means that the file system (and the
underlying operating system) must first be capable of uniquely
identifying individual users. This makes it possible to have full
accountability with respect to any operations on the file system
level. Another often-helpful feature is that of user groups — creating ad-hoc collections of
users. Groups are most often used by organizations where users may
be members of one or more projects. Another feature that some file
systems support is the creation of generic identifiers that can be
assigned to one or more users.
Next, the file system must be capable of maintaining lists of
actions that are permitted (or not permitted) against each file.
The most commonly-tracked actions are:
Reading the file
Writing the file
Executing the file
Various file systems may extend the list to include other
actions such as deleting, or even the ability to make changes
related to a file's access control.
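As one concrete example of a permitted action list, the following Python sketch decodes the read, write, and execute bits of a POSIX-style file mode, the scheme used on Linux. The split into owner, group, and other is an assumption drawn from that scheme rather than from this text, and the function name is hypothetical.

```python
import stat

ACTIONS = ("read", "write", "execute")

# Permission bits for each class of user in a POSIX-style mode.
CLASSES = {
    "owner": (stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
    "group": (stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
    "other": (stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
}


def describe_permissions(mode):
    """Map each class of user to the list of actions the mode permits."""
    return {
        who: [action for action, bit in zip(ACTIONS, bits) if mode & bit]
        for who, bits in CLASSES.items()
    }


# 0o754: owner may do everything, group may read and execute, others may read.
print(describe_permissions(0o754))
```

For a real file, the mode can be obtained with `os.stat(path).st_mode`.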
One constant in a system administrator's life is that there is
never enough free space, and even if there is, it will not remain
free for long. Therefore, a system administrator should at least be
able to easily determine the level of free space available for each
file system. In addition, file systems with well-defined user
identification capabilities often include the capability to display
the amount of space a particular user has consumed.
This feature is vital in large multi-user environments, as it is
an unfortunate fact of life that the 80/20 rule often applies to
disk space — 20 percent of your users will be responsible for
consuming 80 percent of your available disk space. By making it
easy to determine which users are in that 20 percent, you can more
effectively manage your storage-related assets.
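Both kinds of accounting can be approximated from a script. The Python sketch below is a rough illustration, not a replacement for the file system's own accounting tools: it reports the free fraction of a file system and totals file sizes by owning user ID under a directory. The function names are hypothetical, and summing file sizes only approximates the space actually allocated on disk.

```python
import os
import shutil
from collections import defaultdict


def free_space_fraction(path):
    """Return the fraction of the file system holding path that is free."""
    usage = shutil.disk_usage(path)
    return usage.free / usage.total


def bytes_used_per_owner(root):
    """Total the sizes of files under root, keyed by owning user ID."""
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            # lstat avoids following symbolic links out of the tree.
            info = os.lstat(os.path.join(dirpath, filename))
            totals[info.st_uid] += info.st_size
    return dict(totals)


def top_consumers(totals, count=5):
    """Return the count largest (user ID, bytes) pairs, largest first."""
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:count]
```

Passing the result of `bytes_used_per_owner` to `top_consumers` gives a quick view of which users fall into the heavy-consumption group.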
Taking this a step further, some file systems include the
ability to set per-user limits (often known as disk quotas) on the amount of disk space that can
be consumed. The specifics vary from file system to file system,
but in general each user can be assigned a specific amount of
storage to use. Beyond that, various file systems
differ. Some file systems permit the user to exceed their limit for
one time only, while others implement a "grace period" during which
a second, higher limit is applied.
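The grace-period behavior can be modeled as two limits and a timer. The Python sketch below is a simplified model of that idea, not the implementation used by any particular file system; the class, field, and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    soft_limit: int     # bytes the user may use indefinitely
    hard_limit: int     # higher limit that applies during the grace period
    grace_seconds: int  # how long the user may stay above the soft limit


def write_allowed(quota, used_after_write, seconds_over_soft):
    """Decide whether a write leaving the user at used_after_write is allowed.

    seconds_over_soft is how long the user has already been above the
    soft limit (0 if currently at or below it).
    """
    if used_after_write > quota.hard_limit:
        return False  # the hard limit can never be exceeded
    if used_after_write <= quota.soft_limit:
        return True   # within the normal allocation
    # Between the two limits: allowed only while the grace period lasts.
    return seconds_over_soft <= quota.grace_seconds


quota = Quota(soft_limit=100, hard_limit=150, grace_seconds=3600)
print(write_allowed(quota, 120, seconds_over_soft=10))
print(write_allowed(quota, 120, seconds_over_soft=4000))
```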
Many system administrators give little thought to how the
storage they make available to users today is actually going to be
used tomorrow. However, a bit of thought spent on this matter
before handing over the storage to users can save a great deal of
unnecessary effort later on.
The main thing that system administrators can do is to use
directories and subdirectories to structure the storage available
in an understandable way. There are several benefits to this
approach.
By enforcing some level of structure on your storage, it can be
more easily understood. For example, consider a large multi-user
system. Instead of placing all user directories in one large
directory, it might make sense to use subdirectories that mirror
your organization's structure. In this way, people who work in
accounting have their directories under a directory named
accounting, people who work in
engineering have their directories under engineering, and so on.
The benefits of such an approach are that it would be easier on
a day-to-day basis to keep track of the storage needs (and usage)
for each part of your organization. Obtaining a listing of the
files used by everyone in human resources is straightforward.
Backing up all the files used by the legal department is easy.
With the appropriate structure, flexibility is increased. To
continue using the previous example, assume for a moment that the
engineering department is due to take on several large new
projects. Because of this, many new engineers are to be hired in
the near future. However, there is currently not enough free
storage available to support the expected additions to
engineering.
However, since every person in engineering has their files
stored under the engineering directory,
it would be a straightforward process to:
Procure the additional storage necessary to support
engineering
Back up everything under the engineering directory
Restore the backup onto the new storage
Rename the engineering directory on
the original storage to something like engineering-archive (before deleting it entirely
after running smoothly with the new configuration for a month)
Make the necessary changes so that all engineering personnel can
access their files on the new storage
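The copy-and-archive part of these steps can be sketched in Python. This is only an outline under stated assumptions: `shutil.copytree` stands in for the separate backup and restore steps, and it preserves permission bits and timestamps but not file ownership, so a real migration would normally use a proper backup tool run with sufficient privileges. The function name and paths are hypothetical.

```python
import os
import shutil


def migrate_directory(old_path, new_path):
    """Copy old_path to new_path, then keep the original as an archive.

    new_path must not exist yet. Returns the path of the renamed original.
    """
    # Stand-in for "back up" followed by "restore onto the new storage".
    shutil.copytree(old_path, new_path, symlinks=True)
    # Keep the original around under a new name instead of deleting it.
    archive_path = old_path.rstrip(os.sep) + "-archive"
    os.rename(old_path, archive_path)
    return archive_path
```

The final step, making the new location reachable under the name users expect, depends on the system; it is left out here on purpose.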
Of course, such an approach does have its shortcomings. For
example, if people frequently move between departments, you must
have a way of being informed of such transfers, and you must modify
the directory structure appropriately. Otherwise, the structure no
longer reflects reality, which makes more work — not less
— for you in the long run.
Once a mass storage device has been properly partitioned, and a
file system written to it, the storage is available for general
use.
For some operating systems, this is true; as soon as the
operating system detects the new mass storage device, it can be
formatted by the system administrator and may be accessed
immediately with no additional effort.
Other operating systems require an additional step. This step
— often referred to as mounting
— directs the operating system as to how the storage may be
accessed. Mounting storage normally is done via a special utility
program or command, and requires that the mass storage device (and
possibly the partition as well) be explicitly identified.
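On Linux, the currently mounted file systems are listed in `/proc/mounts`, one per line, with the device, the mount point, the file system type, and the mount options as the first four whitespace-separated fields. The Python sketch below parses text in that format; the function name and the sample device names are hypothetical, and special characters in mount points (which the kernel writes as octal escapes) are not decoded.

```python
def parse_mount_table(text):
    """Parse /proc/mounts-style text into a list of dictionaries."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[0].startswith("#"):
            continue  # skip blank, short, or comment lines
        device, mount_point, fs_type, options = fields[:4]
        mounts.append({
            "device": device,
            "mount_point": mount_point,
            "fs_type": fs_type,
            "options": options.split(","),
        })
    return mounts


# Made-up sample; real input would come from open("/proc/mounts").read().
sample = """\
/dev/sda1 / ext3 rw 0 0
/dev/sda2 /home ext3 rw,nosuid 0 0
"""

for mount in parse_mount_table(sample):
    print(mount["device"], "is mounted on", mount["mount_point"])
```

Each entry shows the explicit identification that mounting requires: a specific device and partition, tied to the point in the directory hierarchy where its file system becomes accessible.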