14.1 Overview of OCFS2
Oracle Cluster File System 2 (OCFS2) is a general-purpose journaling
file system that is fully integrated in the Linux 2.6 kernel and later.
OCFS2 allows you to store application binary files, data files, and
databases on devices in a SAN. All nodes in a cluster have concurrent
read and write access to the file system. A distributed lock manager
helps prevent file access conflicts. OCFS2 supports up to 32,000
subdirectories and millions of files in each directory. The O2CB cluster
service (a driver) runs on each node to manage the cluster.
14.1.1 Features and Benefits
In August 2005, OCFS2 was added to SUSE Linux Enterprise Server 9 to support Oracle Real
Application Clusters (RAC) databases and Oracle Home (its application
files). In SUSE Linux Enterprise Server 10 and later, OCFS2 can be used for any of the
following storage solutions:
- Oracle RAC and other databases
- General applications and workloads
- XEN image store in a cluster
  XEN virtual machines and virtual servers can be stored on OCFS2 volumes that are mounted by cluster servers to provide quick and easy portability of XEN virtual machines between servers.
- LAMP (Linux, Apache, MySQL, and PHP/Perl/Python) stacks
In addition, OCFS2 is fully integrated with Heartbeat 2.
As a high-performance, symmetric, parallel cluster file system, OCFS2
supports the following functions:
- An application’s files are available to all nodes in the cluster. You install the application only once, on an OCFS2 volume in the cluster.
- All nodes can concurrently read and write directly to storage via the standard file system interface, enabling easy management of applications that run across the cluster.
- File access is coordinated through the Distributed Lock Manager (DLM). DLM control works well for most workloads, but an application’s design might limit scalability if it contends with the DLM to coordinate file access.
- Storage backup functionality is available on all back-end storage. An image of the shared application files can be easily created, which can help provide effective disaster recovery.
OCFS2 also provides the following capabilities:
- Metadata caching
- Metadata journaling
- Cross-node file data consistency
- GUI-based administration via the GTK-based ocfs2console utility
- Operation as a shared-root file system
- Support for multiple block sizes (each volume can have a different block size) up to 4 KB, for a maximum volume size of 16 TB
- Support for up to 255 cluster nodes
- Context-dependent symbolic link (CDSL) support for node-specific local files
- Asynchronous and direct I/O support for database files for improved database performance
14.1.2 O2CB Cluster Service
The O2CB cluster service is a set of modules and in-memory file systems
that are required to manage OCFS2 services and volumes. You can enable
these modules to be loaded and mounted at system boot. For instructions,
see
Configuring OCFS2 Services.
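For example, assuming the o2cb init script in your release supports the interactive configure option (as most OCFS2 releases do), you can enable the service at boot and verify it with commands like the following sketch:

/etc/init.d/o2cb configure    # interactively choose whether to load O2CB on boot
chkconfig o2cb on             # add the service to the default runlevels
/etc/init.d/o2cb status       # verify that the modules are loaded and mounted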
Table 14-1 O2CB Cluster Service Stack

Node Manager (NM): Keeps track of all the nodes in the /etc/ocfs2/cluster.conf file.

Heartbeat (HB): Issues up/down notifications when nodes join or leave the cluster.

TCP: Handles communications between the nodes with the TCP protocol.

Distributed Lock Manager (DLM): Keeps track of all locks and their owners and status.

CONFIGFS: User space configuration file system. For details, see In-Memory File Systems.

DLMFS: User space interface to the kernel space DLM. For details, see In-Memory File Systems.
14.1.3 Disk Heartbeat
OCFS2 requires the nodes to be alive on the network. The O2CB cluster
service sends regular keepalive packets to ensure that they are. It
uses a private interconnect between nodes instead of the LAN to avoid
network delays that might be interpreted as a node disappearing and
thus lead to a node’s self-fencing.
The O2CB cluster service communicates the node status via a disk
heartbeat. The heartbeat system file resides on the SAN, where it is
available to all nodes in the cluster. The block assignments in the
file correspond sequentially to each node’s slot assignment.
Each node reads the file and writes to its assigned block in the file
at two-second intervals. A change to a node’s time stamp indicates that
the node is alive. A node is considered dead if it does not write to the heartbeat
file for a specified number of sequential intervals, called the
heartbeat threshold. Even if only a single node is alive, the O2CB
cluster service must perform this check, because another node could be
added dynamically at any time.
You can modify the disk heartbeat threshold in the
/etc/sysconfig/o2cb file, using the
O2CB_HEARTBEAT_THRESHOLD parameter.
The wait time is calculated as follows:
(O2CB_HEARTBEAT_THRESHOLD value - 1) * 2 = threshold in seconds
For example, if the
O2CB_HEARTBEAT_THRESHOLD value is set
at the default value of 7, the wait time is 12 seconds ((7 - 1) * 2
= 12).
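For example, to lengthen the wait time to 60 seconds ((31 - 1) * 2 = 60), set the following line in /etc/sysconfig/o2cb (other parameters in this file are omitted from this sketch):

O2CB_HEARTBEAT_THRESHOLD=31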
14.1.4 In-Memory File Systems
OCFS2 uses two in-memory file systems for communications:
Table 14-2 In-Memory File Systems Used by OCFS2

configfs (mounted at /config): Communicates the list of nodes in the cluster to the in-kernel node manager, and communicates the resource used for the heartbeat to the in-kernel heartbeat thread.

ocfs2_dlmfs (mounted at /dlm): Communicates locking and unlocking for clusterwide locks on resources to the in-kernel distributed lock manager, which keeps track of all locks and their owners and status.
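You can verify that both in-memory file systems are mounted by filtering the output of the mount command. The lines below are only illustrative; the exact mount points can vary by release:

mount | egrep 'configfs|dlmfs'
configfs on /config type configfs (rw)
ocfs2_dlmfs on /dlm type ocfs2_dlmfs (rw)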
|
14.1.5 Management Utilities and Commands
OCFS2 stores node-specific parameter files on each node. The cluster
configuration file
(/etc/ocfs2/cluster.conf) resides on
each node assigned to the cluster.
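For reference, a minimal two-node /etc/ocfs2/cluster.conf might look like the following sketch. The node names, IP addresses, and port shown here are placeholders, and ocfs2console normally generates this file for you:

node:
        ip_port = 7777
        ip_address = 192.168.1.101
        number = 0
        name = node1
        cluster = ocfs2

node:
        ip_port = 7777
        ip_address = 192.168.1.102
        number = 1
        name = node2
        cluster = ocfs2

cluster:
        node_count = 2
        name = ocfs2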
The ocfs2console utility is a GTK GUI-based
interface for managing the configuration of the OCFS2 services in the
cluster. Use this utility to set up and save the
/etc/ocfs2/cluster.conf file to all
member nodes of the cluster. In addition, you can use it to format,
tune, mount, and unmount OCFS2 volumes.
Additional OCFS2 utilities are described in the following table. For
information about syntax for these commands, see their man pages.
Table 14-3 OCFS2 Utilities

debugfs.ocfs2: Examines the state of the OCFS2 file system for the purpose of debugging.

fsck.ocfs2: Checks the file system for errors and optionally repairs them.

mkfs.ocfs2: Creates an OCFS2 file system on a device, usually a partition on a shared physical or logical disk. This tool requires the O2CB cluster service to be up.

mounted.ocfs2: Detects and lists all OCFS2 volumes on a clustered system, or lists all nodes on the system that have mounted an OCFS2 device.

ocfs2cdsl: Creates a context-dependent symbolic link (CDSL) for a specified filename (file or directory) for a node. A CDSL filename has its own image for a specific node, but a common name across the OCFS2 cluster.

tunefs.ocfs2: Changes OCFS2 file system parameters, including the volume label, number of node slots, journal size for all node slots, and volume size.
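As a sketch of typical usage (the device /dev/sdb1 and label mydata are placeholders; see the man pages for the options your release supports):

mkfs.ocfs2 -b 4K -N 8 -L mydata /dev/sdb1   # create a volume with a 4 KB block size, 8 node slots, and a label
mounted.ocfs2 -d                            # list OCFS2 devices detected on this system
fsck.ocfs2 /dev/sdb1                        # check the volume for errors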
Use the following commands to manage O2CB services. For more
information about the o2cb command syntax, see its
man page.
Table 14-4 O2CB Commands

/etc/init.d/o2cb status: Reports whether the O2CB services are loaded and mounted.

/etc/init.d/o2cb load: Loads the O2CB modules and in-memory file systems.

/etc/init.d/o2cb online ocfs2: Brings online the cluster named ocfs2. At least one node in the cluster must be active for the cluster to be online.

/etc/init.d/o2cb offline ocfs2: Takes offline the cluster named ocfs2.

/etc/init.d/o2cb unload: Unloads the O2CB modules and in-memory file systems.

/etc/init.d/o2cb start ocfs2: If the cluster is set up to load on boot, starts the cluster named ocfs2 by loading O2CB and bringing the cluster online. At least one node in the cluster must be active for the cluster to be online.

/etc/init.d/o2cb stop ocfs2: If the cluster is set up to load on boot, stops the cluster named ocfs2 by taking the cluster offline and unloading the O2CB modules and in-memory file systems.
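For example, to bring the default cluster online manually and verify its state:

/etc/init.d/o2cb load
/etc/init.d/o2cb online ocfs2
/etc/init.d/o2cb status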
14.1.6 OCFS2 Packages
The OCFS2 kernel module (ocfs2) is installed
automatically in SUSE Linux Enterprise Server 10 and later. To use OCFS2, use YaST (or the
command line if you prefer) to install the
ocfs2-tools and ocfs2console
packages on each node in the cluster.
- Log in as the root user or equivalent, then open the YaST Control Center.
- Select Software > Software Management.
- In the Search field, enter ocfs2, then start the search.
  The software packages ocfs2-tools and ocfs2console should be listed in the right panel. If they are selected, the packages are already installed.
- If you need to install the packages, select them, then click Accept and follow the on-screen instructions.
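Alternatively, you can install both packages from the command line. For example, with YaST’s non-interactive mode:

yast -i ocfs2-tools ocfs2console

Repeat the installation on each node in the cluster.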