45.6 Data Problems
A data problem exists when the machine might or might not boot
properly but, in either case, data on the system is clearly corrupted
and the system needs to be recovered. These situations call for a
backup of your critical data, which lets you restore the system to a
state from before the failure. SUSE Linux Enterprise offers
dedicated YaST modules for system backup and restoration as well as a
rescue system that can be used to recover a corrupted system from the
outside.
45.6.1 Backing Up Critical Data
System backups can be managed with the YaST System Backup
module:
-
As root, start YaST and open the System Backup module.
-
Create a backup profile holding all details needed for the backup:
the filename of the archive file, the scope, and the type of backup:
-
Select .
-
Enter a name for the
archive.
-
Enter the path to the location of the backup if you want to
keep a local backup.
For your backup to be archived on a network server (via
NFS), enter the IP address or name of the server and the
directory that should hold your archive.
-
Determine the archive type and click .
-
Determine the backup options to use, such as whether files not
belonging to any package should be backed up and whether a list
of files should be displayed prior to creating the archive.
Also determine whether changed files should be identified
using the time-consuming MD5 mechanism.
Use to enter a dialog for
the backup of entire hard disk areas. Currently, this option
only applies to the Ext2 file system.
-
Finally, set the search constraints to exclude system areas that
do not need to be backed up,
such as lock files or cache files. Add, edit, or delete items
until your needs are met and leave with .
-
Once you have finished the profile settings, you can
start the backup right away with
or configure automatic backup. It is also possible to
create other profiles tailored for various other
purposes.
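For comparison, the same idea (one archive, with lock and cache files excluded) can be sketched on the command line with tar. This is not how the YaST module works internally. The function name backup_tree, the exclusion patterns, and the example paths are assumptions chosen for illustration.

```shell
#!/bin/sh
# backup_tree SRC_DIR ARCHIVE
# Packs SRC_DIR into a gzip-compressed tar archive. Files ending in
# ".lock" and anything named "cache" are skipped (hypothetical
# patterns; adjust them to the areas you want to exclude).
backup_tree() {
    src=$1
    archive=$2
    tar -czf "$archive" \
        --exclude='*.lock' \
        --exclude='cache' \
        -C "$src" .
}

# Example call (placeholder paths):
# backup_tree /etc /var/backups/etc-backup.tar.gz
```

Store the archive outside the directory being packed, otherwise tar tries to include the archive in itself.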
To configure automatic backup for a given profile, proceed as
follows:
-
Select from the
menu.
-
Select .
-
Determine the backup frequency. Choose
, , or
.
-
Determine the backup start time. These settings depend on the
backup frequency selected.
-
Decide whether to keep old backups and how many
should be kept. To receive an automatically generated
status message of the backup process, check .
-
Click to apply your settings and
have the first backup start at the time specified.
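If you schedule a backup script yourself instead of using the module, the frequency and start time map onto a cron schedule. The sketch below assumes cron as the scheduler and assumes the three frequencies are daily, weekly, and monthly; cron_schedule is a hypothetical helper, not something YaST provides.

```shell
#!/bin/sh
# cron_schedule FREQUENCY HOUR MINUTE
# Prints the five time fields of a crontab entry for the given
# frequency. Weekly runs on Sunday (0), monthly on the first day.
cron_schedule() {
    freq=$1
    hour=$2
    minute=$3
    case $freq in
        daily)   echo "$minute $hour * * *" ;;
        weekly)  echo "$minute $hour * * 0" ;;
        monthly) echo "$minute $hour 1 * *" ;;
        *)       echo "unknown frequency: $freq" >&2; return 1 ;;
    esac
}

# Example: fields for a weekly run at 03:15
# cron_schedule weekly 3 15
```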
45.6.2 Restoring a System Backup
Use the YaST System Restoration module to restore the system
configuration from a backup. Restore the entire backup or
select specific components that were corrupted and need to be reset to
their old state.
-
Start the YaST System Restoration module.
-
Enter the location of the backup file. This could be a local
file, a network mounted file, or a file on a removable device, such
as a floppy or a CD. Then click .
The following dialog displays a summary of the archive
properties, such as the filename, date of creation, type of backup,
and optional comments.
-
Review the archived content by clicking . Clicking returns you
to the dialog.
-
opens a dialog in which
to fine-tune the restore process. Return to the dialog by clicking
.
-
Click to open the view of packages to
restore.
Press to restore all files in the archive
or use the various , , and buttons to
fine-tune your selection. Only use the option if the RPM database is corrupted or deleted
and this file is included in the backup.
-
After you click , the backup is
restored. Click to leave the module after the
restore process is completed.
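The review-then-restore sequence above has a simple command-line counterpart for plain tar archives. The sketch assumes a gzip-compressed tar file; archives written by the YaST module may be structured differently. The function names and example paths are hypothetical.

```shell
#!/bin/sh
# list_backup ARCHIVE
# Shows the archive content without extracting anything.
list_backup() {
    tar -tzf "$1"
}

# restore_selected ARCHIVE DEST_DIR MEMBER...
# Extracts only the named members into DEST_DIR, so a restore can be
# inspected before anything is copied over the live system.
restore_selected() {
    archive=$1
    dest=$2
    shift 2
    mkdir -p "$dest" && tar -xzf "$archive" -C "$dest" "$@"
}

# Example (placeholder paths):
# list_backup /var/backups/etc-backup.tar.gz
# restore_selected /var/backups/etc-backup.tar.gz /tmp/restore etc/fstab
```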
45.6.3 Recovering a Corrupted System
There are several reasons why a system could fail to come up and run
properly. A corrupted file system after a system crash, corrupted
configuration files, or a corrupted boot loader configuration are the
most common ones.
SUSE Linux Enterprise offers two different methods to cope with this kind of
situation. You can either use the YaST System Repair functionality or
boot the rescue system. The following sections cover both flavors
of system repair.
Using YaST System Repair
Before launching the YaST System Repair module, determine which mode
best fits your needs. Depending on the severity and cause of your
system failure and on your expertise, there are three modes to choose
from:
- Automatic Repair
-
If your system failed due to an unknown cause and you do
not know which part of the system is to blame for the failure, use
Automatic Repair. An extensive automated check is
performed on all components of your installed system. For a detailed
description of this procedure, refer to
Automatic Repair.
- Customized Repair
-
If your system failed and you already know which component is to blame,
you can cut the lengthy system check short by limiting the scope of the
system analysis to those components. For example, if the system messages
prior to the failure seem to indicate an error with the package database,
you can limit the analysis and repair procedure to checking and restoring
this aspect of your system. For a detailed description of this procedure,
refer to
Customized Repair.
- Expert Tools
-
If you already have a clear idea of which component failed and how
this should be fixed, you can skip the analysis runs and directly apply
the tools necessary for the repair of the respective component. For
details, refer to
Expert Tools.
Choose one of the repair modes as described above and proceed with the system
repair as outlined in the following sections.
Automatic Repair
To start the automatic repair mode of YaST System Repair, proceed
as follows:
-
Boot the system with the original installation medium used for
the initial installation (as outlined in Section 3.0,
Installation with YaST).
-
In , select
.
-
Select Automatic Repair.
YaST now launches an extensive analysis of the installed system.
The progress of the procedure is displayed at the bottom of the screen
with two progress bars. The upper bar shows the progress of the currently
running test. The lower bar shows the overall progress of the
analysis. The log window in the top section tracks the currently running
test and its result. See Figure 45-2.
The following
main test runs are performed with every run. They contain, in turn, a
number of individual subtests.
- Partition Tables of All Hard Disks
-
Checks the validity and coherence of the partition tables of all
detected hard disks.
- Swap Partitions
-
The swap partitions of the installed system are detected, tested, and
offered for activation where applicable. The offer should be accepted
for the sake of a higher system repair speed.
- File Systems
-
All detected file systems are subjected to a file system–specific
check.
- Entries in the File /etc/fstab
-
The entries in /etc/fstab are checked for completeness and consistency.
All valid partitions are mounted.
- Boot Loader Configuration
-
The boot loader configuration of the installed system (GRUB or LILO) is
checked for completeness and coherence. Boot and root devices are
examined and the availability of the initrd modules is checked.
- Package Database
-
This checks whether all packages necessary for the operation of a
minimal installation are present. While it is optionally possible also
to analyze the base packages, this takes a long time because of their
vast number.
-
Whenever an error is encountered, the procedure stops and a
dialog opens outlining the details and possible solutions.
Read the screen messages carefully before accepting the proposed
fix. If you decide to decline a proposed solution, your system remains
unchanged.
-
After the repair process has completed successfully,
leave the module and remove the
installation media. The system automatically reboots.
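To give a feel for what a completeness check on /etc/fstab involves, the following sketch flags entries with the wrong number of fields. It is a manual approximation, not the test YaST runs. It assumes the usual layout of four mandatory fields plus two optional ones; check_fstab is a hypothetical name.

```shell
#!/bin/sh
# check_fstab FILE
# Reports every non-comment, non-blank line that does not have between
# four and six whitespace-separated fields. Returns non-zero if any
# such line is found.
check_fstab() {
    awk '
        BEGIN { bad = 0 }
        /^[ \t]*(#|$)/ { next }   # skip comments and blank lines
        NF < 4 || NF > 6 {
            printf "line %d: expected 4-6 fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Example, with the installed root file system mounted under /mnt:
# check_fstab /mnt/etc/fstab
```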
Customized Repair
To launch the Customized Repair mode and
selectively check certain components of your installed system, proceed as
follows:
-
Boot the system with the original installation medium used for
the initial installation (as outlined in Section 3.0,
Installation with YaST).
-
In , select
.
-
Select Customized Repair.
Choosing Customized Repair shows a list of test
runs that are all marked for execution at first. The total range of tests
matches that of automatic repair. If you already know where no damage is
present, unmark the corresponding tests. Clicking
starts a narrower test procedure that
probably has a significantly shorter running time.
Not all test groups can be applied individually. The analysis of the fstab
entries is always bound to an examination of the file systems, including
existing swap partitions. YaST automatically resolves such dependencies
by selecting the smallest number of necessary test runs.
-
Whenever an error is encountered, the procedure stops and a
dialog opens outlining the details and possible solutions.
Read the screen messages carefully before accepting the proposed
fix. If you decide to decline a proposed solution, your system remains
unchanged.
-
After the repair process has completed successfully,
leave the module and remove the
installation media. The system automatically reboots.
Expert Tools
If you are familiar with SUSE® Linux Enterprise and already have a very clear idea
of what needs to be repaired in your system, apply the tools directly
and skip the system analysis.
To make use of the Expert Tools feature of the
YaST System Repair module, proceed as follows:
-
Boot the system with the original installation medium used for
the initial installation (as outlined in Section 3.0,
Installation with YaST).
-
In , select
.
-
Select Expert Tools and choose one or more repair options.
-
After the repair process has completed successfully,
leave the module and remove the
installation media. The system automatically reboots.
Expert Tools provides the following options to repair your faulty
system:
- Install New Boot Loader
-
This starts the YaST boot loader configuration module. Find details
in Section 17.3,
Configuring the Boot Loader with YaST.
- Start Partitioning Tool
-
This starts the expert partitioning tool in YaST. Find details in
Section 7.5.6,
Partitioner.
- Repair File System
-
This checks the file systems of your installed system. You are first
offered a selection of all detected partitions and can then choose the
ones to check.
- Recover Lost Partitions
-
It is possible to attempt to reconstruct damaged partition tables.
A list of detected hard disks is presented first for selection.
Clicking starts the examination. This can take a
while depending on the processing power and size of the hard disk.
IMPORTANT: Reconstructing a Partition Table
Reconstructing a partition table is difficult. YaST attempts to
recognize lost partitions by analyzing the data sectors of the hard
disk. Lost partitions that are recognized are added to the rebuilt
partition table. However, this does not succeed in every case.
- Save System Settings to Floppy
-
This option saves important system files to a floppy disk. If
one of these files becomes damaged, it can be restored from disk.
- Verify Installed Software
-
This checks the consistency of the package database and the
availability of the most important packages. Any damaged installed
packages can be reinstalled with this tool.
Using the Rescue System
Your Linux system contains a rescue system. The rescue system is a small
Linux system that can be loaded into a RAM disk and mounted as root file
system, allowing you to access your Linux partitions from the outside.
Using the rescue system, you can recover or modify any important aspect of
your system:
-
Manipulate any type of configuration file.
-
Check the file system for defects and start automatic repair
processes.
-
Access the installed system in a change root environment.
-
Check, modify, and reinstall the boot loader configuration.
-
Resize partitions using the parted command. Find more information about
this tool at the Web site of GNU Parted (https://www.gnu.org/software/parted/parted.html).
The rescue system can be loaded from various sources and locations. The
simplest option is to boot the rescue system from the original
installation CD or DVD:
-
Insert the installation medium into your CD or DVD drive.
-
Reboot the system.
-
At the boot screen, choose the Rescue System
option.
-
Enter root at the Rescue: prompt. A
password is not required.
If your hardware setup does not include a CD or DVD drive, you can boot
the rescue system from a network source (including the SUSE FTP
server). The following example applies to a remote boot scenario. If
you use another boot medium, such as a floppy disk, modify the
info file accordingly and boot as you would for a
normal installation.
-
Enter the configuration of your PXE boot setup and replace
install=protocol://instsource
with
rescue=protocol://instsource.
As with a normal installation, protocol stands
for any of the supported network protocols (NFS, HTTP, FTP, etc.) and
instsource for the path to your network
installation source.
-
Boot the system using Wake on LAN.
-
Enter root at the Rescue:
prompt. A password is not required.
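The replacement of install= with rescue= in the PXE configuration can be done with a small filter. The sketch below reads a boot line on standard input and writes the modified line to standard output. The function name and the sample append line are hypothetical; the layout of your PXE configuration file may differ.

```shell
#!/bin/sh
# to_rescue_boot_line
# Rewrites the boot parameter "install=" to "rescue=" and leaves the
# protocol and installation source untouched. Only a parameter at the
# start of the line or after whitespace is changed.
to_rescue_boot_line() {
    sed -e 's/^install=/rescue=/' \
        -e 's/\([[:space:]]\)install=/\1rescue=/'
}

# Example with a made-up boot line:
# echo 'append initrd=initrd install=nfs://192.168.0.10/install' \
#     | to_rescue_boot_line
```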
Once you have entered the rescue system, you can make use of the virtual
consoles that can be reached with
Alt+F1
to
Alt+F6.
A shell and many other useful utilities, such as the mount program, are
available in the /bin directory. The
/sbin directory contains important file and network
utilities for reviewing and repairing the file system. This directory also
contains the most important binaries for system maintenance, such as fdisk,
mkfs, mkswap, mount, umount, init, and shutdown, as well as ifconfig, ip,
route, and netstat for maintaining the network. The directory
/usr/bin contains the vi editor, find, less, and ssh.
To see the system messages, either use the command
dmesg or view the file
/var/log/messages.
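When the log is long, filtering for typical failure keywords narrows the search. The following is a minimal sketch; the keyword list and the function name recent_problems are assumptions, and real messages may use other wording.

```shell
#!/bin/sh
# recent_problems LOGFILE [COUNT]
# Prints the last COUNT (default 20) lines of LOGFILE that mention an
# error, failure, panic, or corruption, ignoring case.
recent_problems() {
    grep -i -E 'error|fail|panic|corrupt' "$1" | tail -n "${2:-20}"
}

# Examples:
# recent_problems /var/log/messages
# dmesg > /tmp/dmesg.txt && recent_problems /tmp/dmesg.txt 5
```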
Checking and Manipulating Configuration Files
As an example of a problem that can be fixed using the rescue
system, imagine you have a broken configuration file that prevents the
system from booting properly.
To manipulate a configuration file, proceed as follows:
-
Start the rescue system using one of the methods described above.
-
To mount a root file system located on
/dev/sda6 in the rescue system, use the
following command:
mount /dev/sda6 /mnt
All directories of the system are now located under
/mnt.
-
Change the directory to the mounted root file system:
cd /mnt
-
Open the problematic configuration file in the vi editor. Adjust and save
the configuration.
-
Unmount the root file system from the rescue system:
umount /mnt
-
Reboot the machine.
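Before editing a file on the mounted system, it can help to keep a copy of the current version so the change can be undone. This step is an addition, not part of the procedure above. The helper name save_copy and the .orig suffix are assumptions.

```shell
#!/bin/sh
# save_copy FILE
# Copies FILE to FILE.orig, preserving mode and timestamps. Refuses to
# overwrite an existing copy so an earlier known state is not lost.
save_copy() {
    file=$1
    if [ -e "$file.orig" ]; then
        echo "refusing to overwrite $file.orig" >&2
        return 1
    fi
    cp -p "$file" "$file.orig"
}

# Example, with the root file system mounted under /mnt:
# save_copy /mnt/etc/fstab && vi /mnt/etc/fstab
```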
Repairing and Checking File Systems
Generally, file systems cannot be repaired on a running system. If you
encounter serious problems, you may not even be able to mount your root file
system and the system boot may end with a kernel panic.
In this case, the only way is to repair the system from the outside. It is
strongly recommended to use the YaST System Repair for this task (see
Using YaST System Repair for details). However, if you need to do a
manual file system check or repair, boot the rescue system. It contains the
utilities to check and repair the
ext2, ext3,
reiserfs, xfs, jfs,
dosfs, and vfat file systems.
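Each file system type has its own checking tool. The sketch below only prints a read-only check command for a given type and device so you can review it before running it. The tool names and options are the commonly used ones; whether each tool is present in your rescue system is an assumption, and fsck_command is a hypothetical helper.

```shell
#!/bin/sh
# fsck_command FSTYPE DEVICE
# Prints (does not run) a check command that does not modify the file
# system. Run checks only on unmounted partitions.
fsck_command() {
    fstype=$1
    device=$2
    case $fstype in
        ext2|ext3)  echo "e2fsck -f -n $device" ;;
        reiserfs)   echo "reiserfsck --check $device" ;;
        xfs)        echo "xfs_repair -n $device" ;;
        jfs)        echo "fsck.jfs -n $device" ;;
        vfat|dosfs) echo "fsck.vfat -n $device" ;;
        *)          echo "no checker known for $fstype" >&2; return 1 ;;
    esac
}

# Example:
# fsck_command ext3 /dev/sda6
```

Printing the command instead of running it keeps the helper harmless; an actual repair requires dropping the read-only option of the respective tool.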
Accessing the Installed System
If you need to access the installed system from the rescue system to, for
example, modify the boot loader configuration, or to execute a hardware
configuration utility, you need to do this in a change root
environment.
To set up a change root environment based on the installed
system, proceed as follows:
-
First mount the root partition from the installed system and the device file
system:
mount /dev/sda6 /mnt
mount --bind /dev /mnt/dev
-
Now you can change root into the new environment:
chroot /mnt
-
Then mount /proc and /sys:
mount /proc
mount /sys
-
Finally, mount the remaining partitions from the installed system:
mount -a
-
Now you have access to the installed system. Before rebooting the system,
unmount the partitions with
umount -a and leave the change
root environment with exit.
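If you put these steps in a script, note that commands written after a bare chroot line run only once the chroot shell exits, and then outside the new root. One way around this is to pass the mount commands to chroot as a single command. The sketch below prints such a plan without executing it; print_chroot_plan is a hypothetical helper, and /dev/sda6 is the example device from above.

```shell
#!/bin/sh
# print_chroot_plan ROOT_DEVICE [MOUNT_POINT]
# Prints the commands for setting up the change root environment.
# Nothing is executed, so the plan can be reviewed first.
print_chroot_plan() {
    rootdev=$1
    mnt=${2:-/mnt}
    echo "mount $rootdev $mnt"
    echo "mount --bind /dev $mnt/dev"
    # Run the remaining mounts inside the new root in one step
    echo "chroot $mnt /bin/sh -c 'mount /proc && mount /sys && mount -a'"
    # Then enter the environment interactively
    echo "chroot $mnt"
}

# Example:
# print_chroot_plan /dev/sda6
```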
WARNING: Limitations
Although you have full access to the files and applications of the
installed system, there are some limitations. The kernel that is running is
the one that was booted with the rescue system. It only supports essential
hardware and it is not possible to add kernel modules from the installed
system unless the kernel versions are exactly the same (which is unlikely).
So you cannot access a sound card, for example. It is also
not possible to start a graphical user interface.
Also note that you leave the change root environment
when you switch the console with
Alt+F1
to
Alt+F6.
Modifying and Reinstalling the Boot Loader
Sometimes a system cannot boot because the boot loader configuration is
corrupted. The start-up routines cannot, for example, translate physical
drives to the actual locations in the Linux file system without a working
boot loader.
To check the boot loader configuration and reinstall the boot loader,
proceed as follows:
-
Perform the necessary steps to access the installed system as described in
Accessing the Installed System.
-
Check whether the following files are correctly configured according to the
GRUB configuration principles outlined in Section 17.0,
The Boot Loader.
-
/etc/grub.conf
-
/boot/grub/device.map
-
/boot/grub/menu.lst
Apply fixes to the device mapping (device.map)
or the location of the root partition and configuration files, if
necessary.
-
Reinstall the boot loader using the following command
sequence:
grub --batch < /etc/grub.conf
-
Unmount the partitions, log out from the change root
environment, and reboot the system:
umount -a
exit
reboot
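Before reinstalling the boot loader, a quick check that the three configuration files exist and are not empty can save a failed attempt. The sketch below does only that; it does not validate the file contents. check_grub_files is a hypothetical helper, and the optional argument is a prefix such as /mnt for use outside the change root environment.

```shell
#!/bin/sh
# check_grub_files [ROOT_PREFIX]
# Reports each GRUB configuration file that is missing or empty.
# Returns non-zero if at least one file is reported.
check_grub_files() {
    root=${1:-}
    missing=0
    for f in /etc/grub.conf /boot/grub/device.map /boot/grub/menu.lst; do
        if [ ! -s "$root$f" ]; then
            echo "missing or empty: $f"
            missing=1
        fi
    done
    return $missing
}

# Examples:
# check_grub_files          (inside the change root environment)
# check_grub_files /mnt     (from the rescue system)
```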