A Tool for Cold Mirroring of Solaris System Disks

[mirror_boot.sh]

www.boran.com/security/sp/coldmirroring20010306.html

Minimum downtime and prevention of data loss are important for most system administrators. The traditional solution is to use backups or RAID to cover for disk failures. We describe an alternative, "cold mirroring" of system disks: a script mounts a spare disk, copies files to the spare, installs a boot block and copies over a new vfstab. This creates a fully updated bootable spare disk. The administrator is notified of success/failure by syslog or email. This tool, called mirror_boot.sh, has been tested on several Solaris versions.

Why Not RAID?

If the system disks (/, /usr, /var file systems) are on RAID and, for example, the RAID controller (or fiber cable) fails, you have a problem unless the RAID is fully redundant. Cold mirroring is also simpler, and software RAID can be difficult to recover when the system disk fails.

For some servers, I prefer to put system (and certain data) files on a "normal" disk and mirror to a second disk once or twice a week ("cold mirroring"). If the boot disk dies, we simply boot from the mirror disk. This solution is easier to understand, to recover from in a disaster scenario, and system disks can be more easily added/removed/changed.

In addition, files changed by accident since the last mirror run can be recovered, and deleted files can be recovered until the mirror disk fills up and needs to be wiped clean. More details are provided below.

How mirror_boot.sh Works

Each night the offline disk is mounted and synchronized with the primary disk. The script is called from the root cron nightly. It mounts the spare disk under /newroot, copies all file systems, installs a boot block and copies over a new vfstab. This creates a fully updated bootable spare disk. The results of the script are sent to the administrator via email (sample output is mirror_output.txt).
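The sequence described above can be sketched roughly as follows. This is not the source of mirror_boot.sh; the function names, the DRY_RUN and PLATFORM variables, the SPARC device name (taken from the example later in this article) and the installboot call are assumptions made for illustration. With DRY_RUN=1, the default here, the commands are only printed.

```shell
#!/bin/sh
# Hypothetical outline of one mirror run; NOT the real mirror_boot.sh.
DRY_RUN=${DRY_RUN:-1}                                    # 1 = only print the commands
MIRROR_ROOT_RAW=${MIRROR_ROOT_RAW:-/dev/rdsk/c0t1d0s0}   # assumed SPARC mirror root slice
PLATFORM=${PLATFORM:-$(uname -i 2>/dev/null)}            # platform name used in /usr/platform

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

mirror_once() {
    run mount /newroot
    run mount /newroot/var
    # Copy each file system without crossing mount points (-xdev).
    # The real script also filters the file list; that is left out here.
    run sh -c 'cd / && find . -xdev -print | cpio -pdmu /newroot'
    run sh -c 'cd /var && find . -xdev -print | cpio -pdmu /newroot/var'
    # SPARC boot block; x86 needs a different installboot call.
    run installboot "/usr/platform/$PLATFORM/lib/fs/ufs/bootblk" "$MIRROR_ROOT_RAW"
    # The mirror must boot with its own device names.
    run cp /etc/vfstab.newroot /newroot/etc/vfstab
    run umount /newroot/var
    run umount /newroot
}

mirror_once
```

Printing the commands first is a cheap way to check device names before anything is written to the spare disk.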

Advantages:
- Cheap (disks are pretty cheap and evolving faster than tapes).
- Simple (no complex hardware or software).
- Easy to understand in a crisis (when RAID complexity can be unbearable).
- Cold mirrored disks are not written to at the same time as the masters: deleted files several days old can be recovered (e.g., after an attack or a mistake by system administrator or user).

Disadvantages:
- Hot failover is not available, and a minimum downtime of, say, 15 minutes has to be acceptable.
- The backup disks have to be cleaned now and again, as they can fill up.

Read this document carefully before using the script!

So How Do We Use mirror_boot.sh?

Start off by downloading mirror_boot.sh, browsing the source and reading this article completely. Then practice on a non-critical Solaris host.

We'll now walk through two examples, an x86 and a SPARC Solaris 8 box, both of which have system disks with three partitions: root, /var and swap.

The disks in our example are set up as follows:

The primary and mirror disk do not have to be identical

a) The partitions on the secondary disk must be at least as big as the primary.

b) It is an advantage if the second disk is bigger, as it allows deleted files to be kept longer (the backup file systems don't need to be purged as often).

c) The file systems on the primary or secondary disk can in fact be spread out over several physical disks. The mirror script does not care, as it simply backs up each file system to an equivalent under /newroot.

1. Active (primary) system disk:
   On x86: / c0d0s0, swap c0d0s1, /var c0d0s5
   On SPARC: / c0t3d0s0, swap c0t3d0s1, /var c0t3d0s6
2. Backup (secondary) disk
   On x86: / c0d1s0, swap c0d1s1, /var c0d1s5
   On SPARC: / c0t1d0s0, swap c0t1d0s1, /var c0t1d0s6

a) Set up the secondary disk:

Run fdisk on the second disk (x86 architecture only), e.g., "fdisk /dev/rdsk/c0d1p0"

Partition the second disk identically to the first (with format): e.g., create root, swap and /var partition with the same number of cylinders, then write the disk label and quit format.

Note: If the master and mirror disk are identical, the fmthard tool together with prtvtoc is much faster than manually using format. For example, if the master is target 3 and the second disk is target 1, and we wish to give it the disk label "mirror," then:
On x86:
/usr/sbin/prtvtoc /dev/rdsk/c0d0s2 | /usr/sbin/fmthard -n mirror -s - /dev/rdsk/c0d1s2

On SPARC:
/usr/sbin/prtvtoc /dev/rdsk/c0t3d0s2 | /usr/sbin/fmthard -n mirror -s - /dev/rdsk/c0t1d0s2
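If the two disks are not identical, it may be worth checking that every slice on the mirror is at least as large as its counterpart on the primary. The helper below is a sketch only: check_slices is a hypothetical name, and it assumes the usual prtvtoc layout in which comment lines start with "*" and the fifth column of each slice line is the sector count.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved prtvtoc listings.
# $1 = prtvtoc output of the primary, $2 = prtvtoc output of the mirror.
# Assumed line layout:
#   partition tag flags first-sector sector-count last-sector [mount dir]
check_slices() {
    awk '
        /^\*/ || NF < 6 { next }               # skip comments and blank lines
        FNR == NR { primary[$1] = $5; next }   # first file: primary sector counts
        { mirror[$1] = $5 }                    # second file: mirror sector counts
        END {
            for (s in primary) {
                if (!(s in mirror)) {
                    printf "slice %s missing on mirror\n", s
                    bad = 1
                } else if (mirror[s] + 0 < primary[s] + 0) {
                    printf "slice %s too small: %s < %s sectors\n", s, mirror[s], primary[s]
                    bad = 1
                }
            }
            exit bad + 0                       # non-zero if any slice failed
        }
    ' "$1" "$2"
}
```

A possible use on the SPARC example: save "prtvtoc /dev/rdsk/c0t3d0s2" and "prtvtoc /dev/rdsk/c0t1d0s2" to two files, then run check_slices on them before creating the file systems.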

Create file systems on the second disk, e.g.,

on x86: newfs /dev/rdsk/c0d1s0; newfs /dev/rdsk/c0d1s5
on SPARC: newfs /dev/rdsk/c0t1d0s0; newfs /dev/rdsk/c0t1d0s6

b) Get the secondary disk online:

Add entries for the new disk in /etc/vfstab under /newroot, /newroot/var, etc.

Tip:

I've written a script to initialise this tool for a specific set of machines; you may wish to download and adapt it for your needs.

/etc/vfstab for our x86 example:

/dev/dsk/c0d0s1 - - swap - no -
swap - /tmp tmpfs - yes size=150m
/dev/dsk/c0d0s0 /dev/rdsk/c0d0s0 / ufs 1 no logging
/dev/dsk/c0d0s6 /dev/rdsk/c0d0s6 /var ufs 1 yes logging,nosuid,noatime
#
# Cold mirror backup disk
/dev/dsk/c0d1s0 /dev/rdsk/c0d1s0 /newroot ufs 1 no logging
/dev/dsk/c0d1s6 /dev/rdsk/c0d1s6 /newroot/var ufs 1 yes logging,nosuid,noatime

/etc/vfstab for our SPARC example:

/dev/dsk/c0t3d0s1 - - swap - no -
/dev/dsk/c0t3d0s0 /dev/rdsk/c0t3d0s0 / ufs 1 no logging
/dev/dsk/c0t3d0s6 /dev/rdsk/c0t3d0s6 /var ufs 1 no logging,noatime,nosuid
swap - /tmp tmpfs - yes size=120m
#
# Cold mirror backup disk
/dev/dsk/c0t1d0s0 /dev/rdsk/c0t1d0s0 /newroot ufs 1 no logging
/dev/dsk/c0t1d0s6 /dev/rdsk/c0t1d0s6 /newroot/var ufs 1 no logging,noatime,nosuid

Create mount points, mount the new file systems, check contents and unmount, e.g.,

mkdir /newroot
chmod 777 /newroot
mount /newroot
df -k
umount /newroot

Ensure all /newroot volumes are unmounted before proceeding.

Create a file /etc/vfstab.newroot that can be used to boot the system from the second disk. This step is critical and must be carried out with care; for example, make sure that the swap device is correctly set.

cp /etc/vfstab /etc/vfstab.newroot
vi /etc/vfstab.newroot

/etc/vfstab.newroot for our x86 example:

/dev/dsk/c0d1s1 - - swap - no -
swap - /tmp tmpfs - yes size=150m
#
/dev/dsk/c0d1s0 /dev/rdsk/c0d1s0 / ufs 1 no logging
/dev/dsk/c0d1s6 /dev/rdsk/c0d1s6 /var ufs 1 yes logging,nosuid,noatime

/etc/vfstab.newroot for our SPARC example:

/dev/dsk/c0t1d0s1 - - swap - no -
swap - /tmp tmpfs - yes size=120m
/dev/dsk/c0t1d0s0 /dev/rdsk/c0t1d0s0 / ufs 1 no logging
/dev/dsk/c0t1d0s6 /dev/rdsk/c0t1d0s6 /var ufs 1 no logging,noatime,nosuid

Attention: On an x86 system with IDE disks, just copy vfstab to vfstab.newroot. The reason for this is that if the primary IDE disk fails, the system will not boot, so the backup disk will have to replace the master and hence will assume its device address. See the recovery section below for an example.
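For the case where the mirror keeps its own device address (the SPARC example above), the hand edit of vfstab.newroot could be approximated with a small filter. This is a sketch under assumptions: make_newroot_vfstab is a hypothetical helper, it assumes all system file systems live on one primary disk, and the result should still be reviewed by hand, especially the swap line.

```shell
#!/bin/sh
# Hypothetical generator for vfstab.newroot.
# $1 = primary disk (e.g. c0t3d0), $2 = mirror disk (e.g. c0t1d0).
# Reads a vfstab on stdin and writes the result to stdout.
make_newroot_vfstab() {
    # 1. Drop the entries mounted under /newroot (third field is the mount point).
    # 2. Point the remaining /dev/dsk and /dev/rdsk paths at the mirror disk.
    awk '$3 !~ /^\/newroot/' |
        sed "s|/dev/\(r*dsk\)/$1s|/dev/\1/$2s|g"
}
```

A possible invocation for the SPARC example: make_newroot_vfstab c0t3d0 c0t1d0 < /etc/vfstab > /etc/vfstab.newroot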

c) Configure the script /secure/mirror_boot.sh

Set the list of non-root partitions, e.g., targets='/var'
Set the list of mount points, e.g., mounts='/proc /var /var/run /oldroot';
Set the reporting email address: admin='Joe.Bloggs@mycompany.com'
Should an email be sent if there was no problem in the backup? This option is recommended for the first week or two, until confidence is gained.

VERBOSE='1';

Should any important applications be stopped before the backup and restarted afterwards? For example, if a database such as MySQL is running, it is advisable to stop it during the backup.

stop_daemons='sh /etc/rc3.d/S99msql stop';
start_daemons='sh /etc/rc3.d/S99msql start';
# Otherwise, comment out the daemon stop/start lines:
#stop_daemons='';
#start_daemons='';

If running manually, or debugging on the command line, set DEBUG='1'. The results will be printed on the terminal, rather than sent via email. More informative messages are given as to what the script is doing.
If it's not necessary to install a bootblock, set BOOTABLE=0
If certain files do not need to be backed up, set the variable $ignore_these with an appropriate egrep regular expression, for example:

ignore_these='.tmp|^core';
# If we want to ignore tar archives:
ignore_these='.tmp|^core|.gz|.Z';
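To see what such a pattern actually excludes, it can be tried on a sample list of path names first. The sketch below assumes the pattern is applied to one path per line and that matching lines are dropped; how mirror_boot.sh applies it internally is not shown here. filter_paths is a hypothetical name, and "grep -E" is used as the portable spelling of egrep (on older Solaris, use egrep or /usr/xpg4/bin/grep).

```shell
#!/bin/sh
# Assumption: matching path names are removed from the list of files to copy.
ignore_these='.tmp|^core|.gz|.Z'

# Drop every input line that matches the extended regular expression.
filter_paths() {
    grep -E -v "$ignore_these"
}

printf '%s\n' ./etc/passwd ./var/tmp/scratch.tmp ./export/backup.tar.gz core ./usr/bin/ls |
    filter_paths
# prints:
#   ./etc/passwd
#   ./usr/bin/ls
```

Note that an unescaped dot matches any character, so '.tmp' also matches every path containing "/tmp", and '^core' only matches names that start at the beginning of the line (it would not match "./core"). If only a suffix is meant, a stricter pattern such as '\.tmp$' may be preferable.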

d) Do a dry run to check configuration — set DEBUG='1' and run the script. It will not copy over the file systems, but do everything else, so it allows you to check that everything has been set up correctly.

e) First real run — set DEBUG='0' and let it go:

/secure/mirror_boot.sh &

If you run it in background mode, then progress can be followed by running "df -k" now and again.

f) SPARC: Change the EEPROM so that the system will boot from the mirror automatically if the main disk fails.

Assuming the mirror has a devalias called disk1 that points to the physical device (and the primary is disk3), test a manual boot from the mirror first:

boot disk1

Then set the backup boot device:

eeprom boot-device="disk3 disk1"

Test: Switch off/unplug the main disk and reboot.

g) Configure the mirroring to run regularly. For example, a root cron entry for each Wednesday at 4 a.m. would be:

0 4 * * 3 /secure/mirror_boot.sh

Maintenance & Known Problems

The /newroot target is never wiped clean. This has the advantage that files deleted several days (or even months) ago can be recovered, but the disadvantage that the newroot targets are likely to fill up over time (and hence need wiping every few weeks/months). ufsdump could be used rather than cpio to copy the device, but then files deleted before the last backup cannot be recovered. File systems would have the same size also. If you prefer to use ufsdump, it's a pretty simple operation to modify the script.
Tip for cleaning /newroot: If /newroot or one of the other mirror file systems fills up, don't just wipe all files and hope the next mirroring runs fine. Try, for example, deleting all files that were backed up more than 60 days ago:
mount /newroot
cd /newroot
find . -xdev -ctime +60 -type f -exec rm \{\} \;
Unix domain sockets are ignored, since trying to back them up caused errors like the following:
"cpio: Cannot open "./home/dns/var/run/ndc", skipped, errno 122, Operation not supported on transport endpoint"
This shouldn't cause a problem, as applications will create new sockets as needed.
cpio is used with the -pdmu options (pass mode: copy the files listed on stdin, create directories as needed, retain modification times, and copy unconditionally so that newer files on the target are overwritten). An attempt to use the "P" option to preserve Solaris ACLs produced errors like:
"Error with acl() of "usr/dt/dthelp/nls/en_US.UTF-8", errno 2, No such file or directory"
One system complains of the following. It remains an unresolved issue:
Backing up /var....
cpio: Error with lstat() of "ipno ", errno 2, No such file or directory
cpio: Error with lstat() of "grou ", errno 2, No such file or directory
Note: The number of allowed links has been exceeded in this file system. It may be the reason.
If vfstab is changed (e.g. after a new disk is added), then vfstab.newroot needs to be updated too, which the administrator may forget. The solution could be to develop an awk or Perl script to auto-generate vfstab.newroot, or write a script that compares vfstab with vfstab.newroot and notifies the administrator (via a cron job) if the two seem inconsistent.
There is limited protection against media problems — i.e., if the master disk has corrupted files, they will be copied to the mirror. If the secondary medium has problems, we should notice: when copying files, cpio should stop and complain. If the master disk has problems, one would hope to see retry errors in syslog. One possibility would be to have several disks and rotate backups between them, essentially having several full backups of different ages.
Since the backup is not done in single user mode, it's critical that all active applications be stopped, otherwise there is a considerable risk of inconsistency. See the example for stopping MySQL above.
If the system is booted from the backup disk, the mirror script will try to execute itself from cron (as it would on the master): the mirror script was updated in Mar'01 to detect when it is run from the mirror and fail gracefully.
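The comparison of vfstab with vfstab.newroot suggested above could look something like the following. It is a sketch under assumptions, not part of mirror_boot.sh: check_vfstab and normalize are hypothetical names, and the check assumes that vfstab.newroot should equal vfstab with the /newroot entries removed and the primary disk name replaced by the mirror disk name.

```shell
#!/bin/sh
# Hypothetical consistency check between vfstab and vfstab.newroot.

# Drop comments, blank lines and /newroot entries; squeeze whitespace.
normalize() {
    awk '!/^#/ && NF > 0 && $3 !~ /^\/newroot/ { $1 = $1; print }' "$1"
}

# $1 = vfstab, $2 = vfstab.newroot, $3 = primary disk, $4 = mirror disk.
# Prints a diff and returns non-zero if the two files disagree.
check_vfstab() {
    tmp=${TMPDIR:-/tmp}/vfstab_check.$$
    # Map primary device names to mirror device names, then sort for comparison.
    normalize "$1" | sed "s|/dev/\(r*dsk\)/$3s|/dev/\1/$4s|g" | sort > "$tmp.a"
    normalize "$2" | sort > "$tmp.b"
    diff "$tmp.a" "$tmp.b"
    rc=$?
    rm -f "$tmp.a" "$tmp.b"
    return $rc
}
```

Run from cron, a non-zero return from "check_vfstab /etc/vfstab /etc/vfstab.newroot c0t3d0 c0t1d0" could trigger a mail to the administrator. On x86 systems with IDE disks, where the two files are meant to be identical, a plain diff of the two files is enough.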

Recovery

It's all very well doing mirroring or backups, but it's useless if you haven't tested it in a disaster scenario.

1) Solaris SPARC with SCSI disks

Example: "server1" has two SCSI disks. The main one has target 3, the second (target 1) is mirrored each night from the first. If the main disk fails, proceed as follows to boot from the second disk.

a) Go to the "ok" prompt.

b) "boot disk1"

c) Log in as usual, check with "df" that you really are using the backup disk.

d) Disable the mirroring script in the root cron, until the primary disk is working again!
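In case step d) is forgotten, a guard at the top of the script can refuse to run when the system has been booted from the mirror. The change history of this article mentions a marker file (touch /newroot/MIRROR, fail if /MIRROR exists); the sketch below illustrates that idea. The function names and the ROOT and MIRROR_ROOT variables are assumptions, introduced so the logic can be tried in scratch directories.

```shell
#!/bin/sh
# Sketch of a marker-file guard; names are hypothetical.
# On a real system ROOT would be / and MIRROR_ROOT would be /newroot.
ROOT=${ROOT:-/}
MIRROR_ROOT=${MIRROR_ROOT:-/newroot}

# True if the running system is itself a mirror copy.
running_from_mirror() {
    [ -f "$ROOT/MIRROR" ]
}

# After a successful copy, tag the mirror so it can recognise itself later.
tag_mirror() {
    touch "$MIRROR_ROOT/MIRROR"
}

# Return non-zero (with a message) when mirroring must not run.
guard() {
    if running_from_mirror; then
        echo "booted from the mirror disk, refusing to mirror" >&2
        return 1
    fi
    return 0
}
```

A script would call "guard || exit 1" before mounting anything, and tag_mirror once the copy has finished.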

2) Solaris Intel with IDE disks

Solaris on Intel x86 Architecture

Booting a Solaris PC involves several key steps that are important to understand when backing up or restoring boot disks:

a) The PC BIOS decides what devices will be used for booting, in what order and whether certain devices are disabled.

b) The SCSI controller (if there is one) can be configured to insist on its own boot device.

c) The boot disk has now been selected and activated.

d) The Device Configuration Assistant (DCA) detects devices, makes them visible to the boot manager and chooses a root file system. If ESC is pressed when this is booting, devices can be changed as can default boot order, boot disks etc., via a menu-based interface. It is installed in the first few disk cylinders, in a separate partition (/boot).

e) Boot Manager: Allows options to be given to Solaris on booting, or the boot interpreter to be used.

f) Finally, trusty old Solaris starts up.

Example: "server1" has two IDE disks. The main one is the IDE master, the second (IDE slave) is mirrored each night from the first. If the main disk fails, proceed as follows to boot from the second disk. Important: /etc/vfstab must be identical on both disks for the following to work.

a) Switch off the machine, remove the faulty (master) IDE disk, change the jumpers on the slave IDE disk to be a master.

b) Switch on.

c) Confirm BIOS changes (some PCs such as Compaq will detect the disk removal and ask you to confirm the fact that the number of disks detected has changed).

d) Log in as usual, check with "df" that you really are using the backup disk.

e) Disable the mirroring script in the root cron, until the primary disk is working again!

3) Solaris Intel with SCSI disks

Example: "server1" has two SCSI disks. The main one has target 0, the second, target 1, is mirrored each night from the first. If the main disk fails, proceed as follows to boot from the second disk:

a) Stop the PC, switch it back on and allow the BIOS to do its checks.

b) When the SCSI controller is checking the disks, press Ctrl-A to enter the SCSI menu (the command needed to enter the SCSI menu depends on the controller).

c) Change the boot device to SCSI ID 1 (which is the ID of the second disk). Save changes, and exit, and it will probably insist on a reboot.

d) After the SCSI disk check, the "Device Configuration Assistant" (DCA) will start. Press Esc to enter its menu and F2 ("continue") about three times until the list of boot disks is shown. Select the backup disk (target 1) and press continue, then allow the boot process to continue.

e) Log in as usual, check with "df" that you really are using the backup disk.

f) Disable the mirroring script in the root cron, until the primary disk is working again!

g) To tell the DCA to permanently use the secondary disk for booting Solaris, the "setprop bootpath" line in /boot/solaris/bootenv.rc needs to point to the boot partition on the secondary disk. For example, root is on /dev/dsk/c0d1s0. Do "ls -l /dev/dsk/c0d1s0" on this and you'll get the path to the actual device.
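The lookup in step g) can be scripted. The helper below is a sketch: bootpath_of is a hypothetical name, and it assumes the /dev/dsk entry is a symbolic link of the usual form ../../devices/<physical path>. It only prints the path; editing bootenv.rc is left to the administrator.

```shell
#!/bin/sh
# Hypothetical helper: derive the physical device path for "setprop bootpath"
# from a /dev/dsk symbolic link.
# $1 = block device link, e.g. /dev/dsk/c0d1s0
bootpath_of() {
    # Keep only what follows "-> ../../devices" in the ls -l output.
    ls -l "$1" | sed -n 's|.*-> *\.\./\.\./devices||p'
}
```

For example, echo "setprop bootpath $(bootpath_of /dev/dsk/c0d1s0)" prints a candidate line that can be compared with the existing entry in /boot/solaris/bootenv.rc.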

Alternatives and Future Improvements

(Your Suggestions/Patches/Hacks Are Also Welcome)

Fix the known problems :)
Optionally, allow ufsdump to be used instead of cpio (faster, but files deleted on the master are also deleted on the mirror, providing less restore functionality).
If ufsdump is used, it would make sense to take a snapshot of the file system first, with fssnap (Solaris 8 04/01 and later only).
Another alternative would be to use web flash archives. flarcreate (Solaris 8 04/01 and later) is used to create these system images, which can be used to duplicate systems in about 10 minutes. I've used these with JumpStart and found them very fast and useful; perhaps it is possible to use them "manually", with manual restores.
Or use dd, which is a much faster way of cloning disks and no partitions have to be created. Run an fsck on the mirror before mounting it after the dd copy, though. Use a larger blocksize if you have lots of memory.
This is also a quick way of setting up the initial partitions and mount points.
SPARC example:
dd if=/dev/rdsk/c0t3d0p0 of=/dev/rdsk/c0t1d0p0 bs=32768k
dd if=/dev/rdsk/c0t3d0s2 of=/dev/rdsk/c0t1d0s2 bs=32768k
fsck /dev/rdsk/c0t1d0s0
fsck /dev/rdsk/c0t1d0s6

X86 example:
dd if=/dev/rdsk/c3d0s2 of=/dev/rdsk/c3d1s2 bs=32768k
fsck /dev/rdsk/c3d1s0
Automatically adapt "setprop bootpath" in /boot/solaris/bootenv.rc on Solaris Intel machines using SCSI disks?
Find a way to back up Solaris ACLs?
Port it to Linux, BSD or other Unix variants?
A different idea is creating one dedicated cold standby host that is capable of replacing any one of many operational systems. This covers not just disk failures but also entire system failure.
David Schweikert has developed a similar tool to the above:
The difference is that it does partition the mirror disk (scales the partitions if the size isn't exactly the same), formats it, uses ufsdump to copy the data and changes automatically the vfstab. We do not care about older versions of files, since we also do regular backups of the disk on tape...
Sounds interesting indeed — I've not yet tested it. A copy of the well written and documented Perl script "mirror_root" is available here. I don't think Solaris Intel or file systems on separate disks are supported though.
Another article that may give you ideas: Wonders of 'dd' and 'netcat' :: Cloning Operating Systems

Acknowledgements

Thanks to the following readers for their ideas and constructive feedback: Garry Garrett, Pavel, David Schweikert, Neil Collins.

About the Author

Seán Boran is an IT security consultant based in Switzerland and the author of the online IT Security Cookbook.

Change history

08.Mar.01 Update TOC, known problems, Future Improvements, fmthard note.
19.Apr.01 Update: dd, Minor fixes. touch /newroot/MIRROR and if /MIRROR exists, fail.
23.Oct.01 Update: fssnap, flarcreate, init script, fix links since SecurityPortal now dead.
20.Jan.02 Link fix.
20.Jun.03 Tip for cleaning /newroot