Disaster Recovery Using Bacula

General

When disaster strikes, you must have a plan, and you must have prepared in advance; otherwise the work of recovering your system and your files will be considerably greater. For example, if you have not previously saved the partitioning information for your hard disk, how can you properly rebuild it if the disk must be replaced?

Unfortunately, many of the steps one must take before and immediately after a disaster are very operating system dependent. As a consequence, this chapter discusses disaster recovery (also called Bare Metal Recovery) in detail for Linux and Solaris. For Solaris, the procedures are still quite manual. For FreeBSD, the same procedures may be used, but they are not yet fully developed. For Win32, no luck: apparently an "emergency boot" disk allowing access to the full system API without interference does not exist.

Important Considerations

Here are a few important considerations concerning disaster recovery that you should take into account before a disaster strikes.
Steps to Take Before Disaster Strikes
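One preparation step that the introduction singles out is saving your partitioning information while the system is still healthy. The following is a minimal sketch of that idea, not the script shipped with Bacula: the function names save_disk_info and capture and the output file names are hypothetical, and fdisk -l normally needs root to list partition tables. It simply records the output of a few standard utilities in a diskinfo directory.

```shell
#!/bin/sh
# Hypothetical helper, not part of Bacula: record disk layout information
# so that a replaced disk can be repartitioned by hand if necessary.

# capture FILE COMMAND [ARGS...]: store the command's output in
# $DISKINFO_DIR/FILE; a failing command only produces a warning.
capture() {
    name=$1
    shift
    if "$@" > "$DISKINFO_DIR/$name" 2>&1; then
        echo "saved $name"
    else
        echo "warning: '$*' failed, see $DISKINFO_DIR/$name" >&2
    fi
}

# save_disk_info [DIR]: write the reports into DIR (default: ./diskinfo).
save_disk_info() {
    DISKINFO_DIR=${1:-diskinfo}
    mkdir -p "$DISKINFO_DIR" || return 1
    capture fdisk.txt fdisk -l         # partition tables (needs root)
    capture df.txt df -P               # mounted filesystems and their sizes
    capture mount.txt mount            # mount points and filesystem types
    capture fstab.txt cat /etc/fstab   # configured mounts
    return 0
}
```

Run it as root, for example with save_disk_info /root/diskinfo, and keep a copy of the resulting directory somewhere other than the disk it describes.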
Bare Metal Recovery on Linux with a Bacula Rescue CDROM

The remainder of this section concerns recovering a Linux computer, and parts of it relate to the Red Hat version of Linux. The Solaris procedures can be found below under the Solaris Bare Metal Recovery section of this chapter.

If you wish to use a floppy for restoration, please see the chapter Bare Metal Floppy Recovery on Linux with a Bacula Floppy Rescue Disk.

A so-called "Bare Metal" recovery is one where you start with an empty hard disk and restore your machine. There are also cases where you may lose a file or a directory and want it restored; please see the previous chapter for details on those cases.

Bare Metal Recovery assumes that you have the following items for your system:
Restrictions

In addition to the above assumptions, the following conditions or restrictions apply:
Directories

To build the Bacula Rescue CDROM, you will find the necessary scripts in the rescue/linux/cdrom subdirectory of the Bacula source code. If you installed the bacula-rescue rpm package, the scripts will be found in the /etc/bacula/rescue/cdrom directory.

Preparation for a Bare Metal Recovery

Before you can do a Bare Metal recovery, you must create a Bacula Rescue CDROM, which will contain everything you need to begin recovery. This assumes that your Director and Storage daemon are running on a different machine. If you want to recover a machine where the Director and/or the database were previously running, things will be much more complicated.

Creating a Bacula Rescue CDROM

The primary goals of the Bacula rescue CD are:
You should probably make a new rescue CDROM each time you make any major updates to your kernel, and every time you upgrade to a new major version of Bacula.

The whole process, with the exception of burning the CDROM, is done with the following commands:

    (Build a working version of Bacula in the bacula-source directory)
    cd <bacula-source>
    ./configure (your options)
    make
    cd <bacula-source>/rescue/linux/cdrom
    su (become root)
    make all

For users of the bacula-rescue rpm, the static bacula-fd has already been built and placed in /etc/bacula/rescue/cdrom/bin/ along with a symbolic link to your /etc/bacula/bacula-fd.conf file. Rpm users only need to do the second step:

    cd /etc/bacula/rescue/cdrom
    su (become root)
    make all

At this point, if the scripts are successful, they should have done the following things:
    make burn

However, you may need to modify the Makefile to properly specify your CD burner, as the detection process is complicated, especially if you have two CDROMs or do not have cdrecord loaded on your system. Users of the rescue rpm package should definitely examine the Makefile, since it was configured on the host used to produce the rpm package. If you find that make burn does not work for you, try doing a:

    make scan

and use the output of that to modify the Makefile accordingly.

The "make all" that you did above actually does the equivalent of the following:

    make kernel
    make binaries
    make bacula
    make iso

If you wish, you can modify what you put on the CDROM and redo any part of the make. For example, if you want to add a new directory, you might do the first three makes, then add the new directory to the CDROM, and finally do a "make iso". Please see the README file in the rescue/linux/cdrom or /etc/bacula/rescue/cdrom directory for instructions on changing the contents of the CDROM.

At the current time, the size of the CDROM is about 50MB (compressed to about 20MB), so there is quite a bit more room for additional programs. Keep in mind that when this CDROM is booted, *everything* is in memory, so the total size cannot exceed your memory size, and even then you will need some reserve memory for running programs.

Putting Two or More Systems on Your Rescue Disk

You can put multiple systems on the same rescue CD if you wish. This is because the information that is specific to your OS will be stored in the /bacula-hostname directory, where hostname is the name of the host on which you are building the CD. Suppose, for example, you have two systems, one named client1 and one named client2.
Assume also that your CD burner is on client1, that client1 is the machine we start on, that we can ssh into client2, and that client2's disks are mounted on client1.

    ssh client2
    cd <bacula-source>
    ./configure (your options)
    make
    cd rescue/linux/cdrom
    su (enter root password)
    make bacula
    exit
    exit

Again, for rpm package users the above command set would be:

    ssh client2
    cd /etc/bacula/rescue/cdrom
    su (enter root password)
    make bacula
    exit
    exit

Thus we have just built a Bacula rescue directory on client2. Now, on client1, we copy the appropriate directory to two places (explained below), then build an ISO and burn it:

    cd <bacula-source>
    ./configure (your options)
    make
    cd rescue/linux/cdrom
    su (enter root password)
    c=/mnt/client2/home/user/bacula/rescue/linux/cdrom
    cp -a $c/roottree/bacula-client2 roottree
    cp -a $c/roottree/bacula-client2 cdtree
    make all
    make burn
    exit

And with the rpm package:

    cd /etc/bacula/rescue/cdrom
    su (enter root password)
    c=/mnt/client2/etc/bacula/rescue/cdrom
    cp -a $c/roottree/bacula-client2 roottree
    cp -a $c/roottree/bacula-client2 cdtree
    make all
    make burn
    exit

In summary, with the above commands, we first built a Bacula directory on client2 in roottree/bacula-client2. We then copied the bacula-client2 directory into client1's roottree so that it is available in memory after booting, and also into the cdtree so that it is on the CD as a separate directory and can be read without booting the CDROM. Finally, we made and burned the CDROM for client1, which of course contains the client2 data.

Restoring a Client System

Now, let's assume that your hard disk has just died and that you have replaced it with a new identical drive. In addition, we assume that you have:
You will take the following steps to get your system back up and running:
Boot with your Bacula Rescue CDROM

When the CDROM boots, you will be presented with a screen that looks like:

    Welcome to the Bacula Rescue Disk 1.1.0
    To proceed, press the <ENTER> key or type "linux <runlevel>"

      linux 1     -> shell
      linux 2     -> login (default if ENTER pressed)
      linux 3     -> network started and login (network not working yet)
      linux debug -> print debug during boot then login

Normally, at this point, you simply press ENTER. However, you may supply options for the boot if you wish.

Once it has booted, you will be requested to log in, something like:

    Welcome to the Bacula Rescue CDROM
    2.4.21-15.0.4.EL #1 Wed Aug 4 03:08:03 EDT 2004
    Please login using root and your root password ...
    RescueCD login:

Note, you must enter the root password for the system on which you loaded the kernel or on which you did the build of the CDROM. Once you are logged in, you will be in the home directory for root, and you can proceed to examine your system.

The complete Bacula rescue part of the CD will be in the directory /bacula-hostname, where hostname is replaced by the name of the host machine on which you did the build for the CDROM. This naming procedure allows you to put multiple restore environments, one for each of your machines, on a single CDROM if you wish. Please see the README document in the rescue/linux/cdrom directory for more information on adding to the CDROM.

Start the Network

At this point, you should bring up your network. Normally, this is quite simple and requires just a few commands. Please cd into the /bacula-hostname directory before continuing. To simplify your task, we have created a script that should work in most cases by typing:

    cd /bacula-hostname
    ./start_network

You can test it by pinging another machine, or pinging your broken machine from another machine. Do not proceed until your network is up.
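If you prefer not to retype ping by hand while the network comes up, the check can be wrapped in a small retry loop. This is only a sketch and not part of the rescue CD: the function name wait_for_host and the PROBE override variable are hypothetical, and it assumes the ping on your rescue system accepts the -c option.

```shell
#!/bin/sh
# Hypothetical helper: retry a reachability probe until it succeeds
# or the attempts run out.
# Usage: wait_for_host HOST [TRIES] [DELAY_SECONDS]
# PROBE may name a different probe command; by default one ping is sent.
wait_for_host() {
    host=$1
    tries=${2:-10}
    delay=${3:-3}
    probe=${PROBE:-ping -c 1}
    i=1
    while [ "$i" -le "$tries" ]; do
        # $probe is deliberately unquoted so "ping -c 1" splits into words
        if $probe "$host" >/dev/null 2>&1; then
            echo "network up: $host answered on attempt $i"
            return 0
        fi
        if [ "$i" -lt "$tries" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    echo "network down: $host did not answer after $tries attempts" >&2
    return 1
}
```

For example, wait_for_host followed by the address of your Director's machine returns success as soon as one ping is answered, and fails after ten unanswered attempts by default.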
Partition Your Hard Disk(s)

Assuming that your hard disk crashed and needs repartitioning, proceed with:

    ./partition.hda

If you have multiple disks, do the same for each of them. For SCSI disks, the repartition script will be named partition.sda. If the script complains about the disk being in use, simply go back and redo the df and umount commands until your hard disk is no longer mounted. Note, in many cases, if your hard disk was seriously damaged or a new one installed, it will not automatically be mounted. If it is mounted, it is because the emergency kernel found one or more possibly valid partitions.

If for some reason this procedure does not work, you can use the information in partition.hda to re-partition your disks by hand using fdisk.

Format Your Hard Disk(s)

If you have repartitioned your hard disk, you must format it appropriately. The formatting script will put back swap partitions, normal Unix partitions (ext2) and journaled partitions (ext3) as well as Reiser partitions (rei). Do so by entering for each disk:

    ./format.hda

The format script will ask you if you want a block check done. We recommend answering yes, but realize that for very large disks this can take hours.

Mount the Newly Formatted Disks

Once the disks are partitioned and formatted, you can remount them with the mount_drives script. All your drives must be mounted for Bacula to be able to access them. Run the script as follows:

    ./mount_drives
    df

The df command will tell you if the drives are mounted. If not, re-run the script. It isn't always easy to figure out and create the mount points and the mounts in the proper order, so repeating the ./mount_drives command will not cause any harm and will most likely work the second time. If not, correct it by hand before continuing.

Restore and Start the File Daemon

If you have booted with a Bacula Rescue CDROM, your statically linked Bacula File daemon and the bacula-fd.conf file will be in the /bacula-hostname/bin directory.
Make sure bacula-fd and bacula-fd.conf are both there.

Edit the Bacula configuration file, create the working/pid/subsys directory if you haven't already done so above, and start Bacula. Before starting Bacula, you will need to move it and bacula-fd.conf from /bacula-hostname/bin to the /mnt/disk/tmp directory so that it will be on your hard disk. Then start it with the following command:

    chroot /mnt/disk /tmp/bacula-fd -c /tmp/bacula-fd.conf

The above command starts the Bacula File daemon with the proper root disk location (i.e. /mnt/disk/tmp). If Bacula does not start, correct the problem and start it again. You can check if it is running by entering:

    ps fax

You can kill Bacula by entering:

    kill -TERM <pid>

where pid is the first number printed in front of the first occurrence of bacula-fd in the ps fax output.

Now, you should be able to use another computer with Bacula installed to check the status by entering:

    status client=xxxx

into the Console program, where xxxx is the name of the client you are restoring.

One common problem is that your bacula-dir.conf may contain machine addresses that are not properly resolved on the stripped down system to be restored because it is not running DNS. This is particularly true for the address in the Storage resource of the Director, which may be very well resolved on the Director's machine, but not on the machine being restored and running the File daemon. In that case, be prepared to edit bacula-dir.conf to replace the Storage daemon's domain name with its IP address.

Restore Your Files

On the computer that is running the Director, you now run a restore command and select the files to be restored (normally everything), but before starting the restore, there is one final change you must make: change the Where directory to be the root by using the mod option just before running the job and selecting Where. Set it to:

    /

then run the restore.
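Going back one step, the File daemon start-up described earlier in this procedure (check that both files exist, place them under the mounted disk, then run the daemon under chroot) can be sketched as a small helper. This is an illustration under assumptions, not a script from the rescue CD: the name stage_fd is hypothetical, it copies rather than moves the files, and it only prints the chroot command so that you can review it before running it as root.

```shell
#!/bin/sh
# Hypothetical helper: stage the File daemon and its configuration
# under the new root's tmp directory.
# Usage: stage_fd SOURCE_BIN_DIR NEW_ROOT
stage_fd() {
    src=$1
    root=$2
    # Both the daemon and its configuration must be present.
    for f in bacula-fd bacula-fd.conf; do
        if [ ! -f "$src/$f" ]; then
            echo "missing: $src/$f" >&2
            return 1
        fi
    done
    mkdir -p "$root/tmp" || return 1
    cp -p "$src/bacula-fd" "$src/bacula-fd.conf" "$root/tmp/" || return 1
    chmod 755 "$root/tmp/bacula-fd" || return 1
    # Print, rather than run, the start command from the text above.
    echo "chroot $root /tmp/bacula-fd -c /tmp/bacula-fd.conf"
}
```

With the paths used in this chapter, the call would be stage_fd /bacula-hostname/bin /mnt/disk, with hostname replaced as described above.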
You might be tempted to avoid using chroot, run Bacula directly, and then use a Where to specify a destination of /mnt/disk. This is possible; however, the current version of Bacula always restores files to the new location, and thus any soft links that have been specified with absolute paths will end up with /mnt/disk prefixed to them. In general this is not fatal to getting your system running, but be aware that you will have to fix these links if you do not use chroot.

Final Step

At this point, the restore should have finished with no errors, and all your files will be restored. One last task remains, and that is to write a new boot sector so that your machine will boot. For lilo, you enter the following command:

    ./run_lilo

If you are using grub instead of lilo, you must enter the following:

    ./run_grub

Note, I've had quite a number of problems with grub because it is rather complicated and not designed to install easily under a simplified system. So, if you experience errors or end up unexpectedly in a chroot shell, simply exit back to the normal shell and type in the appropriate commands from the run_grub script by hand until you get it to install. When you run the run_grub script, it will print the commands that you should manually enter if that is necessary.

Reboot

First unmount all your hard disks, otherwise they will not be cleanly shut down, then reboot your machine by entering exit until you get to the main prompt, then enter ctl-d. Once back at the main CDROM prompt, you will need to turn the power to your machine off and then back on to get it to reboot.

If everything went well, you should now be back up and running. If not, re-insert the emergency boot CDROM, boot, and figure out what is wrong.

Restoring a Server

Above, we considered how to recover a client machine where a valid Bacula server was running on another machine. However, what happens if your server goes down and you no longer have a running Director, Catalog, or Storage daemon?
There are several solutions:
The second suggestion is probably a much simpler solution, and one I have done myself. To do so, you might want to consider the following steps:
Linux Problems or Bugs

Since every flavor and every release of Linux is different, there are likely to be some small difficulties with the scripts, so please be prepared to edit them in a minimal environment. A rudimentary knowledge of vi is very useful. Also, these scripts do not do everything. You will need to reformat Windows partitions by hand, for example.

Getting the boot loader back can be a problem if you are using grub because it is so complicated. If all else fails, reboot your system from your floppy but using the restored disk image, then proceed to a reinstallation of grub (looking at the run_grub script can help). By contrast, lilo is a piece of cake.

FreeBSD Bare Metal Recovery

The same basic techniques described above also apply to FreeBSD. Although we don't yet have a fully automated procedure, Alex Torres Molina has provided us with the following instructions with a few additions from Jesse Guardiani and Dan Langille:
Solaris Bare Metal Recovery

The same basic techniques described above apply to Solaris:
Preparing Solaris Before a Disaster

As mentioned above, before a disaster strikes, you should prepare the information needed in case of problems. To do so, in the rescue/solaris subdirectory enter:

    su
    ./getdiskinfo
    ./make_rescue_disk

The getdiskinfo script will, as in the case of Linux described above, create a subdirectory diskinfo containing the output from several system utilities. In addition, it will contain the output from the SysAudit program as described in Curtis Preston's book. The file diskinfo/sysaudit.bsi will contain the disk partitioning information that will allow you to manually follow the procedures in the "Unix Backup & Recovery" book to repartition and format your hard disk. In addition, the getdiskinfo script will create a start_network script.

Once you have your disks repartitioned and formatted, do the following:
Bugs and Other Considerations

Directory Modification and Access Times are Modified on pre-1.30 Baculas

When a pre-1.30 version of Bacula restores a directory, it first must create the directory, then it populates the directory with its files and subdirectories. The act of creating the files and subdirectories updates both the modification and access times associated with the directory itself. As a consequence, all modification and access times of all directories will be updated to the time of the restore.

This has been corrected in Bacula version 1.30 and later. The directory modification and access times are reset to the values saved in the backup after all the files and subdirectories have been restored. This has been tested and verified on normal restore operations, but not verified during a bare metal recovery.

Strange Bootstrap Files

If you look closely at the bootstrap file that is produced and used for the restore (I sure do), you will probably notice that the FileIndex item does not include all the files saved to the tape. This is because in some instances there are duplicates (especially in the case of an Incremental save), and in such circumstances, Bacula restores only the last of multiple copies of a file or directory.

Disaster Recovery of Win32 Systems

Due to open system files and registry problems, Bacula cannot save and restore a complete Win2K/XP/NT environment.
A suggestion by Damian Coutts
To restore the system state, you first reload a base operating system, then
use Bacula to restore all the users' files and to
recover the c:\systemstate.bkf file, and finally, run NTBackup,
catalogue the system state file, and then select it for restore.
The documentation says you can't run a command line restore of the systemstate.
This procedure has been confirmed to work by Ludovic Strappazon -- many thanks!
A new tool is provided in the form of a Bacula plugin for the BartPE rescue CD.
BartPE is a self-contained Windows XP boot CD which you can make using the PeBuilder
tools available at http://www.nu2.nu/pebuilder/
and a valid Windows XP SP1 CDROM. The plugin is provided as a zip archive. Unzip the
file and copy the bacula directory into the plugin directory of your BartPE installation.
Edit the configuration files to suit your installation and build your CD according to the
instructions at Bart's site. This will permit you to boot from the CD, configure and
start networking, start the Bacula File client, and access your Director with the Console
program. The programs menu on the booted CD contains entries to install the File client
service, start the File client service, and start the WX-Console. You can also open a
command line window, cd to Programs\Bacula, and run the command line console bconsole.
You can find quite a few additional resources, both commercial and
free at Storage Mountain, formerly
known as Backup Central.
And finally, the O'Reilly book, "Unix Backup & Recovery" by
W. Curtis Preston covers virtually every backup and recovery topic including
bare metal recovery for a large range of Unix systems.