Unattended rdiff-backup HOWTO

$Id: unattended.html,v 1.15 2007/01/15 08:21:36 dean Exp $

This page describes how to set up rdiff-backup to run, as a non-root user, unattended from a crontab. We will utilize features of rdiff-backup and OpenSSH to secure the setup as much as possible.

Unattended Backups 101

As of this writing, I use rdiff-backup from CVS, but 0.13.6 should do the job fine. (You're welcome to use my pre-built bleeding-edge .deb files.) I'm also using OpenSSH 4.1p1, but the examples work with many earlier versions.

The general model I use is to initiate all rdiff-backups from a central backup server, and pull the data from the hosts to be backed up. The central backup server uses a non-root user to perform the backups -- this relies on metadata features of recent rdiff-backup in order to support proper restores, and has the benefit that rdiff-backup exploits/bugs have reduced potential to damage the backup server. The backup still requires root on the host being backed up, but it is protected by ssh mechanisms which restrict the invoked command, and rdiff-backup mechanisms which restrict it to read-only access.

For convenience I'll call the backup server kitty and the host to be backed up fishie.

In the transcripts below, what you type follows a shell prompt (% or #) or an ssh-keygen question; everything else is program output.

  1. On the backup server kitty, create a new account which will be used to perform the backup. I'll use the account name backup. The shell can typically be set to /bin/false. In my case the home directory is set to /backup which is where I've mounted the filesystem containing all my backups. The account password should be disabled. For example you might have the following entries in your passwd/shadow files:

    /etc/passwd
    backup:x:34:34:backup:/backup:/bin/false
    /etc/shadow
    backup:*:12644:0:99999:7:::

    Your uid/gid may differ, as may many of the fields in shadow.

    Note that if you're backing up multiple hosts, for an extra layer of paranoia you could create an account per host.
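
If you'd rather not edit passwd and shadow by hand, something like the following should create an equivalent account. This is only a sketch: it assumes a Linux system with the shadow-utils useradd, which by default leaves the password disabled unless one is set, and the function name create_backup_account is made up for illustration.

```shell
#!/bin/sh
# Sketch: create the unprivileged backup account on kitty. Run as root.
# Assumes Linux shadow-utils; flags may differ on other systems.
create_backup_account() {
	# $1 = account name, $2 = home directory holding the backups
	useradd -d "$2" -s /bin/false "$1" &&
	chown "$1" "$2"   # the account must be able to write its backups
}

# Usage (as root on kitty):
#   create_backup_account backup /backup
```

The home directory is not created here because in the example above /backup is already a mounted filesystem; only its ownership is adjusted.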

  2. Create a passphrase-free ssh key on kitty for fishie:
    	kitty% su
    	root@kitty# su -m backup
    	backup@kitty% ssh-keygen -t rsa
    	Generating public/private rsa key pair.
    	Enter file in which to save the key (/backup/.ssh/id_rsa): /backup/.ssh/id_rsa_fishie_backup
    	Enter passphrase (empty for no passphrase):
    	Enter same passphrase again:
    	Your identification has been saved in /backup/.ssh/id_rsa_fishie_backup.
    	Your public key has been saved in /backup/.ssh/id_rsa_fishie_backup.pub.
    	The key fingerprint is:
    	e0:fc:4a:8a:51:a8:c7:3a:e4:3a:3c:22:f9:4e:35:ca backup@kitty

    Your key fingerprint will almost certainly differ from the example here.

    Note that I've chosen to name the file with "fishie" in it -- this is to emphasize that you can have a key for each host you wish to back up. However, it's not necessary to use separate keys -- it's only necessary that you have a key dedicated to the purpose of doing a backup.
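
The same key can be generated without answering any prompts, which is handy if the setup is scripted for several hosts. A minimal sketch: make_backup_key is a hypothetical helper name, and the -N '' argument is what makes the key passphrase-free.

```shell
#!/bin/sh
# Sketch: generate the passphrase-free key from step 2 without prompts.
# -q is quiet, -N '' sets an empty passphrase, -C sets the key comment,
# -f names the private key file (the public key lands next to it as .pub).
make_backup_key() {
	# $1 = path of the private key to create, $2 = key comment
	keyfile=$1
	mkdir -p "$(dirname "$keyfile")" &&
	chmod 700 "$(dirname "$keyfile")" &&
	ssh-keygen -q -t rsa -N '' -C "$2" -f "$keyfile"
}

# Usage (as backup@kitty):
#   make_backup_key /backup/.ssh/id_rsa_fishie_backup backup@kitty
```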

  3. Create an ssh config alias which defines how to contact fishie with the backup key. Place the following into /backup/.ssh/config:
    
    	host fishie-backup
    		hostname fishie
    		user root
    		identityfile /backup/.ssh/id_rsa_fishie_backup
    		compression yes
    		protocol 2

    Note that "compression yes" is optional, and you may wish to omit it if kitty and fishie are connected over a high-speed network. A cipher line is likewise optional (none is shown above), but choosing a cheaper cipher may reduce CPU overhead. (On a trusted switched network, or over localhost, you may also wish to patch OpenSSH to enable cipher none.)

    This config entry enables backup@kitty to use the "hostname" fishie-backup wherever ssh expects a real hostname. ssh will use the information specified in the config file, which will result in a connection to fishie, using the specified key, compression, and protocol, plus a cipher if you set one.

    Depending on your system, you may need to make some file permission adjustments:

    	backup@kitty% chmod -R go-rwx /backup/.ssh
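
To see what ssh will actually do with the alias, newer OpenSSH releases (6.8 and later) can print the resolved configuration with ssh -G. That option does not exist in the 4.1p1 mentioned above, so treat this as an optional check; show_alias is a made-up helper name.

```shell
#!/bin/sh
# Sketch: print the settings ssh resolves for a config alias.
# Requires an OpenSSH client that supports -G; nothing is connected to.
show_alias() {
	# $1 = ssh config alias, e.g. fishie-backup
	ssh -G "$1" | grep -E '^(hostname|user|identityfile|compression) '
}

# Usage (as backup@kitty):
#   show_alias fishie-backup
```

If the hostname, user, or identityfile lines do not match what you put in /backup/.ssh/config, ssh is probably reading a different config file or running as a different user.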

  4. Give permission for backup to access fishie and run rdiff-backup.

    You need the public portion of the key you just generated on kitty:

    	kitty# cat /backup/.ssh/id_rsa_fishie_backup.pub
    	ssh-rsa AAAAB3NzaC1yc2EAAAAB[...] backup@kitty

    Your actual key will be a lot longer than (and completely different from) this example.

    Assuming that root@fishie's home directory is /root, we will construct a terribly long line in the file /root/.ssh/authorized_keys2 on fishie (newer OpenSSH releases prefer /root/.ssh/authorized_keys). The line is so long that I'm going to break it in two here, for demonstration purposes only. You must join the first line and the public key from above on one line, with only a space between them:

    
    command="rdiff-backup --server --restrict-read-only /",from="kitty",no-port-forwarding,no-X11-forwarding,no-pty
    ssh-rsa AAAAB3NzaC1yc2EAAAAB[...] backup@kitty

    This entry in /root/.ssh/authorized_keys2 permits anyone with the specified key (i.e. backup@kitty) to connect with ssh from the host named kitty and issue the forced rdiff-backup command. It further restricts the ssh connection to eliminate port forwarding, X11 forwarding and a pty. The rdiff-backup invocation is also restricted to read-only operations starting from the root of the file system.
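
Joining the two parts by hand invites stray newlines, so it may be easier to let the shell assemble the line. A minimal sketch, assuming the public key file has already been copied to fishie; make_authorized_keys_line is a hypothetical helper, and the options are exactly the ones shown above.

```shell
#!/bin/sh
# Sketch: build the single authorized_keys line from its two parts.
make_authorized_keys_line() {
	# $1 = public key file, $2 = allowed client host, $3 = forced command
	opts="command=\"$3\",from=\"$2\",no-port-forwarding,no-X11-forwarding,no-pty"
	# Options and key must sit on one line, separated by a single space.
	printf '%s %s\n' "$opts" "$(cat "$1")"
}

# Usage (as root on fishie):
#   make_authorized_keys_line id_rsa_fishie_backup.pub kitty \
#       'rdiff-backup --server --restrict-read-only /' >> /root/.ssh/authorized_keys2
```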

    NOTE: rdiff-backup 0.13.4 fails to support "--restrict-read-only /" without a patch. It works fine with sub-paths (e.g. /home), but you'll need my patch to back up from the root of the filesystem. If you'd prefer not to patch rdiff-backup then you can skip the "--restrict-read-only /" parameters -- it is up to you how paranoid you wish to be.

    If you have any trouble, this step is the most likely culprit.
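
As a starting point for troubleshooting, a check along these lines can help separate ssh problems from rdiff-backup problems. It is a sketch under assumptions: it relies on rdiff-backup's --test-server option and on ssh's -v and BatchMode settings, and check_backup_link is a made-up name. Recent rdiff-backup releases spell the server test differently, so consult your man page.

```shell
#!/bin/sh
# Sketch: test the ssh + forced-command path without doing a real backup.
check_backup_link() {
	# $1 = ssh alias from ~/.ssh/config, $2 = a path readable on the remote side
	# -v makes ssh report on stderr which key it offers and why it fails.
	# BatchMode=yes makes ssh give up instead of prompting, so a key problem
	# shows up as an error rather than a password prompt. It may also fail
	# until the host key has been accepted interactively (see step 5).
	rdiff-backup --remote-schema 'ssh -v -o BatchMode=yes %s rdiff-backup --server' \
		--test-server "$1::$2"
}

# Usage (as backup@kitty):
#   check_backup_link fishie-backup /tmp
```

The sshd log on fishie is worth reading alongside the output, since it usually says why a key or a from= restriction was rejected.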

  5. Perform a test backup and populate known_hosts.

    You should now be able to perform a test backup. During this test ssh will probably ask you to accept the fishie host key -- you will need to complete this step before you can begin an unattended backup.

    	backup@kitty% cd /backup
    	backup@kitty% rdiff-backup fishie-backup::/tmp test-backup

    If you are asked for a password or passphrase then something is wrong. Other than asking you to verify the host key it should succeed in performing a backup of fishie::/tmp in test-backup.

    Assuming the first attempt asked you to verify the host key, run the test a second time to verify that it asks you nothing.

  6. Create a cron job on kitty to initiate your backup (e.g. with crontab -e -u backup):
    	1 1 * * * rdiff-backup fishie-backup::/ /backup/fishie
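
If a backup can run longer than the interval between cron runs, a small wrapper can keep two runs from overlapping. The following is a sketch, not part of the setup above: run_backup and the lock directory name are assumptions, and a lock left behind by a killed run has to be removed by hand.

```shell
#!/bin/sh
# Sketch of a cron wrapper: skip the run if the previous one is still going.
run_backup() {
	# $1 = ssh alias, $2 = destination directory on kitty
	lock="$2.lock"
	# mkdir is atomic, so it doubles as a simple lock.
	if ! mkdir "$lock" 2>/dev/null; then
		echo "backup of $1 already running (lock $lock exists)" >&2
		return 1
	fi
	rdiff-backup "$1::/" "$2"
	status=$?
	rmdir "$lock"
	return "$status"
}

# When installed as a script, pass the alias and destination as arguments.
if [ $# -eq 2 ]; then
	run_backup "$@"
fi
```

Saved under a path of your choosing (say /backup/bin/run-backup, a hypothetical location), the crontab line would call it with fishie-backup and /backup/fishie as its two arguments in place of the direct rdiff-backup invocation.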

chroot Paranoia

It's possible to add even more paranoia on the backup server by placing the backup into a chroot or a jail of sorts. I'm generally too lazy to do this, so I'll just leave it as a suggestion for further investigation.

localhost Backups

The above technique works fine for doing backups to a non-root user on the same host. In this case you might as well use localhost as the hostname in all instances. If you're like me, you'll find the ssh encryption overhead to be a complete waste on localhost. You could consider my patch to enable cipher none, but my preference is to use sudo:

Create a sudoers entry like the following:

	backup localhost = NOPASSWD: /usr/bin/rdiff-backup --server --restrict-read-only /

Then as user backup invoke rdiff-backup as follows:

	backup@kitty% rdiff-backup --remote-schema '%s' \
		'sudo /usr/bin/rdiff-backup --server --restrict-read-only /'::/tmp test-backup

Note the --remote-schema and the fake hostname result in the invocation of sudo with the desired arguments.

You'll almost certainly need to use some --exclude options for this to avoid backing up your backup directory.
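
A full localhost backup with excludes might look like the following sketch. It assumes the backups live under /backup, as in the earlier examples; excluding /proc and /sys is an extra assumption for Linux hosts, and localhost_backup is a made-up name.

```shell
#!/bin/sh
# Sketch: back up / on the local host via sudo, skipping the backup tree.
localhost_backup() {
	# $1 = destination directory, assumed here to live under /backup
	rdiff-backup --remote-schema '%s' \
		--exclude /backup --exclude /proc --exclude /sys \
		'sudo /usr/bin/rdiff-backup --server --restrict-read-only /'::/ "$1"
}

# Usage (as backup@kitty):
#   localhost_backup /backup/localhost
```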

Snapshot Backups

If you are using a volume manager, such as LVM, which supports snapshot volumes (or a filesystem which supports snapshot filesystems) then you can produce a nice stable image for rdiff-backup without dropping to single-user mode. The only trick is creating and destroying the snapshot at the right time.

I do this by specifying a script instead of rdiff-backup in authorized_keys2: command="/root/lib/snapback",...

Note that when command="..." is given, ssh will execute the specified command and ignore whatever command the connecting host had attempted (i.e. ssh will ignore the request for "rdiff-backup --server ...").
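
The ignored request is not lost: sshd passes it to the forced command in the SSH_ORIGINAL_COMMAND environment variable, so a wrapper script can log or check it. A sketch, assuming the client asks for "rdiff-backup --server" as older rdiff-backup releases do by default; check_original_command is a hypothetical helper, and it writes only to stderr.

```shell
#!/bin/sh
# Sketch: refuse anything that does not look like an rdiff-backup request.
check_original_command() {
	case "${SSH_ORIGINAL_COMMAND:-}" in
		"rdiff-backup --server"*)
			return 0 ;;
		*)
			# stderr only: stdin/stdout belong to the rdiff-backup protocol
			echo "refusing request: ${SSH_ORIGINAL_COMMAND:-<none>}" >&2
			return 1 ;;
	esac
}

# Usage, near the top of the forced-command script:
#   check_original_command || exit 1
```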

The script needs to avoid touching stdin/stdout, but whatever it prints on stderr comes back through the rdiff-backup stderr. Here is an example /root/lib/snapback for LVM1:

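	#!/bin/sh
	# Forced command for the backup key: snapshot, serve rdiff-backup, clean up.

	export PATH=/usr/bin:/bin:/usr/sbin:/sbin

	# Setup: stdout goes to stderr and stdin is /dev/null, because the real
	# stdin/stdout carry the rdiff-backup protocol.
	(
		if ! lvcreate --size 8G --snapshot --name snap /dev/my_vg/my_lv; then
			exit 1
		fi

		if ! mount -o ro /dev/my_vg/snap /mnt/snap; then
			# Don't leave an unmounted snapshot behind.
			lvremove -f /dev/my_vg/snap
			exit 1
		fi
	) 1>&2 </dev/null || exit 1

	# Serve the snapshot, read-only, over the ssh connection.
	rdiff-backup --server --restrict-read-only /mnt/snap

	# Teardown: report snapshot usage, then remove the snapshot.
	(
		umount /mnt/snap
		lvdisplay /dev/my_vg/snap
		lvremove -f /dev/my_vg/snap
	) 1>&2 </dev/null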

There's a lot in the script that should be customized; I won't go into the details here, so see the man pages. Note that I use lvdisplay to monitor how much free snapshot space is left at the end of my backups (it all appears in the crontab output).


dean gaudet -- you can probably guess my email address