The Hidden Blog

As it turns out, I do have a hostname.

Automated and encrypted backups with duplicity

Published on Dec 02, 2014

If you don't want to end up like the plane in the header image (Jeff Sheldon, Unsplash), you should make sure you have a good backup and restore strategy.

The goal of this short guide is to have automated, encrypted and incremental backups from one server to another remote server. To achieve this we are going to use duplicity and its simplified wrapper duply. There are more detailed guides out there, but these are the steps I used and they work for me. The sources used for this guide are linked at the bottom.

Backup master

This is the machine that will store all the backups from the various remote servers. I added a new user called backup for this purpose. If you don't trust the other servers, use a different user for each server.

useradd -m -G users,wheel,audio -s /bin/zsh backup
passwd backup

All the backups will be stored in its home directory. Create the following directory there and remember the path; we'll need it at a later stage. Replace <server name> with something like the hostname of the server you want to back up, so it's easier to figure out which backup is stored where if you are using multiple servers.

A good example would be: /home/backup/incoming-backups/example.com

/home/backup/incoming-backups/<server name>
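Creating that directory could look like the following minimal sketch. The helper name make_incoming_dir and the 700 permission mode are my own assumptions and not part of the original setup, so adjust them to your environment.

```shell
# Sketch: create the incoming directory for one server below a given root.
# make_incoming_dir is a hypothetical helper; mode 700 is an assumption.
make_incoming_dir() {
    incoming_root=$1   # e.g. /home/backup/incoming-backups
    server_name=$2     # e.g. example.com
    mkdir -p "$incoming_root/$server_name" &&
        chmod 700 "$incoming_root/$server_name"
}

# Usage (as the backup user on the master):
#   make_incoming_dir /home/backup/incoming-backups example.com
```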

Backup slave

This is one of the many remote servers we want to back up to the master server.

Install dependencies

The first step is to install the dependencies. There's no stable version available for Gentoo at the moment so you'll have to unmask the latest version by adding app-backup/duply ~amd64 to your package.keywords file. Once this is done just install it.

emerge -av app-backup/duply app-backup/duplicity

If you are using Ubuntu/Debian a simple apt-get install duply should also grab the other dependencies.
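To double check that everything landed on your PATH, something like this sketch could help. The helper name missing_tools is made up for this example.

```shell
# Sketch: print every given command that cannot be found on PATH.
# missing_tools is a hypothetical helper, not part of duply.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Usage: an empty result means everything was found.
#   missing_tools duply duplicity gpg
```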

First Steps

Type duply <server name> create to initialize a new backup. You should probably run this as the root user if you want to back up directories only accessible by root. The <server name> is a placeholder for whatever you want to call your backup set. I usually just use the hostname with no spaces, dots or any special characters.

This will create a new directory in your home directory containing two files: conf and exclude.

/root/.duply/<server name>/
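If you want to derive the profile name from the hostname automatically, a small sketch like this one strips everything except letters and digits. The function name profile_name is hypothetical, and dropping hyphens too is my own choice.

```shell
# Sketch: turn a hostname into a profile name by keeping only letters and digits.
# profile_name is a hypothetical helper.
profile_name() {
    printf '%s' "$1" | LC_ALL=C tr -cd 'A-Za-z0-9'
}

# Usage:
#   duply "$(profile_name "$(hostname)")" create
```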

Generate GPG Key

Because we want to encrypt all our backups so we can store them on an untrusted host, we need to create a new GPG key. To do that just run gpg --gen-key and accept the default options except the key length, which I usually set to 4096.

You'll probably see a message like this telling you to generate new entropy:

Not enough random bytes available. Please do some other work to give
the OS a chance to collect more entropy!

In that case you could run some commands on the server, install updates or just emerge sys-apps/rng-tools and generate some new entropy with rngd -r /dev/urandom. This should usually do the trick.

While generating the key, gpg will ask you for some information like "Full Name", "Email" and "Comment". In my case I use the name of the backup set for the full name, my regular email and the FQDN of the server as the comment. This will make it easier to find the private key for the server later on.

Don't forget to write down / store the passphrase somewhere safe. We are going to need it for the following step.

Configuration

Now it's time to edit the conf file to fit our needs. There are a lot of comments in that file explaining the various options, so I'm just going to go over my settings without explaining all of them.

Open the conf file and add / uncomment the following values:

GPG_KEY & GPG_PW

If you don't remember your key ID just run gpg --list-secret-keys to get a list of all the secret keys in your keychain. The output will look something like this:

sec   4096R/XXXXXXXX 2014-11-30
uid                  dewey <mail@example.com>
ssb   4096R/YYYYYYYY 2014-11-30

The X'ed value is the GPG_KEY you are looking for, the passphrase is the one you wrote down earlier.

GPG_KEY='XXXXXXXX'
GPG_PW='YOURPASSPHRASE'
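If you have several keys and want to pull the IDs out of that listing, a rough sketch with awk could look like this. It assumes the older "sec   4096R/XXXXXXXX date" layout shown above (newer GnuPG versions print a different layout), and extract_key_ids is a made-up name.

```shell
# Sketch: read `gpg --list-secret-keys` output on stdin and print the key ID
# from every "sec" line, i.e. the part after the "/" in the second column.
# Assumes the older listing layout shown in this guide.
extract_key_ids() {
    awk '$1 == "sec" { split($2, parts, "/"); print parts[2] }'
}

# Usage:
#   gpg --list-secret-keys | extract_key_ids
```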

GPG_OPTS

GPG_OPTS='--compress-algo=bzip2 --personal-cipher-preferences AES256,AES192'

TARGET

There are a lot of options to choose from, pick the one your master server supports.

Backup to Master Server

TARGET='rsync://backup@example.com/incoming-backups/<server name>'

Backup to Google Drive (with PyDrive)

This is the new and working method, please follow this guide:
https://blog.notmyhostna.me/posts/duplicity-with-pydrive-backend-for-google-drive/

Backup to Google Drive (with gdata) [Deprecated due to Google API changes]

If you want to store your backups in your Google Drive just install the API library via dev-python/gdata and add the Google Drive target. The user is the part in front of the @ of your Google (Apps) email address, the password is the one you are using for that account. The path you specify after the domain part will be created automatically.

If you are using Gentoo make sure to switch your Python interpreter to python2.7 (by using eselect python), 3.x is not supported by the gdata library yet.

TARGET='gdocs://user:password@example.com/backup-incoming/notmyhostna.me'

SOURCE

SOURCE='/'

Manual exclude parameters

If you want to manually ignore directories you can just create a .duplicity-ignore file in that directory and it won't be included in the backup. This is a good option if you want to back up the entire /home/ directory but not the directory of temporary files in your own home directory.

FILENAME='.duplicity-ignore'
DUPL_PARAMS="$DUPL_PARAMS --exclude-if-present '$FILENAME'"
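Marking directories then comes down to creating an empty file in each of them. A minimal sketch, where mark_ignored is a hypothetical helper and the file name has to match FILENAME from the conf file:

```shell
# Sketch: drop the marker file into each given directory so duplicity skips it.
# mark_ignored is a hypothetical helper; the file name must match FILENAME.
mark_ignored() {
    for dir in "$@"; do
        touch "$dir/.duplicity-ignore" || return 1
    done
}

# Usage (the path is only an example):
#   mark_ignored /home/dewey/tmp
```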

Backup History

With these parameters you'll be able to define how many full or incremental backups will be kept. You should read the documentation / comments and make sure it fits your environment.

MAX_AGE=1M
MAX_FULL_BACKUPS=2
MAX_FULLS_WITH_INCRS=1
MAX_FULLBKP_AGE=2M
DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE "
VOLSIZE=50
DUPL_PARAMS="$DUPL_PARAMS --volsize $VOLSIZE "
VERBOSITY=5

Exclude

The other important file is called exclude and it defines the directories included or excluded from your backup. It's very simple:

- /etc/.git/
+ /etc/
+ /home/user2/important.txt
+ /home/dewey/
+ /var/www/
+ /root/.duply/
- **

Every line starting with + will be included in the backup, every line starting with - will be skipped. The final - ** excludes everything not matched by the rules above it. The order matters, so make sure you don't exclude /home/someuser/ and later on add /home/someuser/coup.txt, because it won't be included that way.

Public Key Authentication

We want to be able to log into the master server without entering our password so the backup task will be able to run automatically in the background. We achieve that by copying our public key (id_rsa.pub) to the remote server's authorized_keys file. Luckily there's an easy way to do just that:
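For simple cases, where you only have a list of paths to include, the file can be generated. This is just a sketch: print_exclude is a made-up helper, and more specific - rules like the /etc/.git/ one above still have to be added by hand before the + line they carve out of.

```shell
# Sketch: print an exclude list that includes every given path and then
# excludes everything else with a final "- **" line.
# print_exclude is a hypothetical helper.
print_exclude() {
    for path in "$@"; do
        printf '+ %s\n' "$path"
    done
    printf '%s\n' '- **'
}

# Usage:
#   print_exclude /etc/ /var/www/ /root/.duply/ > "/root/.duply/<server name>/exclude"
```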

ssh-copy-id -i ~/.ssh/id_rsa.pub backup@example.com

Enter your password one last time and we are set. Try to log in via ssh to see if it works. If you don't have to enter the password again, we succeeded.
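If you'd rather check this from a script, the following sketch relies on ssh's BatchMode option, which makes ssh fail instead of prompting for a password. The function name can_login_without_password and the 10 second timeout are my own assumptions.

```shell
# Sketch: succeed only if key-based login works without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password.
# can_login_without_password is a hypothetical helper.
can_login_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Usage:
#   can_login_without_password backup@example.com && echo "key login works"
```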

First Backup

If you want to see the available duply commands just use duply usage. In our case we are going to use:

duply <server name> backup

for the first full backup. From now on just use

duply <server name> incr

to trigger incremental backups. If you want to see the list of backups stored on the remote host use duply <server name> status and you'll see something like this:

Found primary backup chain with matching signature chain:
-------------------------
Chain start time: Sun Nov 30 21:12:22 2014
Chain end time: Tue Dec  2 00:00:04 2014
Number of contained backup sets: 5
Total number of contained volumes: 10
 Type of backup set:                            Time:      Num volumes:
                Full         Sun Nov 30 21:12:22 2014                 1
         Incremental         Sun Nov 30 21:16:34 2014                 1
         Incremental         Sun Nov 30 21:53:00 2014                 6
         Incremental         Mon Dec  1 00:00:03 2014                 1
         Incremental         Tue Dec  2 00:00:04 2014                 1

Pretty!

Note: The first backup also exported the private and public gpg keys to the ~/.duply/<server name> directory. Please don't skip the section called "Restore" at the end of the guide. We are going to deal with these files there.

Automation

Nobody likes to do things manually, so we are going to tell cron to do all the heavy lifting for us. Use crontab -e to edit your crontab and add:

0   0   *   *   7   /usr/bin/duply /root/.duply/<server name> full_verify_purge --force
0   0   *   *   1-6 /usr/bin/duply /root/.duply/<server name> incr
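If you'd like a log file of these runs, you could point cron at a small wrapper instead of calling duply directly. This is only a sketch under my own assumptions: the function name run_duply_logged, the LOG_FILE variable and the default log path are not part of duply.

```shell
# Sketch: run duply with the given arguments, append its output to a log file
# and return duply's exit status to the caller (cron, in this case).
# run_duply_logged, LOG_FILE and the default path are assumptions.
run_duply_logged() {
    log_file=${LOG_FILE:-/var/log/duply-backup.log}

    printf '%s start: duply %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*" >> "$log_file"
    if duply "$@" >> "$log_file" 2>&1; then
        status=0
    else
        status=$?
    fi
    printf '%s end: exit status %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$status" >> "$log_file"
    return "$status"
}

# Usage, e.g. at the end of a script that cron calls:
#   run_duply_logged "/root/.duply/<server name>" incr
```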

Include (MySQL) Databases

If you want to back up MySQL databases too, you'll have to grab a database dump, move it to some location included in your exclude file and clean up that location after the backup. Duply has us covered there.

Just create two files called pre and post in your .duply/<server name>/ directory.

pre

/usr/bin/mysqldump --all-databases -u root -pXXXXXXXX | gzip -9 > /var/backups/sql/sqldump_$(date +"%d-%m-%Y").sql.gz

post

/bin/rm /var/backups/sql/sqldump_$(date +"%d-%m-%Y").sql.gz

The date is evaluated separately in pre and post. If a backup run crosses midnight, the post command will look for a file name with the new date and the old dump stays behind, so make sure the archives don't build up in that directory.

Once you have created these, don't forget to add that directory to your exclude file like this:

+ /var/backups/sql/
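One way around the date problem is to let post remove every dump in that directory instead of only the one with today's date. A minimal sketch, assuming the directory contains nothing but these temporary dumps; cleanup_dumps is a made-up helper name.

```shell
# Sketch of an alternative post script: remove every dump in the directory,
# so a run that crosses midnight leaves nothing behind.
# cleanup_dumps is a hypothetical helper.
cleanup_dumps() {
    dump_dir=$1
    rm -f "$dump_dir"/sqldump_*.sql.gz
}

# In the post file:
#   cleanup_dumps /var/backups/sql
```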

Restore

Important: After the first backup make sure to copy the whole .duply directory to some place safe. It'll now include your private key.

Your ~/.duply/<server name> directory should now contain your config files and the public and private encryption key. It's very important to store this directory somewhere safe. If you don't have access to this directory it won't be possible to restore and decrypt the backup.

.
├── conf
├── exclude
├── gpgkey.XXXXXXXX.pub.asc
├── gpgkey.XXXXXXXX.sec.asc
├── post
└── pre

If you don't have a way to transfer this to a safe place via scp, just encrypt it, move it to a public directory and download it.

Generate a password:

openssl rand -base64 32

Create an archive:

tar cvzf duply-<server name>.tar.gz .duply

Encrypt it:

openssl enc -aes-256-cbc -salt -in duply-<server name>.tar.gz -out duply-<server name>.tar.gz.enc -k <your password>

Decrypt it on the target machine:

openssl enc -d -aes-256-cbc -in duply-<server name>.tar.gz.enc -out duply-<server name>.tar.gz -k <your password>

If you want to store the private key in the keychain on your main machine just import gpgkey.XXXXXXXX.sec.asc to GPG Keychain (Mac only) or do it via the command line:

gpg --allow-secret-key-import --import gpgkey.XXXXXXXX.sec.asc

It'll now show up if you run gpg --list-secret-keys on that machine.

Now if you want to restore a server just install duply and duplicity on the new machine, restore the .duply directory, set up the public key authentication and use the following commands:

Restore a single file:

duply <profile> fetch <src_path> <target_path> [<age>]

Restore a directory:

duply <profile> restore <target_path> [<age>]

The default value for <age> is $now, but you may also enter values like 1M or 10H, depending on how frequently your backups run.
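It's worth trying a restore before you actually need one. The sketch below restores a profile into a fresh scratch directory so nothing on the live system is touched. restore_to_scratch is a hypothetical helper built on the restore command shown above.

```shell
# Sketch: restore a profile into a new directory below a fresh temporary
# directory and print the path. The optional second argument is passed
# through as <age>. restore_to_scratch is a hypothetical helper.
restore_to_scratch() {
    profile=$1
    age=${2:-}
    scratch=$(mktemp -d) || return 1
    target=$scratch/restore
    if [ -n "$age" ]; then
        duply "$profile" restore "$target" "$age" >&2 || return 1
    else
        duply "$profile" restore "$target" >&2 || return 1
    fi
    printf '%s\n' "$target"
}

# Usage:
#   dir=$(restore_to_scratch "<server name>" 1M) && ls "$dir"
```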

Optional

If you want to take this to the next level and also back up your backup master server, you could use something like Tarsnap, which I already wrote about some time ago: Backup your server with Tarsnap

That's it!

Sources