Automated and encrypted backups with duplicity
Published on Dec 02, 2014
If you don't want to end up like the plane in the header image (Jeff Sheldon, Unsplash), you should make sure you have a good backup and restore strategy.
The goal of this short guide is to have automated, encrypted and incremental backups from one server to another remote server. To achieve this we are going to use duplicity and its simplified wrapper duply. There are more detailed guides out there, but these are the steps I used and they work for me. The sources used for this guide are linked at the bottom.
Backup master
This is the machine where we are going to store all the backups from the various remote servers on. I added a new user called backup
for this purpose. If you don't trust the other servers use a different user for each server.
useradd -m -G users,wheel,audio -s /bin/zsh backup
passwd backup
All the backups will be stored in its home directory, where we'll now create the following directory. Remember the path, we'll need it at a later stage. Make sure you replace <server name>
with something like the hostname of the server you want to back up, so it'll be easier to figure out which backup is stored where if you are using multiple servers.
A good example would be: /home/backup/incoming-backups/example.com
/home/backup/incoming-backups/<server name>
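Creating that directory on the backup master can be sketched as a small shell function. The function name create_backup_dir and the 700 mode are my own assumptions for illustration, not something duply requires; the idea is simply that only the backup user should be able to read the stored archives.

```shell
# Sketch: create the per-server directory for incoming backups.
# create_backup_dir is a made-up helper name, not part of duply.
create_backup_dir() {
    backup_root="$1"   # e.g. /home/backup/incoming-backups
    server_name="$2"   # e.g. example.com

    # -p creates missing parents and does not fail if the path already exists.
    mkdir -p "$backup_root/$server_name" || return 1

    # Assumption: only the owner needs access to the archives.
    chmod 700 "$backup_root"
}

# Example: create_backup_dir /home/backup/incoming-backups example.com
```

Run it as the backup user so the directories end up with the right owner.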
Backup slave
This is one of many remote servers we want to back up to the master server.
Install dependencies
The first step is to install the dependencies. There's no stable version available for Gentoo at the moment so you'll have to unmask the latest version by adding app-backup/duply ~amd64
to your package.keywords
file. Once this is done just install it.
emerge -av app-backup/duply app-backup/duplicity
If you are using Ubuntu/Debian a simple apt-get install duply
should also grab the other dependencies.
First Steps
Type duply <server name> create
to initialize a new backup. You should probably run this as the root user if you want to back up directories only accessible by root
. The <server name>
is a placeholder for whatever you want to call your backup set. I usually just use the hostname with no spaces, dots or any special characters.
This will create a new directory in your home directory containing two files: conf
and exclude
.
/root/.duply/<server name>/
Generate GPG Key
Because we want to encrypt all our backups so we can store them on an untrusted host we need to create a new GPG key. To do that just run gpg --gen-key
and accept the default options except the keylength which I usually set to 4096
.
You'll probably see a message like this telling you to generate new entropy:
Not enough random bytes available. Please do some other work to give
the OS a chance to collect more entropy!
In that case you could run some commands on the server, install updates or just emerge sys-apps/rng-tools
and generate some new entropy with rngd -r /dev/urandom
. This should usually do the trick.
Once the key is generated it'll ask you for some information like "Full Name", "Email" and "Comment". In my case I use the name of the backup set for the full name, my regular email and the FQDN of the server as the comment. This will make it easier to find the private key for the server later on.
Don't forget to write down / store the passphrase somewhere safe. We are going to need it for the following step.
Configuration
Now it's time to edit the conf
file to fit our needs. There are a lot of comments in that file explaining the various options, so I'm just going to go over my settings without explaining all of them.
Open the conf
file and add / uncomment the following values:
GPG_KEY & GPG_PW
If you don't remember your KEY ID just run: gpg --list-secret-keys
to get a list of all the secret keys in your keychain. The output will look something like this:
sec 4096R/XXXXXXXX 2014-11-30
uid dewey <mail@example.com>
ssb 4096R/YYYYYYYY 2014-11-30
The X'ed value is the GPG_KEY you are looking for; the passphrase is the one you wrote down earlier.
GPG_KEY='XXXXXXXX'
GPG_PW='YOURPASSPHRASE'
GPG_OPTS
GPG_OPTS='--compress-algo=bzip2 --personal-cipher-preferences AES256,AES192'
TARGET
There are a lot of options to choose from, pick the one your master server supports.
Backup to Master Server
TARGET='rsync://backup@example.com/incoming-backups/<server name>'
Backup to Google Drive (with PyDrive)
This is the new and working method, please follow this guide:
https://blog.notmyhostna.me/posts/duplicity-with-pydrive-backend-for-google-drive/
Backup to Google Drive (with gdata) [Deprecated due to Google API changes]
If you want to store your backups in your Google Drive just install the API library via dev-python/gdata
and add the Google Drive target. The user
is the part in front of the @ of your Google (Apps) email address, the password is the one you are using for that account. The path you specify after the domain part will be created automatically.
If you are using Gentoo make sure to switch your Python interpreter to python2.7
(by using eselect python
), 3.x is not supported by the gdata library yet.
TARGET='gdocs://user:password@example.com/backup-incoming/notmyhostna.me'
SOURCE
SOURCE='/'
Manual exclude parameters
If you want to manually ignore directories you can just create a .duplicity-ignore
file in that directory and it won't be included in the backup. This is a good option if you want to back up the entire /home/
directory but not the directory of temporary files in your own home directory.
FILENAME='.duplicity-ignore'
DUPL_PARAMS="$DUPL_PARAMS --exclude-if-present '$FILENAME'"
Backup History
With these parameters you'll be able to define how many full or incremental backups will be kept. You should read the documentation / comments and make sure it fits your environment.
MAX_AGE=1M
MAX_FULL_BACKUPS=2
MAX_FULLS_WITH_INCRS=1
MAX_FULLBKP_AGE=2M
DUPL_PARAMS="$DUPL_PARAMS --full-if-older-than $MAX_FULLBKP_AGE "
VOLSIZE=50
DUPL_PARAMS="$DUPL_PARAMS --volsize $VOLSIZE "
VERBOSITY=5
Exclude
The other important file is called exclude
and it defines the directories included or excluded from your backup. It's very simple:
- /etc/.git/
+ /etc/
+ /home/user2/important.txt
+ /home/dewey/
+ /var/www/
+ /root/.duply/
- **
Every line starting with +
will be included in the backup, everything with -
will be skipped. - **
will exclude everything not matched by the preceding rules. The order matters because the first matching rule wins, so make sure you don't exclude /home/someuser/
and only later add /home/someuser/coup.txt
, because the file won't be included that way.
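To make the ordering rule concrete, here is a small sketch that writes an exclude file in which a single file is kept while the rest of its directory is skipped. The function name write_exclude_example is made up for illustration, and the paths are the ones from the paragraph above. Because the first matching rule wins, the specific include has to come before the broader exclude.

```shell
# Sketch: write an exclude file where the specific include comes first.
# write_exclude_example is a hypothetical helper, not part of duply.
write_exclude_example() {
    # $1: path of the exclude file, e.g. /root/.duply/<server name>/exclude
    cat > "$1" <<'EOF'
+ /home/someuser/coup.txt
- /home/someuser/
+ /home/
- **
EOF
}
```

With this order coup.txt is backed up, everything else in /home/someuser/ is skipped, and the remaining home directories are still included.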
Public Key Authentication
We want to be able to log into the master server without entering our password so the backup task will be able to run automatically in the background. We achieve that by copying our public key (id_rsa.pub
) to the remote server's authorized_keys
file. Luckily there's an easy way to do just that:
ssh-copy-id -i ~/.ssh/id_rsa.pub backup@example.com
Enter your password one last time and we are set. Try to log in via ssh to see if it works. If you don't have to enter the password again, we succeeded.
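Since cron can't answer a password prompt, it can be useful to test the login in a way that fails instead of asking. The following is a minimal sketch using ssh's BatchMode option; the function name check_passwordless_login and the 10 second timeout are my own choices.

```shell
# Sketch: succeed only if key-based login works without any prompt.
# check_passwordless_login is a made-up helper name.
check_passwordless_login() {
    # BatchMode=yes makes ssh fail instead of asking for a password.
    # ConnectTimeout=10 is an arbitrary value so the check can't hang forever.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$1" true
}

# Example: check_passwordless_login backup@example.com && echo "key login works"
```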
First Backup
If you want to see the available duply commands just use duply usage
. In our case we are going to use:
duply <server name> backup
for the first full backup. From now on just use
duply <server name> incr
to trigger incremental backups. If you want to see the list of backups stored on the remote host use duply <server name> status
and you'll see something like this:
Found primary backup chain with matching signature chain:
-------------------------
Chain start time: Sun Nov 30 21:12:22 2014
Chain end time: Tue Dec 2 00:00:04 2014
Number of contained backup sets: 5
Total number of contained volumes: 10
Type of backup set: Time: Num volumes:
Full Sun Nov 30 21:12:22 2014 1
Incremental Sun Nov 30 21:16:34 2014 1
Incremental Sun Nov 30 21:53:00 2014 6
Incremental Mon Dec 1 00:00:03 2014 1
Incremental Tue Dec 2 00:00:04 2014 1
Pretty!
Note: The first backup also exported the private and public gpg keys to the ~/.duply/<server name>
directory. Please don't skip the section called "Restore" at the end of the guide. We are going to deal with these files there.
Automation
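Nobody likes to do things manually, so we are going to let cron do all the heavy lifting for us. Use crontab -e
to edit your crontab and add the following entries, which run a full backup with verification and purging every Sunday at midnight and an incremental backup on the other days: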
0 0 * * 7 /usr/bin/duply /root/.duply/<server name> full_verify_purge --force
0 0 * * 1-6 /usr/bin/duply /root/.duply/<server name> incr
Include (MySQL) Databases
If you want to back up MySQL databases too, you'll have to grab a database dump, move it to some location included in your exclude
file and clean up that location after the backup. Duply has us covered there.
Just create two files called pre
and post
in your .duply/<server name>/
directory.
pre
/usr/bin/mysqldump --all-databases -u root -pXXXXXXXX | gzip -9 > /var/backups/sql/sqldump_$(date +"%d-%m-%Y").sql.gz
post
/bin/rm /var/backups/sql/sqldump_$(date +"%d-%m-%Y").sql.gz
If your dump takes a long time make sure the timestamps are still covered by your post
command and the archives don't build up in that directory.
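One way to avoid the timestamp problem is to let the post script remove every dump that matches the naming scheme instead of only the one carrying today's date. This is a sketch under that assumption; the function name cleanup_sql_dumps is made up, and the pattern matches the file names produced by the pre command above.

```shell
# Sketch for the post script: delete all dumps, whatever date they carry.
# cleanup_sql_dumps is a hypothetical helper name.
cleanup_sql_dumps() {
    dump_dir="$1"   # e.g. /var/backups/sql

    # -type f skips directories; only files named like the dumps are removed.
    find "$dump_dir" -type f -name 'sqldump_*.sql.gz' -exec rm -f {} +
}

# Example: cleanup_sql_dumps /var/backups/sql
```

A dump that started before midnight and finished after it is cleaned up as well, so nothing piles up in that directory.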
Once you've created these, don't forget to add that directory to your exclude
file like this:
+ /var/backups/sql/
Restore
Important: After the first backup make sure to copy the whole .duply directory to some place safe. It'll now include your private key.
Your ~/.duply/<server name>
directory should now contain your config files and the public and private encryption key. It's very important to store this directory somewhere safe. If you don't have access to this directory it won't be possible to restore and decrypt the backup.
.
├── conf
├── exclude
├── gpgkey.XXXXXXXX.pub.asc
├── gpgkey.XXXXXXXX.sec.asc
├── post
└── pre
If you don't have a way to scp
this to a safe place, just encrypt it, move it to a public directory and download it.
Generate a password:
openssl rand -base64 32
Create an archive:
tar cvzf duply-<server name>.tar.gz .duply
Encrypt it:
openssl enc -aes-256-cbc -salt -in duply-<server name>.tar.gz -out duply-<server name>.tar.gz.enc -k <your password>
Decrypt it on the target machine:
openssl enc -d -aes-256-cbc -in duply-<server name>.tar.gz.enc -out duply-<server name>.tar.gz -k <your password>
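The archive and encrypt steps can also be combined so that no unencrypted tarball is left on disk. This is only a sketch: the two function names are made up, and it uses the same openssl options as the commands above. Newer OpenSSL versions may print a warning about the key derivation, which doesn't stop the command from working.

```shell
# Sketch: archive and encrypt the .duply directory in one go, and undo it again.
# Both function names are hypothetical helpers.
encrypt_duply_dir() {
    # $1: directory that contains .duply, $2: output file, $3: password
    tar czf - -C "$1" .duply |
        openssl enc -aes-256-cbc -salt -out "$2" -k "$3"
}

decrypt_duply_archive() {
    # $1: encrypted file, $2: existing directory to unpack into, $3: password
    openssl enc -d -aes-256-cbc -in "$1" -k "$3" |
        tar xzf - -C "$2"
}

# Example: encrypt_duply_dir /root /tmp/duply.tar.gz.enc "$(cat /root/pw.txt)"
```

Keep in mind that a password passed with -k is visible in the process list while the command runs, which is also true for the original commands.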
If you want to store the private key in the keychain on your main machine just import gpgkey.XXXXXXXX.sec.asc
to GPG Keychain (Mac only) or do it via the command line:
gpg --allow-secret-key-import --import gpgkey.XXXXXXXX.sec.asc
It'll now show up if you run gpg --list-secret-keys
on that machine.
Now if you want to restore a server just install duply and duplicity on the new machine, restore the .duply
directory, set up the public key authentication and use the following commands:
Restore a single file:
duply <profile> fetch <src_path> <target_path> [<age>]
Restore a directory:
duply <profile> restore <target_path> [<age>]
The default value for <age>
is $now
, but you may also enter values like 1M
or 10H
, depending on how frequently your backups run.
Optional
If you want to take this to the next level and also back up your backup master server, you could use something like Tarsnap, which I already wrote about some time ago: Backup your server with Tarsnap
That's it!