Backups

Apr 5, 2019 | IT, WordPress

Backups….. you *know* you need them.  They are your protection from mistakes, accidents and even malicious intent.  If they are working correctly, they let you quickly and easily recover from any mishap you run into.  To be most effective, you will want lots of copies (I recommend 3 copies, with at least 2 kept offsite from your main server).  You will want them to be complete.  You will want them to be kept around for a long time.  One report I read (sorry, can’t remember the reference) indicated that hackers can wait up to a year after breaking into a website before trying to take any action. And most importantly, you want them to work! (This means you have tested them!)

So, how do you back up your WordPress sites?  There are lots and lots of plugin options.  I’ve seen good reviews of things like BackupBuddy, UpdraftPlus, Duplicator, and more.  I just don’t like plugins if I can avoid them.  I much prefer fine-grained control over everything, and in general I try to keep the number of plugins installed on my sites to a minimum.  So instead, I fall back on one of my all-time favorite WordPress tools: WP-CLI.  I like the command line.  It gives me lots of control over what I back up, how I back it up, and how often I back it up.

For my WordPress sites, I went with a simple shell script, coupled with a crontab entry and a single file that lists which sites need to be backed up.
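
The crontab entry itself is nothing special. As a rough sketch (the nightly schedule, script name and log path here are my assumptions, adjust to taste):

# run the site backups every night at 2:00 AM
0 2 * * * $HOME/bin/backup >> $HOME/logs/backup.log 2>&1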

Here is the backup script, in its entirety:

#!/bin/bash
PATH=$PATH:/usr/local/bin
# Loop through all sites
echo "Starting backups @$(date)"
for site in $(cat $HOME/etc/sites); do
  cd $HOME/backups
  BKDIR="$HOME/backups/$site/"
  if [ ! -d "$BKDIR" ]; then
    # make sure new directory exists
    mkdir -p "$BKDIR"
  fi
  echo "Starting backup for $site"
  cd "$BKDIR"
  # dump the full database, compressed
  wp @$site db export --add-drop-table - | gzip > database.sql.gz
  # archive the entire WordPress directory for this site
  tar czf wordpress.tgz -C /var/www/html $site
  echo "Finished backup for $site"
  echo
done

Super simple, super easy.  The contents of the ‘$HOME/etc/sites’ file is a list, one per line, of each site I want to back up.  I also have an entry for each site in my WP-CLI config.yml file, which allows the CLI to work without extra configuration or options. The script simply steps through each site listed. It creates a backup directory for the site (if it doesn’t already exist), then dumps the database in its entirety to a file, and backs up the entire WordPress directory to a single compressed file.
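
For reference, here is roughly what those two supporting pieces look like. The second site name below is just a placeholder, and the paths assume the same /var/www/html layout the script uses.

The ‘$HOME/etc/sites’ file (one directory name per line, matching the site directories under /var/www/html):

azurenight
anothersite

And the matching aliases in the WP-CLI config.yml, so that ‘wp @site …’ works without any extra options:

@azurenight:
  path: /var/www/html/azurenight
@anothersite:
  path: /var/www/html/anothersite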

So, that takes care of creating a backup. Voila! But how do I get my extra copies? Where is the offsite storage? What about keeping more than one copy of a backup? There seem to be some things missing. Have no fear! I’ve got that covered also. I’ve set up 2 extra servers to hold copies of the backups, and they serve a couple of different purposes. First, I created a server I call ‘backup’. This actually hosts (and serves) the latest backup copy of each site. So, for example, since this site is ‘https://www.azurenight.com’, my backup site is available at ‘https://backup.azurenight.com’. On this server, I have a cronjob that takes the copy of the backup created above for each site, and stores it off in a dated directory: $HOME/backups/<site>/<year>/<month>/<day>. That makes it easy to determine how many copies I want to keep. On this server, I use the following script:

#!/bin/bash
# Current backup variables
DAY=$(date +%d)
MONTH=$(date +%m)
YEAR=$(date +%Y)
# Old backup variables
OLDDAY=$(date -d"400 days ago" +%d)
OLDMONTH=$(date -d"400 days ago" +%m)
OLDYEAR=$(date -d"400 days ago" +%Y)
PATH=$PATH:/usr/local/bin
# Clean up anything older than 400 days
#find $HOME/backups -type f -mtime +400 | xargs rm -f
# Enable the line above if you want to be extra sure you are cleaning up older files
# Loop through all sites
echo "Starting backups @$(date)"
for site in $(cat $HOME/etc/sites); do
  cd $HOME/backups
  BKDIR="$HOME/backups/$site/$YEAR/$MONTH/$DAY"
  OLDBKDIR="$HOME/backups/$site/$OLDYEAR/$OLDMONTH/$OLDDAY"
  if [ ! -d "$BKDIR" ]; then
    # make sure new directory exists
    mkdir -p "$BKDIR"
  fi
  if [ -d "$OLDBKDIR" ]; then
    # clean up the old directory
    rm -rf "$OLDBKDIR"
  fi
  echo "Starting backup for $site"
  cd "$BKDIR"
  # pull the latest backup files from the production host
  scp prod:~/backups/$site/database.sql.gz .
  scp prod:~/backups/$site/wordpress.tgz .
  echo "Finished copying backup for $site"
  echo "Updating local copy of $site"
  # restore the files and database into this environment
  tar xzf wordpress.tgz -C /var/www/localhost/htdocs
  gunzip -c database.sql.gz | wp @$site db import -
  # site-specific cleanup to adjust for this hosting environment
  $HOME/bin/$site-backup
  echo "Finished updating local copy of $site"
  echo
done

You may notice the similarity between this script and my other one. Yes, it is intentional…and makes things easier to work with as I move between environments. The biggest differences here are:

  • copying the backups from production to this host, saved off into a separate directory (per day)
  • automatically restoring that copy into this environment (both the *full* WordPress directory contents, including core, uploads, plugins, themes AND restoring the database)
  • invoking a site specific ‘cleanup’ script that has the job of transforming content/files/etc to match the new hosting environment

Here is an example of my site-specific ‘cleanup’ script:

#!/bin/bash
# rewrite all URL references from the primary site to the backup site
wp @azurenight search-replace 'https://www.azurenight.com' 'https://backup.azurenight.com'
# fix the WordPress path referenced in .htaccess for this host (keeps Wordfence happy)
sed -i 's/\/var\/www\/html/\/var\/www\/localhost\/htdocs/' /var/www/localhost/htdocs/azurenight/.htaccess
# insert the Jetpack staging-mode define just before the 'Happy blogging' line in wp-config.php
perl -p -i -e 'print "define('"'"'JETPACK_STAGING_MODE'"'"',true);\n" if /^.*Happy blogging.*$/' /var/www/localhost/htdocs/azurenight/wp-config.php

It handles the job of replacing all URL references to my primary site with the ‘backup’ URL.
It then fixes the .htaccess file to point at the correct path for its new location (which keeps Wordfence working). And finally, it automatically adds a setting to the wp-config.php file to tell Jetpack that it is in staging mode.

I can easily verify operation by visiting my backup at ‘https://backup.azurenight.com’. I’m sure there is more I can do too, like adding a robots.txt file to keep the backup copy out of search engines, but this gets me a safe and effective backup, offsite from my primary host, that is also quickly and easily verified!
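
For what it’s worth, keeping well-behaved crawlers away from the backup copy would only take a two-line robots.txt in the backup site’s document root (a sketch of the idea, not something the scripts above create):

User-agent: *
Disallow: /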

I then mirror this same process on the second extra server, only instead of a ‘backup’ prefix it uses a ‘dev’ prefix: https://dev.azurenight.com. On this host, I use the same set of scripts. The biggest difference from my ‘backup’ site is that here, I *do not* automatically restore the backup I copy each night, and I keep a more limited number of backup copies. Instead, I created a small wrapper around the restore process that allows me to restore any given site ‘on-demand’. This also gives me a hosting site that I can use for development. But that is a wholly different can of worms that deserves its own post.
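
That on-demand wrapper isn’t shown in this post, but as a rough sketch of the idea (assuming the same $HOME/backups/<site>/<year>/<month>/<day> layout from above, the same document root as the ‘backup’ host, and a dev-specific cleanup script following the same naming convention), it could look something like this:

#!/bin/bash
# Illustrative sketch only: restore one site from a chosen dated backup copy.
# Usage: restore-site <site> <year/month/day>, e.g. restore-site azurenight 2019/04/05
site="$1"
day="$2"
BKDIR="$HOME/backups/$site/$day"
if [ ! -d "$BKDIR" ]; then
  echo "No backup found at $BKDIR" >&2
  exit 1
fi
cd "$BKDIR"
# restore the WordPress files and the database into this environment
tar xzf wordpress.tgz -C /var/www/localhost/htdocs
gunzip -c database.sql.gz | wp @$site db import -
# run the site-specific cleanup to adjust URLs/paths for this host
$HOME/bin/$site-backup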

I hope this information helps someone else get safe and effective backups working!