Ubuntu Script to Back Up Data to Amazon S3
Sunday, 13 July 2008 00:00

I'm sure most administrators are familiar with the importance of backing up your data. One option that is starting to gain popularity is backing up data to Amazon's S3 service. For those who may not be familiar with this service, it is essentially a hard drive that you can access via web services to store data. The great part about the service is that you pay for what you use, which allows you to start small and scale up over time. I would definitely recommend signing up for an account and trying out some simple scenarios, as it will only end up costing a few pennies!

Here, I outline a simple script that I've written for my Ubuntu server to back up important data to my Amazon S3 account.

Step 1: Sign Up for S3
This step shouldn't be too difficult, so I won't outline all the steps here. Simply head over to Amazon's S3 page. You will first need to sign up for an Amazon Web Services account (if you don't already have one), and after that you must enable your account to use the S3 service. After signing up for an account, be sure to head to the "AWS Access Identifiers" page and make note of your Access Key ID and Secret Access Key. These will be used later.

Step 2: Install S3Fox (Optional)
S3Fox is a Firefox plugin that allows you to browse, upload, and download the files stored within your S3 account. I recommend installing this to help view and ensure that all your scripts are working correctly. After installing S3Fox, you will need to provide the two identifiers from step 1. You may also want to take a moment to upload and download a sample file to make sure your S3 account is working correctly.

Note: S3Fox should most likely be installed on your desktop machine, not your server.

Step 3: Install Command Line Tools
Amazon S3 uses web services to upload and download data. Since we want our backup script to run from the command line, we need the help of some tools. I recommend s3-bash which, as the name implies, is a set of bash scripts; they have worked well for me. Simply unzip/untar the s3-bash archive to a location of your choosing. I didn't run into any dependency problems when using these scripts, since I think curl and OpenSSL are included with Ubuntu.

I recommend manually running the s3-put script to upload a file to your S3 account, and then using S3Fox to verify that it was uploaded properly.

Again, you will need to provide s3-bash with your two identifiers to allow it to operate correctly. One hitch that I ran into is that the secret access key must be stored in a file, and you have to be very careful that no newline is appended to the end of that file. Start by copying and pasting your key into a new file. Then use ls -la to check the file size. Your secret access key file should be exactly 40 bytes. If it is 41 bytes, you need to remove the newline from the end.

This was tricky for me to do, since I don't know Linux very well. It seems my text editor always added an extra newline, so I ended up using the split command to get the file's first 40 bytes:

split -b 40 aws
rm
mv awsaa
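
Another way to get a newline-free file, without going through a text editor at all, is to write the key with printf, which outputs exactly the string it is given. This is only a minimal sketch: the function names are my own, the key below is a 40-character dummy value, and the demo writes to a temporary file rather than to wherever you actually keep your key.

```shell
#!/usr/bin/env bash
# Write a secret key to a file with no trailing newline, then check its size.
# The key used in the demo is a dummy placeholder, not a real credential.

write_secret() {
    local key="$1" file="$2"
    # printf '%s' emits the string exactly, without appending a newline.
    # The umask keeps a newly created file readable by the owner only.
    ( umask 077; printf '%s' "$key" > "$file" )
}

secret_size() {
    # Print the size of the file in bytes
    wc -c < "$1" | tr -d ' '
}

# Demo against a throwaway file
demo_file=$(mktemp)
write_secret "0123456789012345678901234567890123456789" "$demo_file"
echo "bytes: $(secret_size "$demo_file")"
```

If the reported size is 40, the file should be in the shape s3-bash expects.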

Step 4: Gather Backup Data
Next, write your own script that gathers all your data into a local directory. I recommend naming the new directory after the current date, and then dropping any files you want to back up into that directory. In my case, I run a MySQL dump and 7zip the file. I also back up some of my Subversion repositories and 7zip those as well. I can't give an exact example of this portion, because everybody is going to have different data to back up.

Lastly, have your script create a symlink to the most recent backup directory. This makes it easier for the S3 script to find which files need to be uploaded to your S3 account.

ln -s /lastbackup

For future reference, say that this script was saved as backup-local.
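
To make the shape of backup-local a little more concrete, here is a rough sketch under some loud assumptions: the directory names are hypothetical, and a plain tar+gzip archive stands in for the MySQL dump and 7zip steps, which you would replace with whatever gathers your own data. The demo runs against temporary directories so it can be tried safely.

```shell
#!/usr/bin/env bash
# Sketch of a backup-local script. The source and backup directories are
# hypothetical; tar+gzip stands in for the MySQL dump and 7zip steps.
set -euo pipefail

backup_local() {
    local source_dir="$1" backup_root="$2"
    local today dest
    today=$(date +%Y%m%d)
    dest="$backup_root/$today"

    # One directory per day, named after the date
    mkdir -p "$dest"

    # Archive the contents of the source directory into today's directory
    tar -czf "$dest/files.tar.gz" -C "$source_dir" .

    # Repoint the lastbackup symlink. -f replaces an existing link, and -n
    # stops ln from following the old link and creating a link inside it.
    ln -sfn "$dest" "$backup_root/lastbackup"
}

# Demo on throwaway directories
SOURCE_DIR=$(mktemp -d)
BACKUP_ROOT=$(mktemp -d)
echo "sample" > "$SOURCE_DIR/sample.txt"
backup_local "$SOURCE_DIR" "$BACKUP_ROOT"
ls -l "$BACKUP_ROOT/lastbackup/"
```

Because the symlink is replaced on every run, the upload step only ever needs to know one fixed path.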

Step 5: Upload to S3
Now for the fun part. The script I put together will loop through all files in the lastbackup directory and upload each one to S3:

cd /lastbackup
stamp=$(date +%Y%m%d-%H%M%S)
for i in *; do
    s3-put -k -s -T "$i" "//backup-$stamp/$i"
done
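
For comparison, here is a slightly more defensive variation on the same loop, offered as a sketch rather than a drop-in replacement. Every value in it is a hypothetical placeholder (access key ID, secret key file path, bucket name), and I am assuming that -k takes the Access Key ID, -s the path to the secret key file, and that the last argument has the form /bucket/resource. By default it only echoes the s3-put commands it would run, so you can inspect them before uploading anything.

```shell
#!/usr/bin/env bash
# Variation on the upload loop with quoting and error reporting.
# ACCESS_KEY_ID, SECRET_FILE and BUCKET are hypothetical placeholders.
# S3_PUT defaults to a dry run (echo); set S3_PUT=s3-put to upload for real.
ACCESS_KEY_ID="${ACCESS_KEY_ID:-YOUR_ACCESS_KEY_ID}"
SECRET_FILE="${SECRET_FILE:-/path/to/secret-key-file}"
BUCKET="${BUCKET:-your-bucket}"
S3_PUT="${S3_PUT:-echo s3-put}"

upload_dir() {
    local dir="$1" stamp="$2" f name failed=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue    # skip subdirectories and an unmatched glob
        name=$(basename "$f")
        # $S3_PUT is deliberately unquoted so "echo s3-put" splits into two words
        if ! $S3_PUT -k "$ACCESS_KEY_ID" -s "$SECRET_FILE" -T "$f" \
                "/$BUCKET/backup-$stamp/$name"; then
            echo "upload failed: $name" >&2
            failed=1
        fi
    done
    return "$failed"
}

# Demo: dry run over a throwaway directory
demo_dir=$(mktemp -d)
touch "$demo_dir/db.sql.7z" "$demo_dir/svn repo.7z"
upload_dir "$demo_dir" "$(date +%Y%m%d-%H%M%S)"
```

Quoting the file name matters as soon as a backup file has a space in its name, and the non-zero return value gives a calling script something to check.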

Save this upload script as a new file named backup-s3. Then, create another script called backup that first calls backup-local and then calls backup-s3.
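
The combined script can be very small. The sketch below is one possible way to write it, assuming you want the upload skipped whenever the local gathering step fails; the run_steps helper and the SCRIPT_DIR variable are names I made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch of the combined backup script: run the steps in order and stop at
# the first one that fails. SCRIPT_DIR is a hypothetical location.

run_steps() {
    local step
    for step in "$@"; do
        if ! "$step"; then
            echo "backup aborted: $step failed" >&2
            return 1
        fi
    done
}

# In the real script the call would look something like:
#   run_steps "$SCRIPT_DIR/backup-local" "$SCRIPT_DIR/backup-s3"
# Demo with stand-in commands:
run_steps true true && echo "all steps succeeded"
```

Stopping early means a failed dump never results in a stale lastbackup directory being uploaded again under a new timestamp.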

Step 6: Schedule Your Backup
I used crontab to schedule my backup script to run every night. To configure crontab, run:

sudo crontab -e

Then, add a line to this file to have it run the combined backup script at whichever interval you prefer:

# m h dom mon dow command
00 06 * * *
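
The five time fields are minute, hour, day of month, month, and day of week, followed by the command to run. As a small illustration, the sketch below prints a complete entry that you could paste into the crontab. The script path and log file path are hypothetical, and redirecting output to a log is my own suggestion rather than something the setup requires.

```shell
#!/usr/bin/env bash
# Print a crontab entry that runs a command once a day at the given time.
# The script path and log path used in the demo are hypothetical.

cron_daily() {
    local hour="$1" minute="$2" command="$3" log="$4"
    # Fields: minute hour day-of-month month day-of-week command
    # Pass hour and minute without leading zeros (6, not 06).
    printf '%02d %02d * * * %s >> %s 2>&1\n' "$minute" "$hour" "$command" "$log"
}

cron_daily 6 0 /usr/local/bin/backup /var/log/backup.log
```

Appending both stdout and stderr to a log file gives you something to look at when a nightly run quietly fails.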

Summary
That's all there is to it! You now have your Ubuntu server scheduled to back up any important data directly to your S3 account. For next steps, you may want to add scripts that periodically purge old backup files (from both your local server and S3), but I plan to manually do this using S3Fox, since I'm nervous about accidentally deleting the wrong backup files.
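
If you do decide to script the local half of that cleanup, a cautious starting point might look like the sketch below. It assumes the one-directory-per-date layout from step 4, touches only the local server (not S3), and by default just lists what it would remove. The purge_old name and the 30-day threshold are illustrative choices of mine.

```shell
#!/usr/bin/env bash
# List, and optionally delete, backup directories older than a given number
# of days. Local copies only; nothing here talks to S3.

purge_old() {
    local root="$1" days="$2" mode="${3:-dry-run}"
    # -mindepth/-maxdepth limit the search to direct children of the root,
    # and -type d skips the lastbackup symlink.
    if [ "$mode" = "delete" ]; then
        find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" \
            -exec rm -rf -- {} +
    else
        find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" -print
    fi
}

# Demo on a throwaway directory (touch -d is the GNU form found on Ubuntu)
demo_root=$(mktemp -d)
mkdir "$demo_root/old-backup" "$demo_root/new-backup"
touch -d "40 days ago" "$demo_root/old-backup"
purge_old "$demo_root" 30            # dry run: prints the old directory
```

Only a call with delete as the third argument removes anything, which leaves room to review the dry-run list first.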

Of course, I'm sure there are a number of other ways to accomplish these backups, but this was the route that I chose. Feel free to leave any comments if you have feedback or suggestions!