How to Back Up Linux to Amazon S3 Using s3cmd

S3cmd is a program that allows you to back up your Linux box to Amazon S3. Amazon S3 gives you essentially unlimited storage and, as long as you have the bandwidth, you can use it from any location. There are two ways to run a backup: you can copy all of the files over to an S3 bucket (the put command), or you can use the sync command to upload only the file changes on a regular basis.

(Screencast Located at Bottom)

Step one is to install the program. If you don’t have your own Linux box or Linux server, you can sign up on Linux Academy and practice this on your own instance. Once you’re ready and at the Linux command prompt, type “apt-get install s3cmd”.
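For example, on a Debian or Ubuntu system (assuming you have root or sudo access; other distributions can get the package from s3tools.org), the install and a quick version check look roughly like this:

:$ sudo apt-get update
:$ sudo apt-get install s3cmd
:$ s3cmd --version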

Once s3cmd is installed, you need an Amazon S3 bucket to back up to, which is available by signing up at http://aws.amazon.com. Select “Security Credentials” from the account menu in the top right corner of the AWS console; this gives you access to your API keys (the Access Key ID and Secret Access Key that s3cmd will ask for).

Now you can configure s3cmd to connect to your Amazon S3 account. At the command prompt, type “s3cmd --configure”. Enter your Access Key and Secret Key when prompted, accept the defaults for the remaining options, and save your connection settings. Once this process is finished, you should arrive back at the command prompt. At this point you can verify the setup worked by typing “s3cmd ls”, which lists all the available buckets in your Amazon S3 account.
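Put together, a minimal setup pass looks something like the sketch below. The bucket name pinehead-bkup is just the example used in this article; the s3cmd mb command creates the bucket if you don’t already have one.

:$ s3cmd --configure
:$ s3cmd ls
:$ s3cmd mb s3://pinehead-bkup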

Let’s say you want to back up the /home/pinehead folder to the Amazon S3 bucket. You have two options. The first is to put the files into the bucket, which simply copies all the contents over and overwrites any existing files with the new ones:

:$ s3cmd put -r /home/pinehead s3://pinehead-bkup

Note: we use -r when copying folders so the copy is recursive.

Here, s3://pinehead-bkup is the S3 bucket you are backing up to. To verify the copy worked, list the contents of the bucket:

:$ s3cmd ls s3://pinehead-bkup

You should see your pinehead folder inside the bucket. If you only want to copy the contents of the pinehead folder and not the folder itself, add a trailing “/” to the path:

:$ s3cmd put -r /home/pinehead/ s3://pinehead-bkup
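The same tool can pull the data back down when you need it. As a quick sketch (the local destination path here is just an example), s3cmd get with -r restores a folder from the bucket:

:$ mkdir -p /tmp/restore
:$ s3cmd get -r s3://pinehead-bkup/pinehead /tmp/restore/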

A real backup syncs the changes rather than re-copying everything. When you’re ready to try that, just replace “put” with “sync” and run the command again. Then change or add some files or folders in your directory and rerun it; you’ll notice that only the changed or added files and folders are sent to the Amazon S3 bucket.
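Concretely, the sync form of the command above is shown below. s3cmd also has a --dry-run flag that previews what would be transferred, and by default sync does not delete objects from the bucket when you remove files locally unless you add --delete-removed, so use that option with care.

:$ s3cmd sync -r /home/pinehead s3://pinehead-bkup
:$ s3cmd sync -r --dry-run /home/pinehead s3://pinehead-bkup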


(Screenshot: s3cmd sync output when a file has changed and is uploaded.)

(Screenshot: s3cmd sync output when no changes have occurred and nothing is synced.)

To make this a backup, you need to schedule it to run on a regular basis. In this case, we’ll create a daily cron job.
:$ crontab -e
Add the following line:
0 5 * * * s3cmd sync -r /home/pinehead s3://pinehead-bkup

Of course, substitute your own folder name and bucket name. This will run your backup every day at 5 AM and sync only the changes that have occurred on the file system since the last run.
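In a crontab it’s often safer to use the full path to s3cmd and to capture the output somewhere you can review later. The sketch below assumes the binary lives at /usr/bin/s3cmd (check with “which s3cmd”) and logs to a file in the example user’s home directory; adjust both paths for your system.

0 5 * * * /usr/bin/s3cmd sync -r /home/pinehead s3://pinehead-bkup >> /home/pinehead/s3-backup.log 2>&1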

Here’s a video walkthrough if you prefer that sort of thing.

11 thoughts on “How to Back Up Linux to Amazon S3 Using s3cmd”

  1. Thank you for this excellent walk-through. While I’m not on Ubuntu, I was able to download and install the s3cmd package from s3tools.org, and once it was installed, the rest of your tutorial made the backup a breeze!

  2. Thanks for the tutorial! Would you like to share how much data you back up, and what it costs you? I’m finding it hard to understand what Amazon’s pricing model would mean for me.

  3. This is cool and useful, but it’s an “asynchronous mirror”, not a backup. The reason for the distinction is that if you delete a file locally and your “backup” runs, the sync will delete your remote file too. The same goes if you corrupt a file or get hit with ransomware. Maybe S3 will back up your mirror automatically, but I’m under the impression it does not.

    1. If you enable versioning on your bucket, that should help here. The bad thing about doing “backups” this way is that it’s hard to get a snapshot of the entire filesystem on a given date. Picking out single files from a few days back is fine, though.

      Amazon has a script that “backs up” EFS to “date buckets” using hardlinks, which is similar to what you would want to do here. With a bit of tweaking it would probably work for restoring an S3-backed-up filesystem to a given date: https://github.com/awslabs/data-pipeline-samples/tree/master/samples/EFSBackup
