Automated Heroku Backups

It’s easy to enable automatic nightly PostgreSQL database backups from Heroku to Amazon S3. Nick Merwin and Derek Perez have shown us a couple of techniques for this sort of thing already, but I’ve got another one for you.

Start by adding the following to your Rakefile:
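The original listing is not reproduced here. As a stand-in, here is a minimal sketch of a task that fits the steps in this post; it is not the original code. It assumes that Heroku exposes the app name as ENV['APP_NAME'] and the connection string as ENV['DATABASE_URL'], that pg_dump is available where the task runs, and that right_aws offers RightAws::S3.new, S3#bucket, and Bucket#put as used below. The pg_dump_command helper is a hypothetical name introduced for illustration.

```ruby
require 'rake'        # harmless inside a Rakefile; lets the file load under plain ruby too
require 'uri'
require 'shellwords'

# Hypothetical helper: build a pg_dump command line from a postgres:// URL.
# Assumes the URL has the form postgres://USER:PASSWORD@HOST/DATABASE.
def pg_dump_command(database_url, out_path)
  db = URI.parse(database_url)
  [
    "PGPASSWORD=#{Shellwords.escape(db.password.to_s)}",
    'pg_dump', '-Fc',                              # custom format, readable by pg_restore
    '--host', Shellwords.escape(db.host),
    '--username', Shellwords.escape(db.user),
    Shellwords.escape(db.path.sub(%r{\A/}, '')),   # database name without the leading slash
    '>', Shellwords.escape(out_path)
  ].join(' ')
end

task :cron do
  Rake::Task['heroku:backup'].invoke
end

namespace :heroku do
  desc 'Dump the PostgreSQL database and upload it to Amazon S3'
  task :backup do
    begin
      require 'right_aws'
      app  = ENV['APP_NAME']                       # assumption: set by Heroku
      name = "#{app}-#{Time.now.strftime('%Y-%m-%d-%H%M%S')}.dump"
      path = File.join('tmp', name)
      system(pg_dump_command(ENV['DATABASE_URL'], path)) or raise 'pg_dump failed'

      s3     = RightAws::S3.new(ENV['s3_access_key_id'], ENV['s3_secret_access_key'])
      bucket = s3.bucket("#{app}-heroku-backups", true, 'private')   # create if missing
      File.open(path, 'rb') { |file| bucket.put(name, file) }
    # rescue Exception => e
    #   require 'toadhopper'
    #   Toadhopper(ENV['hoptoad_key']).post!(e)
    ensure
      # Remove the temporary dump whether or not the upload succeeded.
      File.delete(path) if path && File.exist?(path)
    end
  end
end
```

The dump is written to tmp/ and deleted afterwards, so nothing is left behind between runs. The commented-out rescue lines are the ones that get enabled later in this post.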

It’s OK if you already have a cron task defined. Rake appends new behavior to an existing task rather than replacing it, so both definitions will run. Weird, I know. Anyway…
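Here’s a small sketch of that appending behavior, runnable outside of a Rakefile. The log array and the messages are made up for illustration.

```ruby
require 'rake'

log = []

# First definition, for example one you already had in your Rakefile.
task :cron do
  log << 'existing cron work'
end

# Second definition with the same name: Rake adds this block to the
# task's list of actions instead of overwriting the first one.
task :cron do
  log << 'heroku backup'
end

Rake::Task['cron'].invoke
puts log.inspect   # prints ["existing cron work", "heroku backup"]
```

The actions run in the order they were defined.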

Add right_aws to your .gems file:

echo right_aws >> .gems

Commit your changes and push to Heroku:

git add .
git commit -m 'heroku backups'
git push heroku master

Enable the cron:daily addon:

heroku addons:add cron:daily

Provide Heroku with your Amazon S3 keys:

heroku config:add s3_access_key_id=YOUR_ID s3_secret_access_key=YOUR_KEY
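Config vars set this way show up in ENV inside your app and Rake tasks. If you’d like the task to fail early when a key is missing, something along these lines might help. The missing_config helper and REQUIRED_KEYS constant are hypothetical names, not part of any library.

```ruby
REQUIRED_KEYS = %w[s3_access_key_id s3_secret_access_key].freeze

# Hypothetical helper: names of required settings that are missing or blank.
# `env` can be ENV itself or any Hash-like object with string keys.
def missing_config(env, required = REQUIRED_KEYS)
  required.select { |key| env[key].to_s.strip.empty? }
end

missing = missing_config(ENV)
warn "Missing config vars: #{missing.join(', ')}" unless missing.empty?
```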

Run the cron Rake task manually, for testing purposes:

heroku rake cron

If all goes well, a new private bucket named APP_NAME-heroku-backups will be created and will contain a backup file named APP_NAME-YEAR-MONTH-DAY-HOURMINUTESECOND.dump.
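To make the naming scheme concrete, here is a sketch with a fixed timestamp. The helper names and the app name myapp are hypothetical.

```ruby
# Hypothetical helpers that mirror the naming scheme described above.
def backup_bucket_name(app_name)
  "#{app_name}-heroku-backups"
end

def backup_file_name(app_name, time = Time.now)
  "#{app_name}-#{time.strftime('%Y-%m-%d-%H%M%S')}.dump"
end

time = Time.utc(2010, 5, 1, 3, 4, 5)
puts backup_bucket_name('myapp')       # prints myapp-heroku-backups
puts backup_file_name('myapp', time)   # prints myapp-2010-05-01-030405.dump
```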

Confirm that the backups are working by downloading the archive and attempting to reload it into a freshly created database:

createdb NEW_DB
pg_restore -d NEW_DB BACKUP_FILE

It may complain about some users and roles from Heroku that are missing, but I think those warnings are safe to ignore. In any case, I’d suggest trying to use this new database with your local app in development, just to make sure it’s working as expected.

From here on out, Heroku will automatically invoke the cron Rake task for you on a daily basis. Each run creates a new backup file and stores it on S3. The backups on S3 aren’t being rotated or deleted automatically, which is only a minor annoyance for me. Please do let me know if you take the time to set up some kind of backup rotation, though.
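If you do want to try rotation, the selection step could look something like this. It’s only a sketch: backups_to_delete is a hypothetical helper, it assumes every backup in the bucket belongs to one app and follows the file naming scheme above, and the actual listing and deleting of S3 keys with your gem of choice is left out.

```ruby
# Hypothetical helper: given backup file names, return the ones that fall
# outside the newest `keep` backups and could therefore be deleted.
def backups_to_delete(names, keep)
  # Ignore anything that does not look like one of our dump files.
  dumps = names.grep(/-\d{4}-\d{2}-\d{2}-\d{6}\.dump\z/)
  # The timestamp is zero-padded, so for a single app a plain string
  # sort is also a chronological sort. Newest first, then skip `keep`.
  dumps.sort.reverse.drop(keep)
end

names = [
  'myapp-2010-05-03-000000.dump',
  'myapp-2010-05-01-000000.dump',
  'myapp-2010-05-02-000000.dump',
  'notes.txt'
]
puts backups_to_delete(names, 2).inspect   # prints ["myapp-2010-05-01-000000.dump"]
```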

It is highly recommended that you periodically confirm that the backup task is running properly:

heroku logs:cron

You should also periodically verify that the backups are valid, the same way we did earlier: download a backup file, load it with pg_restore, and test the rehydrated database with your app running locally.

Additionally, I’d recommend using Hoptoad in conjunction with Toadhopper so that you can receive notifications if/when something goes wrong with your backups.

Add toadhopper to your .gems file:

echo toadhopper >> .gems

Uncomment the relevant lines in the heroku:backup Rake task:

# rescue Exception => e
#   require 'toadhopper'
#   Toadhopper(ENV['hoptoad_key']).post!(e)

Commit your changes and push to Heroku:

git add .
git commit -m 'toadhopper for Heroku backups'
git push heroku master

Provide Heroku with your Hoptoad API key:

heroku config:add hoptoad_key=YOUR_HOPTOAD_KEY

Run the cron Rake task again, just to make sure it’s not broken:

heroku rake cron

This way, you’ll be notified if something goes wrong with your backup task. I’d still suggest manually verifying the backup files as frequently as possible, though. There’s no such thing as a “set it and forget it” database backup. Viva due diligence!

Finally, please note that I’m not sure if large databases will work with this backup method due to Heroku’s filesystem restrictions. It’s working for me, but your mileage may vary.

10 Responses to “Automated Heroku Backups”

  1. Eric Davis says:

    Great work, I’ve been planning to build something like this for my app.

  2. Jay Godse says:

    This is cool! What seems really nice is that because Heroku is an EC2/S3 platform, data transfer from Heroku to S3 should be free. Is that correct?

    Do you have a script to restore the data from the backup back to the Heroku application or to a different Heroku application (such as the staging or testing server)?

    What I also like about this is if there are data problems or software bugs related to data, you can just dump the data into a staging/test heroku application and reproduce the problem.

  3. Trevor says:

    @Jay, I’m not sure about the data transfer fees, and I haven’t made a script to restore the data automatically. You should pull down the db archive file and load it, then you could use “heroku db:pull/push” if you want to move things around.

  4. Jim Gay says:

    Trevor, this is a great idea.

    It’s got me wondering if there’s a way to make another separate Heroku app which checks your db dumps periodically, loads them, and reports any errors.

  5. Trevor says:

    @Jim, I suppose you could do that. Lemme know if you try it.

  6. Trevor says:

    Note that you’re free to adapt this script to use alternative S3 gems, or to not use S3 at all. There’s an interesting comment in this thread:

    http://groups.google.com/group/heroku/browse_thread/thread/39f34dbc4ab632d5

    …where Andy Shipman shows how he’s backing up his database to Dropbox using this same basic technique.

  7. Eric Davis says:

    @Trevor: Hope you don’t mind but I’ve gem’d this code up. Now just add “heroku_s3_backup” to your .gems file. I also changed how it’s invoked: instead of using a Rake task you can just call HerokuS3Backup.backup.

    I think this will make it more flexible, e.g. you can have a button on your admin dashboard to do a manual S3 backup…

    http://rubygems.org/gems/heroku_s3_backup

  8. Will Koffel says:

    Seems like you could leverage all the work heroku has done for you in the form of their “Bundles” to just capture everything, code and data.

    http://docs.heroku.com/backups

    You might need to run this on an EC2 instance or some other machine you have (I don’t think it can push remotely, nor do I think that you have temporary storage available on Heroku), but worth pointing out as a sanctioned alternative to pg_dump

  9. Will Koffel says:

    One more note, late breaking news I have on good authority suggests that Heroku is about to release an official tool for dumping PG databases to S3. So check if that’s live before rolling your own!

  10. Jon Atack says:

    Excellent, Trevor, working great for me. Thank you!
