17

I often need to clone production data to investigate bugs. Even with a trivial database size heroku db:pull (taps) takes 5+ minutes and seems to have a high chance of failing. Is there an alternative method to pull the database?

Libraries for alternative processes / articles would also be appreciated.

Jack Kinsella
  • 4,491
  • 3
  • 38
  • 56

6 Answers6

23

Check out pgbackups. It has replaced the Heroku bundle command and will give you a the postgres equivalent of mysqldump. This is far more civilized than Taps for large datasets.

heroku pgbackups:capture

Will create a dumpfile and store it. To download the dumpfile you need the url which you get with

heroku pgbackups:url b001 (or whatever the id number of the backup is)

That will return an url from which you can download your dump. You can paste it into Firefox if you want or use curl/wget like they suggest. The use pg_restore to load the dump file into your database as they say in the docs:

pg_restore --verbose --clean --no-acl --no-owner -h localhost -U test_user -d myapp_development /home/mike/Downloads/b001.dump

pg_restore: connecting to database for restore

mikewilliamson
  • 24,303
  • 17
  • 59
  • 90
12

I created a shell script that automates this process (based on Mike Williamson's answer).

https://gist.github.com/921535

#!/bin/bash

# Best use case is to create a file "update_local_db.sh" in your project folder and then     
# call the command with bash update_local_db

# Follow me: @jackkinsella

function LastBackupName () { 
  heroku pgbackups | tail -n 1 | cut -d"|" -f 1
}

# This part assumes you have a low limit on no. of backups allowed
old_backup=$(LastBackupName)
heroku pgbackups:destroy $old_backup 

heroku pgbackups:capture 
new_backup=$(LastBackupName)
curl $(heroku pgbackups:url $new_backup) > temporary_backup.dump
pg_restore --verbose --clean --no-acl --no-owner -h localhost -U REPLACE_WITH_YOUR_USER -d REPLACE_WITH_YOUR_DB_NAME temporary_backup.dump 
rm -f temporary_backup.dump
Jack Kinsella
  • 4,491
  • 3
  • 38
  • 56
  • 2
    Great script, so much faster than db:pull and a solution to the problem of not finding taps. I'm getting the output of heroku pgbackups separated by spaces rather than "|" so need to use 'cut -d" " . I'd also prefer to destroy my old Heroku backup after I save a new one, rather than before. Since Heroku now has a liberal policy on saving backups, we can skip the deletion entirely. – Mike Blyth Aug 21 '12 at 07:48
  • I had to change the delimiter to a space. ie. heroku pgbackups | tail -n 1 | cut -d" " -f 1 – joshs Jan 15 '13 at 17:16
3

Mike's correct - PGBackups is the way to do this. When you create a backup with PGBackups, you get access to a standard pg_dump file. Here's the relevant section of the Dev Center PGBackups article.

Ben Scofield
  • 6,398
  • 2
  • 23
  • 22
1

This post is quite old right now.

The newest and easiest method right now is using Heroku's pg:pull/pg:push

Lomefin
  • 1,173
  • 12
  • 43
  • 1
    True. but the point is that `pg:pull` is much slower than downloading a DB dump and importing. – EasyCo Oct 24 '17 at 23:58
0

An update to Jack's script, with Heroku's recommendation as of Jan 2015.

The first part is due to running on different computers, hence my Postgres dbs has different names.

#!/bin/bash

# Run the following command: bash update_local_db.sh

# Getting computer name, which is the same as username in Postgres db
echo "Please enter name of Computer"
read input_variable
echo "You entered: $input_variable"

# Make a backup on Heroku
heroku pgbackups:capture --app APP_NAME
echo "== Created a new backup =="

# Download the backup and name it latest.dump
curl -o latest.dump `heroku pgbackups:url --app APP_NAME`
echo "== Downloaded the backup =="

# Restore local db with latest.dump
pg_restore --verbose --clean --no-acl --no-owner -h localhost -U $input_variable -d my_db_name latest.dump
echo "== Replaced db with downloaded =="

# Delete downloaded db latest.dump
rm -f latest.dump
echo "== Deleted downloaded db =="
echo "== Done! :) =="
Community
  • 1
  • 1
Fellow Stranger
  • 32,129
  • 35
  • 168
  • 232
0

Here is a script I wrote that utilizes pg:pull, as mentioned by Lomefin, to pull down a db from Heroku and replace a local one with it:

#!/bin/bash

export MAIN_DB=NAME_OF_LOCAL_DB
export TMP_DB=NAME_OF_TEMPORARY_DB

function delete_db () {
    psql -d ${1} -c "SELECT pg_terminate_backend(pg_stat_activity.pid)
FROM pg_stat_activity
WHERE pg_stat_activity.datname = '$1'
  AND pid <> pg_backend_pid();" || true

    dropdb ${1} || true
}

delete_db ${TMP_DB}

heroku pg:pull DATABASE_URL ${TMP_DB} || exit 1

delete_db ${MAIN_DB}

psql -c "ALTER DATABASE $TMP_DB RENAME TO $MAIN_DB;"

Since pg:pull clones to a new database, your work will not be interrupted (only once it renames the db, which takes a fraction of a second). The script can, of course, be easily customized to your liking.

fgblomqvist
  • 2,244
  • 2
  • 32
  • 44