
Folks,

I'm trying to set up a regular backup of a rather large production database (about half a gig) that has both InnoDB and MyISAM tables. I've been using mysqldump so far, but it's taking longer and longer to run, and the server is completely unresponsive while mysqldump is running.

I wanted to ask for your advice: how do I either

  1. Make the mysqldump backup non-blocking - assign low priority to the process or something like that (rough sketch of what I mean below), OR

  2. Find another backup mechanism that will be better/faster/non-blocking.
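
For option 1, I was picturing something along these lines - completely untested, and the credentials/paths are placeholders:

    # would wrapping mysqldump in nice/ionice even help, or does the blocking
    # come from table locks rather than CPU/IO contention?
    nice -n 19 ionice -c3 \
        mysqldump -u backup_user -p'secret' --all-databases \
        > /backups/dump-$(date +%F).sql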

I know about the MySQL Enterprise Backup product (http://www.mysql.com/products/enterprise/backup.html), but it's expensive and not an option for this project.

I've read about setting up a second server as a "replication slave", but that's not an option for me either (this requires hardware, which costs $$).

Thank you!

UPDATE: more info on my environment: Ubuntu, latest LAMPP, Amazon EC2.

Alex Weinstein

5 Answers


If replication to a slave isn't an option, you could leverage the filesystem, depending on the OS you're using.

I've used ZFS snapshots on a quite large MySQL database (30GB+) as a backup method and it completes very quickly (never more than a few minutes) and doesn't block. You can then mount the snapshot somewhere else and back it up to tape, etc.
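
Roughly, the ZFS side looks like this (tank/mysql is just a placeholder for whatever dataset holds your datadir):

    # rough sketch - assumes the MySQL datadir lives on the ZFS dataset tank/mysql
    # (briefly holding FLUSH TABLES WITH READ LOCK while the snapshot is taken
    #  keeps the MyISAM files clean; the snapshot itself is near-instant)
    zfs snapshot tank/mysql@nightly

    # the snapshot is readable under the hidden .zfs directory - copy it off to
    # tape/S3/wherever, then drop it
    ls /tank/mysql/.zfs/snapshot/nightly/
    zfs destroy tank/mysql@nightly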

Rich Adams

Edit: (my previous answer suggested a slave DB to back up from, then I noticed Alex ruled that out in his question.)

There's no reason your replication slave can't run on the same hardware, assuming the hardware can keep up. Grab a source tarball, ./configure --prefix=/dbslave; make; make install; and you'll have a second mysql server living completely under /dbslave.
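
In rough strokes, the second instance just needs its own datadir, socket, port, and server-id so it doesn't collide with the main one - everything below (paths, port, credentials, log file/position) is a placeholder:

    # /dbslave/my.cnf
    [mysqld]
    datadir   = /dbslave/data
    socket    = /dbslave/mysql.sock
    port      = 3307
    server-id = 2        # the master keeps server-id = 1 and log-bin enabled

    # initialize the datadir, start the slave instance, then point it at the master
    /dbslave/bin/mysql_install_db --defaults-file=/dbslave/my.cnf
    /dbslave/bin/mysqld_safe --defaults-file=/dbslave/my.cnf &
    /dbslave/bin/mysql --socket=/dbslave/mysql.sock -u root -e "
      CHANGE MASTER TO
        MASTER_HOST='127.0.0.1', MASTER_PORT=3306,
        MASTER_USER='repl', MASTER_PASSWORD='secret',
        MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=4;
      START SLAVE;"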

EDIT2: Replication has a bunch of other benefits as well. For instance, with replication running, you may be able to recover the binlog and replay it on top of your last backup to recover the extra data after certain kinds of catastrophes.
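
E.g. something along these lines - the log names and position are placeholders:

    # replay the binary logs on top of the restored backup
    mysqlbinlog --start-position=107 mysql-bin.000042 mysql-bin.000043 \
        | mysql -u root -p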

EDIT3: You mention you're running on EC2. Another, somewhat contrived idea to keep costs down is to try setting up another instance with an EBS volume. Then use the AWS api to spin this instance up long enough for it to catch up with writes from the binary log, dump/compress/send the snapshot, and then spin it down. Not free, and labor-intensive to set up, but considerably cheaper than running the instance 24x7.
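
With the EC2 API tools, that part is roughly (the instance ID is a placeholder):

    ec2-start-instances i-12345678
    # ...wait for the slave to catch up with the binlog, dump/compress/ship...
    ec2-stop-instances i-12345678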

timdev
  • Thanks for the suggestion, but -1. Direct quote from the question: "I've read about setting up a second server as a "replication slave", but that's not an option for me either (this requires hardware, which costs $$)." – Alex Weinstein Jul 13 '11 at 16:55
  • I noticed that as soon as I finished my answer, and edited it. To be fair, I didn't say anything about a second machine in my initial answer ;-) – timdev Jul 13 '11 at 17:03
  • Taking my -1 back. Interesting idea. I'm worried about perf implications of setting this up on a single box - I'm guessing this approach will slow down the MySQL server's write perf by more than half (second instance, overhead of transfers). What do you think? I'm running on an EC2 standard instance. – Alex Weinstein Jul 13 '11 at 17:03
  • The amount of extra load will depend on how write-heavy your application is. Every app is different, so I can't give you firm guidance there. – timdev Jul 13 '11 at 17:08

Try the mk-parallel-dump utility from Maatkit (http://www.maatkit.org/).

regards,

Nehal Dattani
  • mk-parallel-dump has been deprecated by Percona. It still works, but Percona no longer supports it for production use. In fact, this URL explicitly states not to use mk-parallel-dump as a backup program: http://www.maatkit.org/doc/mk-parallel-dump.html – RolandoMySQLDBA Jul 21 '11 at 16:40

Something you might consider is using the binary logs here, through a method called 'log shipping'. Just before every backup, issue a command to flush the binary logs, and then you can copy all except the current binary log out via your regular file system operations.

The advantage of this method is that you're not locking up the database at all; when MySQL opens the next binary log in sequence, it releases the file locks on the prior logs, so processing shouldn't be affected. Tar 'em, zip 'em in place, do as you please, then copy them out as one file to your backup system.
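
Roughly - the paths are placeholders, and this assumes log-bin is enabled in my.cnf:

    mysql -u root -p -e "FLUSH LOGS;"                    # rotate to a fresh binary log
    cd /var/lib/mysql
    FILES=$(ls mysql-bin.[0-9]* | sort | head -n -1)     # everything except the current log
    tar czf /backups/binlogs-$(date +%F).tar.gz $FILES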

Another advantage of using binary logs is that you can restore up to any point in time, if the logs are available. E.g. you have last year's full backup and every log from then to now, but you want to see what the database looked like on Jan 1st, 2011. You can issue a restore 'until 2011-01-01', and when it stops, you're at Jan 1st, 2011 as far as the database is concerned.
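
The point-in-time restore is roughly (file names are placeholders):

    mysql -u root -p < full_backup.sql
    mysqlbinlog --stop-datetime="2011-01-01 00:00:00" mysql-bin.[0-9]* | mysql -u root -p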

I've had to use this once to reverse the damage a hacker caused.

It is definitely worth checking out.

Please note... binary logs are USUALLY used for replication. Nothing says you HAVE to.

whiskeyfur

Adding to what Rich Adams and timdev have already suggested, write a cron job that runs during a low-usage period to perform the backup/slaving task they describe, so you avoid high CPU utilization at peak times.
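
For example, a crontab entry along these lines (the script name and time are placeholders):

    # run the backup script at 03:30 every night, when traffic is low
    30 3 * * * /usr/local/bin/mysql-backup.sh >> /var/log/mysql-backup.log 2>&1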

Check mk-parallel-dump (mentioned above) also.

Bitmap