
My goal is to run a daily backup (overwriting the previous day's backup) of specific portions of my database so I can easily download it, import it, and have all my important data if anything goes wrong.

I currently have a 20GB web server (Ubuntu) and my database is ~11GB and growing (slowly), so I know I'll need to fire up a second web server to store the backup. (And I'll eventually need to upgrade my primary server once the database approaches ~20GB.)

My data is currently set up in a few indexed tables, but I don't need to back up all of it, so I'd like to run a query that selects only what I need and builds a new dump (.sql) from it. This would help keep the size down, but the file is still going to be very large, so I'd also like to compress it. Would gzip be the way to go? This would also neatly package the entire database into one file, which is something I need.

In addition, since I'll probably be using the second server to request the data from the first server, how do I ensure that the request doesn't time out?

TL;DR: I need to run a daily backup of an enormous (10+ GB) database onto another server, removing certain tables/columns in the process and compressing the result to save disk space & bandwidth, so I can easily download & import the backup (one file) if need be.

Andrew
  • Have you looked at mysql replication? – Pitchinnate May 08 '14 at 14:06
  • I haven't heard of that, but in looking at mysql replication (which looks awesome btw), they say it doesn't protect if someone accidentally deletes something, so that would be an issue. In addition, I'm trying to make it so I can download and import just one file, and I don't think replication would be one file. – Andrew May 08 '14 at 14:09
  • This question belongs to Server Fault. – Maerlyn May 08 '14 at 14:09
  • What does that mean? Should I move it...? – Andrew May 08 '14 at 14:14
  • You can't move it yourself. If 5 other users vote to move it, it'll happen. Or a moderator can do it. http://meta.stackoverflow.com/questions/251751/how-can-i-move-a-thread-from-stack-overflow-to-dba-stack-exchange – Bill Karwin May 08 '14 at 14:16
  • @Andrew MySQL 5.6 has [delayed replication](https://dev.mysql.com/doc/refman/5.6/en/replication-delayed.html). You can have the database slaves lag for 24 hours, relative to the master. If something is deleted from the master, it'll still be on the slave for a day. – RickN May 08 '14 at 14:25
  • Would I be able to package the database into one file? I'd like to be able to easily download it to my local server if need be. – Andrew May 08 '14 at 14:29
  • You can use `rsync` to copy several backup files at the same time, and if you do that against local copies you can achieve huge speed-up at the same time, since it just shifts changes across the network. – halfer May 08 '14 at 14:53
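
To make the rsync suggestion above concrete, a minimal sketch for pulling the compressed dump down to a local machine might look like this (the user, host, and paths are placeholders):

$ rsync -az --partial backupuser@backup-server:/backups/dump.sql.gz /local/backups/

If a previous copy already exists locally, rsync only sends the parts that changed, which is where the speed-up mentioned above comes from.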

3 Answers


Mysqldump can output selected tables, and you can pipe the output to gzip:

$ mysqldump mydatabase table1 table2 table3 |
    gzip -c > dump.sql.gz

There's also a mysqldump option to dump only a subset of rows:

$ mysqldump --where "created_at > '2014-03-01'" ...other options...

Of course, that WHERE condition must be valid for every table you dump. That is, if you reference a column that doesn't exist in one of the tables, it's an error.

Mysqldump has no option for selecting a subset of columns.
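
If you do need to drop columns, one workaround (this is not a mysqldump feature; the column names below are made-up examples) is to export a plain SELECT with the mysql client and compress that instead:

$ mysql --batch -e "SELECT id, email, created_at FROM users WHERE finished = 1" mydatabase |
    gzip -c > users.tsv.gz

The result is tab-separated data rather than an importable .sql file, so you'd reload it with LOAD DATA INFILE (after gunzip) instead of a straight import.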


Re your comment:

$ (mysqldump ...first... ; mysqldump ...second...) | gzip -c > dump.sql.gz
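
To run this from the second server rather than on the primary, the same command just needs connection options. A rough sketch, with the host, user, and output path as placeholders, and --single-transaction assuming InnoDB tables so the dump is a consistent snapshot even while the database keeps changing:

$ mysqldump -h primary-server -u backupuser -p --single-transaction mydatabase table1 table2 table3 |
    gzip -c > /backups/dump.sql.gz

The same -h/-u/-p options work inside the subshell form above if you need two dumps with different --where conditions, and for an unattended cron job the password can go in the [client] section of ~/.my.cnf on the backup server (drop the -p in that case).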
Bill Karwin
  • I corrected a mistake in my example. The `--databases` option forces a dump of *all* tables in the named database. – Bill Karwin May 08 '14 at 14:19
  • I would need to trim one of the tables with a WHERE condition but it wouldn't apply to the other, so I guess I could export the two tables separately. Would there be an efficient way to package them together and then compress after exporting both? Also, can I run a mysqldump remotely? I can't store the backup on the primary server, it would need to be transmitted and then stored. – Andrew May 08 '14 at 14:27
  • The edit looks like it would handle packaging the tables the way I need, but can I run that from a different server, and would it be able to handle a database of this size without messing up? – Andrew May 08 '14 at 14:34
  • You should be able to run it remotely using `ssh`. Or else you could set up a cron job to do it. Many sites have much larger databases, so I don't think that will be a problem. – Bill Karwin May 08 '14 at 14:43
  • Yeah, I was going to set up a cron job on the second server to automate this whole thing. My only concern is keeping a connection open for that long -- I assume it's going to take a few hours to dump the database. I'll give it a shot tonight, but this looks like it should work. Would it matter if the database changes while I'm dumping it? How does that get handled? – Andrew May 08 '14 at 14:55
  • Yes, I encourage you to test. I don't know what type of hardware your server is running, if you have a very slow machine then it will take longer. But I expect it to take well under 1 hour. – Bill Karwin May 08 '14 at 15:05
  • Really? Under an hour? My primary server is hosted by DigitalOcean and my secondary one will be too. Primary server has 1GB memory/1 core processor (10$/month plan), secondary will probably have 512MB memory, 1 core processor (5$/month plan). Trying to keep costs low of course. – Andrew May 08 '14 at 15:10
  • I have a 5GB database that only takes 45 minutes to *restore* on an X-Small Amazon instance, or a Linux VM running on my Macbook Pro. Restoring a mysqldump usually takes longer than creating the dump. But comparisons are not as useful as testing of course. – Bill Karwin May 08 '14 at 15:17
  • Yup, I'll give it a go later tonight and I'll check back in to let you know how it works. (And green your answer if all goes well of course.) Thanks a lot – Andrew May 08 '14 at 15:23
  • I created a second server and successfully connected them privately. I can use mysqldump from the command line with "sudo mysqldump -P 3306 -h -u -p " and the database dumps into the terminal, but as soon as I add " | gzip -z > dump.sql.gz" I get "errno 32 on write." Any idea what's causing this? – Andrew May 08 '14 at 23:20
  • I think we're good here! Thanks a lot for your help, definitely couldn't have done this without you. – Andrew May 09 '14 at 00:33
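
For the cron job discussed in these comments, a nightly entry on the backup server could look roughly like this (the script path and log location are placeholders; db-backup.sh would just hold the mysqldump | gzip pipeline from the answer, writing to the same filename each night so yesterday's copy is overwritten):

# m h dom mon dow  command   (added with `crontab -e` on the backup server)
0 3 * * * /home/backupuser/bin/db-backup.sh >> /home/backupuser/db-backup.log 2>&1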

For your compression, you can pipe whatever you use to export your DB into gzip, like so:

mysqldump -uMyUser DatabaseName | gzip -1 > dbdump-`date +'%F-%T'`.gz

That last bit gives the resulting gzip file a timestamped name. If you value space more than CPU cycles, you could experiment with bzip2 and see if that gives you some savings, but in general a .gz should be fine.

Check the mysqldump man page for information on choosing which tables to include (e.g. via --tables).
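
Since the goal here is to overwrite the previous day's backup rather than keep dated copies, a variant with a fixed filename could look like this (the table names and output path are just examples):

mysqldump -uMyUser DatabaseName users orders | gzip -1 > /backups/dbdump-latest.sql.gz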

Josh from Qaribou
  • Would I be able to trim the data within the tables I select? For example, I have a table called "users" that lists all my users, but I want to remove any users that have column "finished" = '0' – Andrew May 08 '14 at 14:19
  • Yes, you can set a --where condition. – Josh from Qaribou May 08 '14 at 14:27
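
A minimal sketch of that --where condition, keeping only finished users (the user and database names are placeholders; users and finished come from the comment above):

mysqldump -uMyUser --where="finished <> '0'" DatabaseName users | gzip -1 > users-dump.sql.gz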

You could use MySQL replication to pull the data to the computer/server where you want the backup to be, then have it create the dump/backup on the backup server so you aren't transferring a 10GB file across the network.
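
Pointing the replica at the primary is done with CHANGE MASTER TO. A very rough sketch, assuming MySQL 5.6, with the host, credentials, and binlog coordinates (taken from SHOW MASTER STATUS on the primary) as placeholders:

$ mysql -u root -p -e "CHANGE MASTER TO
      MASTER_HOST='primary-server',
      MASTER_USER='repl',
      MASTER_PASSWORD='replpass',
      MASTER_LOG_FILE='mysql-bin.000001',
      MASTER_LOG_POS=4;
    START SLAVE;"

Once the slave has caught up, the nightly mysqldump | gzip can run entirely on the backup server against its own local copy.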

Pitchinnate