
One of my clients wants us to give a safe estimate of how much disk space a compressed SQL backup file will take. They need to size their server hard disks accordingly.

We can roughly estimate the SQL database backup size (based on the estimated number of records, etc.), but what about when it is compressed?

I know this is hard to answer, but just from your experience, what would be a typical compression ratio of an SQL backup file?
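
The quickest sanity check, if we can get even a small test copy of the database, is probably to take a real backup, compress it, and measure the result. A minimal shell sketch (the filename is a placeholder):

    # Compress a copy of a sample backup and compare the two sizes
    gzip -c sample_backup.bak > sample_backup.bak.gz
    ls -lh sample_backup.bak sample_backup.bak.gz
    # compression ratio = compressed size / uncompressed size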

user1034912
  • Run a backup, compress it, and find out for yourself. There is simply no way anyone other than *you* can give a sane answer to this question. – EEAA Mar 05 '13 at 00:39
  • You don't normally back up indexes either, and if you have more than a few of them their total size can exceed that of the data. – Ladadadada Mar 05 '13 at 07:26
  • Why was this closed as not a real question? Even when it can't be reasonably answered, I don't find it ambiguous, vague, incomplete, overly broad, or rhetorical, and it's not difficult to tell what is being asked. – Mario Trucco Jun 30 '16 at 14:17

2 Answers


It's impossible to answer this question without at least a small representative sample of the data in the database. If your DB is full of MPEG videos, then even using PAQ isn't going to compress the data more than a few percent.

If your DB is full of the same byte repeated, then it's going to compress to an extremely small size, maybe a 99.9% reduction or better. Your data is probably somewhere in between, so it'll compress somewhere between 0.1% and 99.9%.

The backup industry likes to advertise that "most" data can be 50% compressed. I've found this to be optimistic, but not terribly far off. I've found our MSSQL DBs generally compress less than 30%, however.
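
As a rough illustration of those two extremes, compressing highly repetitive data versus incompressible random data shows the full spread (a shell sketch assuming GNU coreutils; sizes are approximate):

    # 100 MB of identical bytes: gzip shrinks it to roughly 100 KB (~99.9% reduction)
    head -c 100M /dev/zero | gzip -c | wc -c
    # 100 MB of random bytes: the gzipped output ends up slightly larger than the input
    head -c 100M /dev/urandom | gzip -c | wc -c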

Chris S
  • Also if you have grown your data files to be substantially larger than their data sets, then you can get (seemingly) awesome compression ratios that decrease as your free space is used. – Mark Henderson Mar 05 '13 at 04:43

This is tough to answer, but I just did a dump today on one of my databases (running PostgreSQL 9.2):

select pg_size_pretty(pg_database_size('mydb'));
 pg_size_pretty
----------------
 5660 MB
(1 row)

And after the dump:

% ls -lh

-rw-r--r--  1 pgsql  pgsql   363M Mar  4 16:42 mydb-20130304.sql.gz

So that's a 93.5% decrease in size.
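
For reference, a compressed dump like that is typically produced with something along these lines (a sketch, not necessarily the exact commands used above; the database name is a placeholder):

    # Check the logical size of the database, then dump it through gzip
    psql -c "select pg_size_pretty(pg_database_size('mydb'));"
    pg_dump mydb | gzip > mydb-$(date +%Y%m%d).sql.gz
    ls -lh mydb-*.sql.gz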

phrac
  • That's incredibly useless.... – Chris S Mar 05 '13 at 02:29
  • FWIW, I ran that same experiment and got a 92.7% reduction: select pg_total_relation_size('mytable') returns 5659951104 (5.7GB). A dump file of the same table (format=custom, default compression level) is size 413748266 (414MB). The table in question has many repeated text strings and numeric values, making it eminently compressible. – user207863 Feb 19 '16 at 15:34
  • I find this useful. Sure, the poster could have a database full of nothing but zeroes, or it could be a database of random bytes sequences, or it could be 50% of each, etc etc. But most of the world's databases are tables of Students and Products and stuff, so I think "90%" is a useful starting point - certainly more useful than just saying "It depends". – Jason Crease Jun 27 '22 at 13:44