-1

MYSQL's InnoDb storage engine supports compression. However it seems to be only compressing each row separately... which ignores redundancy between rows...

Can multi-row compression be enabled with mysql?

My raw table size is 1.2 GB... however the optimal per-row entropy (pen and paper) would give about .6 GB which is what happens when I enable innodb compression. However if I mysqldump|bzip2 then I get a bzip'ed file as small as 94 MB. This is a factor of 6.3 improvement! How can I enable multi-row compression?

user3338098
  • 907
  • 1
  • 17
  • 38
  • 1
    What are the data types of each field and what data are you exactly storing? This is important as it could save you tons of bytes. For example, if you're storing a single-digit number in a field defined as a bigint or a large char then you're wasting space for nothing. –  Mar 20 '15 at 21:19
  • raw row size is exactly 62 bytes, however no more than half of it is actually used the rest is all zero padding... – user3338098 Mar 23 '15 at 15:35
  • the data is continuously sampled data that is almost the same between rows... 100.1 100.2 100.1 100.3 100.1 100.1 100.2.... etc... – user3338098 Mar 23 '15 at 17:10

1 Answers1

0

bzip2 uses a different compression algorithm, which is like comparing apples to oranges.

From the MySQL documentation:

You can get a rough estimate of how efficiently your data can be compressed by using a utility that implements LZ77 compression (such as gzip or WinZip) on a copy of the .ibd file for an uncompressed table. You can expect less compression from a MySQL compressed table than from file-based compression tools, because MySQL compresses data in chunks based on the page size, 16KB by default.

The documentation states that MySQL doesn't currently support larger page sizes for InnoDB.

David K-J
  • 930
  • 7
  • 14
  • `mysqldump|gzip --fast` is only 185MB which is still 3 and half times better than innodb compression... gzip has a compression block size of 32KB which shouldn't make to much of a difference if each row is a mere 62 bytes in raw uncompressed form... which leads me to conclude that innodb does no compression between rows... – user3338098 Mar 23 '15 at 17:08
  • I tried tokudb and it got me down to 224 MB which is 2.8 times better than innodb, and i'll accept twice the size of optimal as an acceptable compromise. – user3338098 Mar 23 '15 at 21:53