2

Have the MonetDB developers tested any other compression algorithms on it before?

Perhaps they have tested other compression algorithms, but those had a negative impact on performance.

So why haven't they improved this database's compression performance?

I am a student from China. MonetDB really interests me, and I want to try to improve its compression performance.

So I should first find out whether anybody has done this before.

I would be grateful if you could answer my question.

I really need this.

Thank you so much.

Chris McCauley

2 Answers

2

MonetDB only compresses string types (VARCHAR and CHAR), using dictionary compression, and only if the number of unique strings in a column is small.
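To illustrate the general idea of dictionary compression (this is only a minimal Python sketch, not MonetDB's actual implementation): each distinct string is stored once, and the column keeps small integer codes. The function names and the `max_distinct` cutoff are hypothetical, chosen to mimic the "only if the number of unique strings is small" rule.

```python
def dictionary_encode(values, max_distinct=256):
    """Return (dictionary, codes), or None if the column has too many distinct strings.

    max_distinct is a hypothetical threshold, not a MonetDB setting.
    """
    dictionary = []   # code -> string
    index = {}        # string -> code
    codes = []
    for v in values:
        code = index.get(v)
        if code is None:
            if len(dictionary) >= max_distinct:
                return None  # too many unique strings: leave the column uncompressed
            code = len(dictionary)
            index[v] = code
            dictionary.append(v)
        codes.append(code)
    return dictionary, codes


def dictionary_decode(dictionary, codes):
    """Rebuild the original column from the dictionary and the codes."""
    return [dictionary[c] for c in codes]


column = ["NL", "DE", "NL", "CN", "DE", "NL"]
encoded = dictionary_encode(column)
```

With few distinct values, the codes fit in one or two bytes each, which is where the space saving comes from.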

Integrating any other kind of compression (even simple schemes such as prefix coding, run-length encoding, or delta compression) needs a complete rewrite of the system, because the operators have to be made compression-aware (which pretty much means creating new operators).
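A small Python sketch of what "compression-aware" means, using run-length encoding as the example. This is not MonetDB code and all names are made up for illustration. The point is that the sum over the compressed column is a different function from the sum over the plain column, and the same would be true for every other operator (selection, join, grouping, ...).

```python
def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def sum_plain(values):
    """Ordinary operator: works on an uncompressed column."""
    return sum(values)


def sum_rle(runs):
    """Compression-aware operator: works directly on the runs, no decompression."""
    return sum(v * n for v, n in runs)


column = [7, 7, 7, 7, 3, 3, 9, 9, 9]
runs = rle_encode(column)
```

Here `sum_rle(runs)` gives the same result as `sum_plain(column)` while touching three runs instead of nine values, but it had to be written separately for the compressed layout.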

The only thing that may be feasible without a complete rewrite is having dedicated compression operators that compress/decompress data instead of spilling to disk. However, this would be very close to the memory compression Apple implemented in OS X Mavericks.

Holger
  • But I have another question. If I write a software layer on top of MonetDB that compresses the data before it is written to MonetDB and decompresses it after it is read back, could you give me some advice on whether this plan can work, and whether the performance would be acceptable? – user2994562 Nov 18 '13 at 01:35
  • I think you are misunderstanding the idea of Stack Overflow. It is not a discussion forum but a Q&A system. If you think your question has been answered, you should accept the answer and pose a new question if you have one. However, your follow-up question is not very specific and is hard to answer. Think up a clear, concise question and post it! – Holger Nov 19 '13 at 09:14
  • OK. I'm not familiar with this way of communicating, sorry for that. Thank you for your help again. – user2994562 Nov 20 '13 at 01:51
2

MonetDB compresses columns using PFOR compression. See http://paperhub.s3.amazonaws.com/7558905a56f370848a04fa349dd8bb9d.pdf for details. This also answers your question about checking other compression methods.

PFOR was chosen because of the way modern CPUs work, but really any algorithm that works (only) with arithmetic rather than branches will do just fine. I've hit similar speeds with arithmetic coding in the past.
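For readers who have not seen it, here is a simplified Python sketch of the patched frame-of-reference idea. It is not the code from the paper: real PFOR bit-packs the codes and, as far as I recall, chains the exceptions through the code slots, while this sketch keeps plain lists and a separate exception list. All names, including the `bits` parameter, are assumptions for illustration, and the block is assumed to be non-empty.

```python
def pfor_encode(values, bits):
    """Store each value as a small offset from the block minimum.

    Values whose offset does not fit in `bits` bits become exceptions,
    kept aside as (position, original value) pairs.
    """
    base = min(values)
    limit = 1 << bits
    codes = []
    exceptions = []
    for pos, v in enumerate(values):
        offset = v - base
        if offset < limit:
            codes.append(offset)
        else:
            codes.append(0)               # placeholder, patched on decode
            exceptions.append((pos, v))
    return base, codes, exceptions


def pfor_decode(base, codes, exceptions):
    """Decode in two passes: arithmetic only, then patch the exceptions."""
    out = [base + c for c in codes]       # no per-value branch in this pass
    for pos, v in exceptions:             # short second pass over the outliers
        out[pos] = v
    return out


block = [100, 103, 101, 100, 5000, 102, 107, 100]
base, codes, exceptions = pfor_encode(block, bits=3)
```

The first decode pass does the same addition for every value regardless of whether it is an exception, which is the branch-free part. Python itself will not show the CPU-level benefit; the sketch only shows the structure.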

atlaste