9

I'm looking for a way to store a massive quantity of information using as little disk space as possible.

The data structure is very simple, and the queries will be very simple too. I've looked at solutions like Apache Cassandra and relational databases, but I couldn't find a comparison that mentions disk usage.

Any ideas on this would be great.

Hugo Palma
  • Your question does not really provide enough information to answer. How much data is a massive quantity? How many writes per second do you anticipate? Do you need low-latency read access, or will you be accessing items in batches? What indexes will you need for retrieving the data later? – Eric Hauser May 05 '10 at 18:15
  • Just buy a bigger hard drive. – justin.m.chase Apr 21 '10 at 17:00
  • Sorry, that doesn't really answer my question. I'm looking for a way to optimize disk usage. – Hugo Palma Apr 21 '10 at 17:02
  • lol, that's actually a relevant point. How much data are we really talking about here? 10GB? 100GB? 1TB? – BradC Apr 21 '10 at 17:04
  • The goal is to deploy the database on a shared hosting site that has disk space limits. Raising those limits has a serious impact on the fixed monthly cost of the solution, so it's not as easy as buying a new hard drive. – Hugo Palma Apr 21 '10 at 17:11

4 Answers

3

Speaking of Apache Cassandra: in my experience it's a disk space hog. 200 MB of logs resulted in 1.2 GB of files produced by Cassandra (roughly a sixfold increase), and the keyspace had just 4 columns holding strings of length 200.

sha1dy
2

Take a look at Oracle Berkeley DB, a very simple, robust key/value database:

"Berkeley DB enables the development of custom data management solutions, without the overhead traditionally associated with such custom projects. Berkeley DB provides a collection of well-proven building-block technologies that can be configured to address any application need from the handheld device to the datacenter, from a local storage solution to a world-wide distributed one, from kilobytes to petabytes."

Oleg Razgulyaev
2

Redis might be worth a look if you can store your data as key-value pairs.

Sundar
0

The newest version of Microsoft SQL Server (2008) supports several levels of compression: row compression and page compression, in addition to backup compression. It might be worth investigating.

Some relevant resources:

BradC
  • Does compressed data stay in read/write mode? MySQL also has data compression, but all compressed data is read-only. – Hugo Palma Apr 21 '10 at 16:58
  • PostgreSQL also supports on the fly compression: http://www.postgresql.org/docs/current/static/storage-toast.html – janneb Apr 21 '10 at 17:29
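
How much the compression features mentioned in this thread would actually save depends on how compressible the data is. As a rough sanity check before choosing a database, one could compress a sample of the records and look at the ratio. The sketch below is only an illustration: it uses Python's standard `zlib` as a stand-in (the databases use their own algorithms, so real savings will differ), and the sample rows and the `compression_ratio` helper are hypothetical, not taken from the question.

```python
import zlib


def compression_ratio(records, level=6):
    """Return raw_size / compressed_size for an iterable of byte strings.

    Records are joined with newlines and compressed as one block, which
    only approximates how a database compresses pages or rows.
    """
    raw = b"\n".join(records)
    compressed = zlib.compress(raw, level)
    return len(raw) / len(compressed)


# Hypothetical sample: 1,000 log-like rows with four short string columns.
sample = [
    f"{i},user{i % 50},GET /index.html,200".encode("utf-8")
    for i in range(1000)
]

ratio = compression_ratio(sample)
print(f"raw/compressed ratio: {ratio:.1f}x")
```

A ratio close to 1 would suggest that database-level compression is unlikely to help much, while a high ratio suggests it is worth testing the real engine on real data.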