
I understand that CouchDB views are pre-computed. Now I'm wondering what the storage cost is per view. How can this be estimated? Is it the raw JSON size of the emitted data?

To be more specific, I'm using BigCouch (Cloudant).

  • CouchDB tends to eat disk for breakfast, since it quite aggressively trades disk space for performance. It's very hard to estimate disk use for "general data", but count on it being several times the JSON size. I just checked a local database, and it currently seems to be around 45x(!) the raw data size, admittedly not well compacted. – Joachim Isaksson Aug 31 '13 at 17:54

1 Answer


I can't give you a rule for estimation, but there are several factors to consider here:

  • CouchDB uses append-only storage, so your database (and view) files also grow when you update existing data, not only when you add new data. Compaction is needed to reclaim the unused space.
  • The data size versus on-disk size of a view index can be read from the _info endpoint of a design document.
  • CouchDB uses a B-tree data structure for indexing, so a view requires the space of serialized JSON + some overhead for the tree
  • Since version 1.2, CouchDB compresses the database and view files by default, using the snappy algorithm.
  • If you are interested in the internals, there are discussions here, here, here and here.
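
As a rough illustration of the _info point above, here is a minimal Python sketch that reads the two sizes from a design document's _info response and computes how much larger the file on disk is than the live data. The field names are an assumption that depends on the server version: older CouchDB releases report `data_size` and `disk_size` under `view_index`, while newer ones nest `active` and `file` under `view_index.sizes`, so check what your server actually returns. The helper names, the base URL parameter, and the sample numbers are all hypothetical.

```python
import json
import urllib.request


def fetch_design_info(base_url, db, ddoc):
    """GET /{db}/_design/{ddoc}/_info and return the parsed JSON body."""
    url = f"{base_url}/{db}/_design/{ddoc}/_info"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def view_index_sizes(info):
    """Return (data_size, disk_size) in bytes from an _info response."""
    index = info["view_index"]
    if "sizes" in index:
        # Assumed newer layout: sizes.active (live data) and sizes.file (on disk)
        sizes = index["sizes"]
        return sizes["active"], sizes["file"]
    # Assumed older layout: data_size (live data) and disk_size (on disk)
    return index["data_size"], index["disk_size"]


def overhead_ratio(info):
    """On-disk size divided by live data size; large values suggest compaction would help."""
    data, disk = view_index_sizes(info)
    return disk / data if data else float("inf")


# Hypothetical response body, made up for illustration only
sample = {
    "name": "example",
    "view_index": {"data_size": 2_000_000, "disk_size": 9_000_000},
}
print(overhead_ratio(sample))  # 4.5
```

Against a real server you would pass the result of `fetch_design_info(...)` to `overhead_ratio` instead of the sample. Comparing the ratio before and after compaction gives a feel for how much of the file is reclaimable append-only garbage rather than B-tree overhead.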
Red15
Stefan Kögl