I need to store documents containing several arrays, each with possibly hundreds of thousands of integers, in a CouchDB database. I have done some testing with node.js and the nano package by filling arrays with random numbers. I first tested with integers up to a maximum value of 60000 (which should fit in 2 bytes) and then with a maximum value of 255 (which should fit in 1 byte). The document size was about twice as large with the larger values, so it seems that CouchDB chooses the storage size dynamically depending on the integer value. Is this correct?

The problem is that, judging by document size, CouchDB seems to use two bytes per integer when the max value is 255 and five bytes when the max value is 60000. This results in unnecessary disk space usage. Is there a way to specify that I want to use 16-bit integers?
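
For reference, the test described above could be sketched roughly as follows. This is only an illustration, not the exact script used: the helper names `buildDoc` and `measure`, the server URL, and the database name are all hypothetical, and `measure` assumes a running CouchDB instance plus a promise-based version of nano.

```javascript
// Build a test document holding `count` random integers in [0, maxValue].
function buildDoc(count, maxValue) {
  const data = new Array(count);
  for (let i = 0; i < count; i++) {
    data[i] = Math.floor(Math.random() * (maxValue + 1));
  }
  return { data };
}

// Insert the document into a fresh database and return the database info
// that CouchDB reports (it includes the size fields).
// Assumption: `url` points at a running CouchDB server and nano is installed.
async function measure(url, dbName, doc) {
  const nano = require('nano')(url); // loaded lazily so buildDoc works without nano
  await nano.db.create(dbName);
  await nano.db.use(dbName).insert(doc);
  return nano.db.get(dbName);
}

// Hypothetical usage:
// measure('http://localhost:5984', 'size_test_255', buildDoc(100000, 255))
//   .then((info) => console.log(info));
```

Running it once per maximum value, each time against a new database, keeps the two measurements independent of each other.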

Jonathan Hall
user2563661
  • That clarifies a bit. However, I am wondering why values larger than 255 take more disk space. For example, I tested that 256 uses about double the disk space that 255 uses. – user2563661 Oct 02 '18 at 10:37
  • How are you testing that? – Jonathan Hall Oct 02 '18 at 11:32
  • I first set file_compression to none for testing. Then I create a new database and insert one document with an array of 1000000 elements, all equal to 255: { data: [255, 255, 255 ...] }. The data size for this database is now 2000819. If I create a new database and do the same with values of 256: { data: [256, 256, 256 ...] }, the data size is now 5001552. So it's actually more than twice as large. – user2563661 Oct 02 '18 at 12:43
  • Interesting. It seems CouchDB is obviously using a custom on-disk format. I suppose that shouldn't surprise me. – Jonathan Hall Oct 02 '18 at 13:07
  • Maybe integers smaller than 256 are actually saved on disk as one-byte integers, with one extra byte needed for something else, for a total of 2 bytes? Larger integers would then be saved as 4-byte floating point values plus 1 extra byte, for a total of 5 bytes? This would explain why 2-byte integers such as 256 or 60000 use more space than expected. – user2563661 Oct 02 '18 at 13:23
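
One hypothesis that fits the numbers in the comments: CouchDB is written in Erlang, and if it serializes document bodies with Erlang's external term format (an assumption here, not something confirmed in this thread), then integers from 0 to 255 are encoded as a 1-byte tag plus 1 value byte, and other integers that fit in 32 bits as a 1-byte tag plus a 4-byte integer (not a float). The sketch below models only that encoding. The function names are made up for illustration, and it ignores all document, block, and B-tree overhead.

```javascript
// Estimated bytes for one integer in Erlang's external term format.
function etfIntegerSize(n) {
  if (!Number.isInteger(n)) throw new TypeError('expected an integer');
  if (n >= 0 && n <= 255) return 2; // SMALL_INTEGER_EXT: tag byte + 1 value byte
  if (n >= -2147483648 && n <= 2147483647) return 5; // INTEGER_EXT: tag byte + 4 value bytes
  throw new RangeError('larger integers use a bignum encoding not modelled here');
}

// Estimated bytes for a list of integers in the same format.
function etfListSize(values) {
  if (values.length === 0) return 1; // NIL_EXT: tag byte only
  const allBytes = values.every((v) => Number.isInteger(v) && v >= 0 && v <= 255);
  if (allBytes && values.length <= 65535) {
    return 1 + 2 + values.length; // STRING_EXT: tag + 2-byte length + 1 byte per element
  }
  let total = 1 + 4 + 1; // LIST_EXT: tag + 4-byte length, plus the NIL_EXT tail
  for (const v of values) total += etfIntegerSize(v);
  return total;
}

console.log(etfListSize(new Array(1000000).fill(255))); // 2000006
console.log(etfListSize(new Array(1000000).fill(256))); // 5000006
```

These estimates are close to the reported data sizes of 2000819 and 5001552, with the remainder presumably being storage overhead. Note that under this model a short list (at most 65535 elements) of values in 0..255 would take only about one byte per element, so the 2-byte figure would apply only to long arrays like the ones tested.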

0 Answers