I want to store a large number of entries in a single Couchbase cache document, in the following format:

{
  "6bc2e1db-3082-47b7-894c-85a28c183ab1": [
    "fr-FR"
    ..
    .. 17 entries
  ],
  "dfb348d2-68d1-47b5-8bc8-11178b84df48": [
    "fr-FR"
    ..
    .. 17 entries
  ],
  "f3601362-6fa4-47b1-b5fd-d5a285c65b34": [
    "fr-FR",
    "bb-BB"
    ..
    .. 15 entries
  ],
  "a92694ea-69f4-46de-814b-185849de4e3f": [
    "fr-FR",
    "aa-AA"
    ..
    .. 10 entries
  ]
  ..
  ..
  .. 300,000 entries
}

There would be about 300,000 keys in total. Each key maps to a list of at most 17 entries, drawn from a fixed set of 17 constant strings. One key's list may contain a single constant, while another's may contain all 17. So the worst case is that all 300,000 keys contain all 17 constants.

I calculated the size, which comes out to around 120 MB. In that calculation, I counted the 17 entries separately for each of the 300,000 keys.

As we know, Java has a String pool, and strings are immutable. So 100 variables may refer to the same String object on the heap, which helps Java save a lot of memory. My question: does the Couchbase cache, or any other cache, implement the same concept of a String pool and immutability? If it does, that would reduce the size of my cache document to around 7 MB.

Anmol Garg
    Couchbase documents are limited to 20 MiB... which might affect your planning for worst-case document sizes :-) – dnault Nov 03 '22 at 23:34

1 Answer

By default, the Couchbase Java SDK uses Jackson to deserialize documents. Jackson already interns property names (unless you disable that feature), but you're asking about array values. You could provide a custom Jackson deserializer that interns your entry strings.
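
To make the effect of interning concrete, here is a minimal stdlib-only sketch. It is not the Jackson deserializer itself: it is a post-processing pass over an already-decoded map, and the class `InternValues` and helper `internValues` are hypothetical names chosen for illustration. In a custom Jackson `JsonDeserializer<String>`, the same `String.intern()` call would go inside `deserialize`. Note that this only shrinks the JVM heap after decoding; the JSON document stored in Couchbase still spells out every string.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InternValues {

    // Hypothetical helper: returns a copy of the decoded map in which every
    // list entry has gone through String.intern(), so equal locale codes
    // share a single String instance on the heap.
    public static Map<String, List<String>> internValues(Map<String, List<String>> decoded) {
        Map<String, List<String>> result = new HashMap<>();
        for (Map.Entry<String, List<String>> entry : decoded.entrySet()) {
            List<String> interned = new ArrayList<>(entry.getValue().size());
            for (String value : entry.getValue()) {
                interned.add(value.intern());
            }
            result.put(entry.getKey(), interned);
        }
        return result;
    }

    public static void main(String[] args) {
        String k1 = "6bc2e1db-3082-47b7-894c-85a28c183ab1";
        String k2 = "dfb348d2-68d1-47b5-8bc8-11178b84df48";

        // new String(...) stands in for a JSON decoder, which typically
        // allocates a fresh String object for every value it reads.
        Map<String, List<String>> decoded = new HashMap<>();
        decoded.put(k1, Arrays.asList(new String("fr-FR")));
        decoded.put(k2, Arrays.asList(new String("fr-FR")));

        Map<String, List<String>> compact = internValues(decoded);

        // Before interning: two distinct objects with equal contents.
        System.out.println(decoded.get(k1).get(0) == decoded.get(k2).get(0)); // false
        // After interning: both lists reference the same pooled object.
        System.out.println(compact.get(k1).get(0) == compact.get(k2).get(0)); // true
    }
}
```

With only 17 distinct constants, interning means the heap holds 17 value strings plus one reference per list slot, rather than one string object per slot. The 300,000 UUID keys are all distinct, so interning would not help with them.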

dnault