
A while back, I asked this question about a ChronicleMap being used as a Map<String,Set<Integer>>. Basically, we have a collection where the average Set<Integer> has around 400 elements, but the maximum size is 20,000. With ChronicleMap 2, this was causing a rather vicious JVM crash. I moved to ChronicleMap 3.9.1 and now get an exception instead (at least it's not a JVM crash):

java.lang.IllegalArgumentException: Entry is too large: requires 23045 chucks, 6328 is maximum.
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.allocReturnCode(CompiledMapQueryContext.java:1760)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.allocReturnCodeGuarded(CompiledMapQueryContext.java:120)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.alloc(CompiledMapQueryContext.java:3006)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.initEntryAndKey(CompiledMapQueryContext.java:3436)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.putEntry(CompiledMapQueryContext.java:3891)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.doInsert(CompiledMapQueryContext.java:4080)
    at net.openhft.chronicle.map.MapEntryOperations.insert(MapEntryOperations.java:156)
    at net.openhft.chronicle.map.impl.CompiledMapQueryContext.insert(CompiledMapQueryContext.java:4051)
    at net.openhft.chronicle.map.MapMethods.put(MapMethods.java:88)
    at net.openhft.chronicle.map.VanillaChronicleMap.put(VanillaChronicleMap.java:552)

I suspect this is still because I have values that are far outliers from the mean. I assume ChronicleMap determined the maximum number of chunks to be 6328 based on the average value I gave the builder, but didn't expect a gigantic value that needed 23,045 chunks.

So my question is: what's the best way to go about solving this? Some approaches I'm considering, but still not sure on:

  1. Use ChronicleMapBuilder.maxChunksPerEntry or ChronicleMapBuilder.actualChunkSize. That said, how do I deterministically figure out what those should be set to? And won't setting them too high lead to a lot of fragmentation and slower performance? (A rough sketch of what I mean follows this list.)
  2. Have a "max collection size" and split the very large collections into many smaller ones, setting the key accordingly. For example, if my key is XYZ and it yields a Set<Integer> of size 10,000, I could split that into 5 keys XYZ:1, XYZ:2, etc., each with a set of size 2,000. This feels like a hack around something I should just be able to configure in ChronicleMap, though, and it results in a lot of code that feels like it shouldn't be necessary. I mentioned this same plan in my other question, too.
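
To make option 1 more concrete, here is roughly what I have in mind (just a sketch: the maxChunksPerEntry value is a placeholder, I'm deriving entries() and averageValue() from the in-memory Map<String, Set<Integer>> I already have, and I understand the entry size may also be capped by other limits such as the segment tier size):

    import java.util.Comparator;
    import java.util.Map;
    import java.util.Set;

    import net.openhft.chronicle.map.ChronicleMap;

    public class SizingFromRealDataSketch {

        // Builds the ChronicleMap from the in-memory map I already have, so
        // entries() and averageValue() come from the real distribution rather
        // than a guess.
        @SuppressWarnings("unchecked")
        public static ChronicleMap<String, Set<Integer>> build(Map<String, Set<Integer>> source) {
            double avgSize = source.values().stream().mapToInt(Set::size).average().orElse(1);

            // The set whose size is closest to, but not below, the average
            // (the same set I currently pass to averageValue()).
            Set<Integer> typical = source.values().stream()
                    .filter(s -> s.size() >= avgSize)
                    .min(Comparator.<Set<Integer>>comparingInt(Set::size))
                    .orElseThrow(IllegalStateException::new);

            return ChronicleMap
                    .of(String.class, (Class<Set<Integer>>) (Class<?>) Set.class)
                    .entries(source.size())
                    .averageValue(typical)
                    // Placeholder: would need to cover at least the number of chunks
                    // the largest value requires (23,045+ in the failing case).
                    .maxChunksPerEntry(30_000)
                    .create();
        }
    }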

Other thoughts/ideas are appreciated!

Depressio
  • How many entries do you specify in your map, via `entries()`? – leventov Dec 14 '16 at 06:15
  • A few thousand, I believe (I don't have exact numbers in front of me). Before I create the map, though, I can figure out exact statistics of what I'm putting in there because I have a `Map<String, Set<Integer>>` in memory already. `entries()` is set based on that map's size; I calculate the average entry from the average size of the sets (well, I use `averageValue()` with the set closest to the average size, but above it). – Depressio Dec 14 '16 at 17:08

1 Answer


If you don't specify maxChunksPerEntry() manually, the maximum entry size is limited by the segment tier size, in chunks. So what you need is to make the segment tier size larger. The first thing to try is configuring actualSegments(1), if you are not going to access the map concurrently from multiple threads within the JVM. You have additional control over these configurations via ChronicleMapBuilder.actualChunkSize(), actualChunksPerSegmentTier() and entriesPerSegment().
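
For illustration, something along these lines (just a sketch: the numbers are placeholders rather than recommendations, and actualSegments(1) is only an option if you do not need concurrent access within the JVM):

    import java.util.Set;

    import net.openhft.chronicle.map.ChronicleMap;

    public class LargerSegmentTierSketch {

        // Sketch only: the sizing numbers are placeholders to show which knobs
        // exist, not tuned recommendations. entriesPerSegment() and
        // actualChunksPerSegmentTier() are the lower-level alternatives.
        @SuppressWarnings("unchecked")
        public static ChronicleMap<String, Set<Integer>> build(Set<Integer> typicalValue) {
            return ChronicleMap
                    .of(String.class, (Class<Set<Integer>>) (Class<?>) Set.class)
                    .entries(20_000)
                    .averageValue(typicalValue) // a representative value, ~400 Integers
                    .actualSegments(1)          // one segment => one large tier, at the cost
                                                // of concurrent access within the JVM
                    .actualChunkSize(256)       // bytes per chunk; larger chunks mean an
                                                // outlier value needs fewer of them
                    .create();
        }
    }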

By default, ChronicleMapBuilder chooses a chunk size between 1/8 and 1/4 of the configured average value size. So if your segment tier size is 6328 chunks, your segments are configured to hold about 1,000 entries, i.e. an average entry should take roughly 6 chunks. If your average value set has 400 elements and the maximum is 20,000, the difference between the average and the maximum should be about 50x, but from the stack trace it looks like one of your entries (23,045 chunks) is well over 2,000 times larger than the average. Probably you have not accounted for something.

Also, for such big values I suggest developing and using a more memory-efficient value serializer, because the default one will generate a lot of garbage. E.g. it could use a primitive IntSet implementation of Set<Integer> from the fastutil, Koloboke or Koloboke Compile libraries.
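
For illustration, a rough sketch of such a serializer using fastutil's IntOpenHashSet (which implements Set<Integer>). The exact BytesWriter/BytesReader signatures, the enum-singleton form, and the builder hook-up mentioned afterwards are my recollection of the Chronicle Map 3 serialization API, so verify them against the serialization tutorial for your version:

    import it.unimi.dsi.fastutil.ints.IntIterator;
    import it.unimi.dsi.fastutil.ints.IntOpenHashSet;

    import net.openhft.chronicle.bytes.Bytes;
    import net.openhft.chronicle.hash.serialization.BytesReader;
    import net.openhft.chronicle.hash.serialization.BytesWriter;

    // Writes an IntOpenHashSet as a count followed by raw ints, avoiding the
    // boxing and reflection of the default, Serializable-based marshalling.
    // Depending on the Chronicle Map version, a marshaller may also need to
    // implement ReadResolvable and/or the Marshallable callbacks.
    public enum IntSetMarshaller
            implements BytesWriter<IntOpenHashSet>, BytesReader<IntOpenHashSet> {
        INSTANCE;

        @Override
        public void write(Bytes out, IntOpenHashSet toWrite) {
            out.writeInt(toWrite.size());
            for (IntIterator it = toWrite.iterator(); it.hasNext(); ) {
                out.writeInt(it.nextInt());
            }
        }

        @Override
        public IntOpenHashSet read(Bytes in, IntOpenHashSet using) {
            int size = in.readInt();
            IntOpenHashSet set = (using != null) ? using : new IntOpenHashSet(size);
            set.clear();
            for (int i = 0; i < size; i++) {
                set.add(in.readInt());
            }
            return set;
        }
    }

You would then use IntOpenHashSet as the value class and register the marshaller via the builder's value-marshaller configuration (valueMarshallers(reader, writer) on ChronicleMapBuilder in 3.x, if I remember the method correctly), alongside averageValue() and the sizing configuration above.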

Also, I suggest using the latest version available; Chronicle Map 3.9.1 is already outdated.

leventov
  • OK, so I actually ran some debug and got real numbers for the collection... was way off on the number of entries. The `entries()` is set to 18,236, the average collection size is 440, and the maximum is 75,453. It's failing when trying to add a collection of size 23,099 (so not even the max). It is possible to access it through multiple threads in the JVM (and other JVMs), so `actualSegments(1)` is off the table. I'm still unsure of how to deterministically set `actualChunkSize()` or any of the other related methods. – Depressio Dec 15 '16 at 17:49
  • I debugged into it further. I screwed up: I'm not actually using the calculated average-size collection, but a default of 100, because of some invalid logic in my code. This is probably why things aren't right. Nonetheless, your comment about things not quite adding up is absolutely correct... the story wasn't accurate. – Depressio Dec 15 '16 at 19:20