I am using Chronicle Map to temporarily store and look up a very large number of KV pairs (several billion, in fact). I don't need durability or replication, and I'm using memory-mapped files rather than pure off-heap memory. Average key length is 8 bytes.
For smallish data sets (up to 200 million entries) I get throughput of around 1 million entries per second, i.e. it takes approximately 200 seconds to create the entries, which is stunning. By 400 million entries, however, the map has slowed down significantly and it takes 1500 seconds to create them (roughly 270,000 entries per second).
I have run tests on both a Mac OS X machine (16GB RAM, quad core, 500GB SSD) and a ProLiant G6 server running Linux (8 cores, 64GB RAM, 300GB RAID 1, not SSD). Both platforms exhibit the same behaviour.
If it helps, here's the map setup:
try {
    f = File.createTempFile(name, ".map");
    catalog = ChronicleMapBuilder
            .of(String.class, Long.class)
            .entries(size)
            .averageKeySize(8)
            .createPersistedTo(f);
} catch (IOException ioe) {
    // blah
}
And a simple writer test:
// negate the start time so that (now + currentTimeMillis()) prints elapsed ms
long now = -System.currentTimeMillis();
long count = 400_000_000L;

for (long i = 0; i < count; i++) {
    catalog.put(Long.toString(i), i);
    if ((i % 1_000_000) == 0) {
        System.out.println(i + ": " + (now + System.currentTimeMillis()));
    }
}
System.out.println(count + ": " + (now + System.currentTimeMillis()));
catalog.close();
So my question is: is there some sort of tuning I can do to improve this, e.g. changing the number of segments or using a different key type (such as CharSequence), or is this simply an artefact of the OS paging such large files?
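For reference, this is the kind of variant I had in mind, sketched against what I understand of the Chronicle Map 3.x builder API (the segment count of 512 and the "catalog" file name are just placeholders for illustration, not values I've tested):

import java.io.File;
import java.io.IOException;

import net.openhft.chronicle.map.ChronicleMap;
import net.openhft.chronicle.map.ChronicleMapBuilder;

public class CatalogTuningSketch {

    public static void main(String[] args) throws IOException {
        long size = 400_000_000L;
        File f = File.createTempFile("catalog", ".map");

        // CharSequence keys instead of String, averageKey() sized from a
        // representative sample key, and actualSegments() as the knob for
        // the segment count (512 is an arbitrary guess here).
        ChronicleMap<CharSequence, Long> catalog = ChronicleMapBuilder
                .of(CharSequence.class, Long.class)
                .entries(size)
                .averageKey(Long.toString(size / 2))
                .actualSegments(512)
                .createPersistedTo(f);

        // quick smoke test: String implements CharSequence, so plain
        // string literals work as keys
        catalog.put("123456789", 123456789L);
        System.out.println(catalog.get("123456789"));

        catalog.close();
    }
}

I'm happy to benchmark variations along these lines if they're likely to help, but I'd like to know whether the slowdown is something tuning can address at all.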