I would like to store a couple billion key/value pairs in MapDB. I have specific requirements:
key = long (8 bytes); value = at most 5 entries of (UUID + from-date + to-date), 32 bytes each, so 5 × 32 = 160 bytes.
The app is single-threaded. The workload is: load by key, then change one of the dates in the entries. The size of the value never changes.
I wonder what the most efficient setup is for gets and puts. Do I gain performance by pre-allocating a 160-byte array and using that as the value, or does it not matter and I should just use a byte array of flexible size?
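For reference, this is roughly how I lay out the fixed 160-byte value (the class and method names here are just illustrative; it's plain java.nio.ByteBuffer, nothing MapDB-specific). Dates are stored as epoch-millis longs:

```java
import java.nio.ByteBuffer;
import java.util.UUID;

public class FixedValueCodec {
    static final int ENTRY_SIZE = 32;                       // 16-byte UUID + two 8-byte dates
    static final int MAX_ENTRIES = 5;
    static final int VALUE_SIZE = MAX_ENTRIES * ENTRY_SIZE; // 160 bytes

    // Write one entry into the fixed-size value at the given slot (0..4).
    static void putEntry(byte[] value, int slot, UUID id, long fromDate, long toDate) {
        ByteBuffer buf = ByteBuffer.wrap(value, slot * ENTRY_SIZE, ENTRY_SIZE);
        buf.putLong(id.getMostSignificantBits());
        buf.putLong(id.getLeastSignificantBits());
        buf.putLong(fromDate);
        buf.putLong(toDate);
    }

    // Overwrite only the to-date of one entry; the value's length never changes.
    static void setToDate(byte[] value, int slot, long toDate) {
        ByteBuffer.wrap(value, slot * ENTRY_SIZE + 24, 8).putLong(toDate);
    }

    static long getToDate(byte[] value, int slot) {
        return ByteBuffer.wrap(value, slot * ENTRY_SIZE + 24, 8).getLong();
    }

    public static void main(String[] args) {
        byte[] value = new byte[VALUE_SIZE];                // one pre-allocated 160-byte value
        putEntry(value, 0, UUID.randomUUID(), 1_000L, 2_000L);
        setToDate(value, 0, 3_000L);
        System.out.println(value.length + " " + getToDate(value, 0));
    }
}
```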
Currently I have:
DB db = DBMaker
.newFileDB(dbFile)
.asyncWriteEnable()
.asyncWriteFlushDelay(100)
.transactionDisable()
.make();
Pump:
BTreeKeySerializer keySerializer = BTreeKeySerializer.ZERO_OR_POSITIVE_LONG;
Map<Long, byte[]> map = db.createTreeMap("map")
.pumpSource(source)
.keySerializer(keySerializer)
.make();
where source is
Iterator<Fun.Tuple2<Long, byte[]>> source = new Iterator<Fun.Tuple2<Long, byte[]>>()
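Roughly, the source generates pairs like this (sketched with Map.Entry standing in for Fun.Tuple2 so it compiles without MapDB on the classpath; the count and the zeroed values are placeholders):

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.Map;

public class PairSource {
    // A minimal source of (long key, 160-byte value) pairs. Keys count down,
    // since the MapDB 1.x data pump expects descending key order, if I recall correctly.
    static Iterator<Map.Entry<Long, byte[]>> source(final long count) {
        return new Iterator<Map.Entry<Long, byte[]>>() {
            long next = count;

            public boolean hasNext() { return next > 0; }

            public Map.Entry<Long, byte[]> next() {
                next--;
                return new AbstractMap.SimpleEntry<Long, byte[]>(next, new byte[160]);
            }

            public void remove() { throw new UnsupportedOperationException(); }
        };
    }

    public static void main(String[] args) {
        Iterator<Map.Entry<Long, byte[]>> it = source(3);
        int n = 0;
        while (it.hasNext()) {
            Map.Entry<Long, byte[]> pair = it.next();
            n++;
        }
        System.out.println(n); // 3
    }
}
```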
Loading:
Map<Long, byte[]> map = db.<Long, byte[]>getTreeMap("map");
After using the data pump to load the map with 20,000,000 items (performance degrades over time), lookup performance is a bit disappointing:
200,000 lookups in 199,999 ms (about 1,000 per second)
Lookup speed increases dramatically after running my test app a second time:
200,000 lookups in 7,597 ms.
Is there anything I can do to improve performance, given the fixed sizes of the keys and values? Any options I can enable?
I used a TreeMap because it has a data pump. Would performance increase significantly if I used a HashMap instead?
Cheers!