I guess I'm looking for a sparse array implementation, but I really need it to be memory-efficient. One peculiarity of my data that an implementation could take advantage of is that the indices are populated in runs: if a value is present for index `i`, then `i-1` and `i+1` are also likely to have values present, and similarly if no value is present for `i`, then `i-1` and `i+1` are likely to be empty too.
I'm working in Java, and I need the index type to be `long` rather than the more usual `int`, if that makes a difference. I have approximately 50 million objects that will need to be stored. I've looked into Trove4J's `TLongObjectHashMap`, but unfortunately it would require around 1.6 GB for the hash table alone, and I really need to improve on that.
Can anyone point me towards something that can optimize for long runs of sequentially allocated identifiers? Logarithmic insert/get performance is acceptable to me, so perhaps something tree-based?
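For illustration, something along these lines is the kind of structure I mean: keys are grouped into fixed-size pages so that a dense run of consecutive indices shares one backing array and pays the per-entry map overhead only once per page. This is just a sketch to make the requirement concrete (all names here are hypothetical, not from any library), not a solution I've settled on:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical paged sparse array keyed by long indices.
// Consecutive indices land in the same page, so runs are stored densely.
final class PagedLongMap<V> {
    private static final int PAGE_BITS = 12;             // 4096 slots per page
    private static final int PAGE_SIZE = 1 << PAGE_BITS;
    private static final int OFFSET_MASK = PAGE_SIZE - 1;

    // One map entry per page, not per element.
    private final Map<Long, Object[]> pages = new HashMap<>();

    void put(long index, V value) {
        Object[] page = pages.computeIfAbsent(index >>> PAGE_BITS,
                                              k -> new Object[PAGE_SIZE]);
        page[(int) (index & OFFSET_MASK)] = value;
    }

    @SuppressWarnings("unchecked")
    V get(long index) {
        Object[] page = pages.get(index >>> PAGE_BITS);
        return page == null ? null : (V) page[(int) (index & OFFSET_MASK)];
    }
}
```

With 50 million mostly-contiguous entries this would keep the hash table itself tiny (one entry per ~4096 indices), though sparse pages waste slots, so the page size would need tuning to the actual run lengths.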