Am I wrong to assume a TreeMap's array's initial size should be able to be set?
Yes, that assumption is incorrect. A TreeMap doesn't have an array. A TreeMap uses binary nodes with 2 children.
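To make the difference concrete, here is a minimal sketch of how the two maps are constructed. The class name MapConstruction and the helper sortedExample are made up for illustration; the constructors are the standard java.util ones. HashMap accepts an initial capacity because it is backed by a bucket array, while TreeMap has no size-taking constructor at all.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class MapConstruction {
    // Hypothetical helper: builds a small TreeMap to show there is nothing to pre-size.
    static TreeMap<String, Integer> sortedExample() {
        // TreeMap allocates one tree node per entry, so its constructors take
        // no capacity, only an optional Comparator (or another map to copy).
        TreeMap<String, Integer> sorted = new TreeMap<>(Comparator.<String>naturalOrder());
        sorted.put("b", 2);
        sorted.put("a", 1);
        return sorted;
    }

    public static void main(String[] args) {
        // HashMap is backed by a bucket array, so an initial capacity makes sense here.
        Map<String, Integer> hashed = new HashMap<>(1 << 16);
        hashed.put("a", 1);

        TreeMap<String, Integer> sorted = sortedExample();
        System.out.println(hashed.size() + " hashed entry, first sorted key: " + sorted.firstKey());
    }
}
```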
If you are suggesting that the number of children in a tree node should be a parameter, then you need to figure out how that impacts on search time. And I think that it turns the search time from O(log2 N) to O(log2 M * logM N), where N is the number of elements and M is the average number of node children: roughly logM N levels, with a binary search over the children at each level. (And I'm making some optimistic assumptions ...) Since log2 M * logM N = log2 N, that's not a "win".
Is there a different reason that it is so slow?
Yes. The reason that a (large) TreeMap is slow relative to a (large) HashMap under optimal circumstances is that lookup using a balanced binary tree with N entries requires looking at roughly log2 N tree nodes. By contrast, in an optimal HashMap a lookup involves 1 hashcode calculation and looking at O(1) hash chain nodes.
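If you want to see the log2 N behaviour for yourself, one rough way is to count key comparisons during a single lookup. This is only a sketch: the class TreeMapLookupCost and its method are invented for this example, and the counting Comparator is just an instrumentation trick, not something TreeMap provides.

```java
import java.util.Comparator;
import java.util.TreeMap;
import java.util.concurrent.atomic.AtomicLong;

public class TreeMapLookupCost {
    /** Returns the number of key comparisons made by one TreeMap.get() call. */
    static long comparisonsForLookup(int n, int key) {
        AtomicLong comparisons = new AtomicLong();
        // Comparator that counts how often the map compares two keys.
        Comparator<Integer> counting = (a, b) -> {
            comparisons.incrementAndGet();
            return Integer.compare(a, b);
        };
        TreeMap<Integer, Integer> map = new TreeMap<>(counting);
        for (int i = 0; i < n; i++) {
            map.put(i, i);
        }
        comparisons.set(0);   // ignore the comparisons made while inserting
        map.get(key);
        return comparisons.get();
    }

    public static void main(String[] args) {
        for (int n : new int[] {1_000, 1_000_000}) {
            long c = comparisonsForLookup(n, n - 1);
            System.out.println("N=" + n + " comparisons for one get(): " + c);
        }
    }
}
```

The count should grow by a roughly constant amount each time N is multiplied by the same factor, which is what a logarithmic lookup looks like in practice.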
Notes:
- TreeMap uses a binary tree organization that gives balanced trees, so O(log2 N) is the worst case lookup time.
- HashMap performance depends on the collision rate of the hash function and key space. In the worst case where all keys end up on the same hash chain, a HashMap has O(N) lookup.
- In theory, HashMap performance becomes O(N) when you reach the maximum possible hash array size; i.e. ~2^31 entries. But if you have a HashMap that large, you should probably be looking at an alternative map implementation with better memory usage and garbage collection characteristics.
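The worst case where all keys end up on the same hash chain can be provoked deliberately. The sketch below uses a made-up Key class whose hashCode is constant when a flag is set, and counts equals() calls during one lookup. Treat the numbers as indicative only: the exact count depends on the JDK version (Java 8 and later reorganise long chains internally), so the example prints what it observes rather than promising a figure.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

public class HashCollisionDemo {
    static final AtomicLong EQUALS_CALLS = new AtomicLong();

    /** Hypothetical key type; when 'collide' is true every key reports hash code 42. */
    static final class Key {
        final int id;
        final boolean collide;

        Key(int id, boolean collide) {
            this.id = id;
            this.collide = collide;
        }

        @Override
        public int hashCode() {
            return collide ? 42 : Integer.hashCode(id);
        }

        @Override
        public boolean equals(Object o) {
            EQUALS_CALLS.incrementAndGet();   // count every key comparison
            return o instanceof Key && ((Key) o).id == id;
        }
    }

    /** Fills a map with n keys, then counts equals() calls made by a single get(). */
    static long equalsCallsForLookup(int n, boolean collide) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, collide), i);
        }
        EQUALS_CALLS.set(0);   // ignore the calls made while inserting
        Integer found = map.get(new Key(n - 1, collide));
        if (found == null || found != n - 1) {
            throw new IllegalStateException("lookup failed");
        }
        return EQUALS_CALLS.get();
    }

    public static void main(String[] args) {
        int n = 2_000;   // kept small: inserting colliding keys is itself slow
        System.out.println("distinct hashes: " + equalsCallsForLookup(n, false) + " equals() calls");
        System.out.println("all colliding:   " + equalsCallsForLookup(n, true) + " equals() calls");
    }
}
```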