If the keys I wish to use are guaranteed to be unique (or at least the assumption can be made that the keys are unique), does using a 'vanilla' ConcurrentHashMap provide the best performance,
You would typically use ConcurrentHashMap if the Map is a potential concurrency bottleneck. If your application is single threaded or if there is no contention, ConcurrentHashMap is slower than HashMap.
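As a rough illustration of that choice, here is a minimal sketch. The class and method names (MapChoiceDemo, countSingleThreaded, countConcurrently) are made up for this example; only the java.util classes are real. A map confined to one thread can be a plain HashMap, while a map updated from several threads needs ConcurrentHashMap, whose merge is atomic.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MapChoiceDemo {

    // Map never leaves this thread, so a plain HashMap is sufficient.
    static Map<Integer, Integer> countSingleThreaded(int n) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int i = 0; i < n; i++) {
            counts.merge(i % 10, 1, Integer::sum);
        }
        return counts;
    }

    // Map is shared by several writer threads, so use ConcurrentHashMap.
    static Map<Integer, Integer> countConcurrently(int n, int threads) {
        Map<Integer, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < n; i++) {
                    // merge is atomic on ConcurrentHashMap, so no updates are lost
                    counts.merge(i % 10, 1, Integer::sum);
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countSingleThreaded(1000));
        System.out.println(countConcurrently(1000, 4));
    }
}
```

Whether the concurrent version is actually faster for a given workload is something only a benchmark of the real application can tell you.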
or does a hashing function or put method need to be modified to avoid needless hashing?
The hash function gets evaluated once per "probe" of the hash table; e.g. once per get or put operation. You can reduce the cost of the hash function by caching the result, but this costs you an extra 4 bytes of storage per key object. Whether caching is a worthwhile optimization depends on:
- what the relative cost of hashing is compared with the rest of the application, and
- the proportion of calls to hashCode() that will actually make use of the cached value.
Both of these factors are highly application specific.
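For what the caching idea looks like in practice, here is a minimal sketch. CachedKey is a hypothetical immutable key type invented for this example, not something from the question. It computes its hash once in the constructor and stores it in an int field, which is the extra 4 bytes per key mentioned above.

```java
import java.util.Arrays;
import java.util.concurrent.ConcurrentHashMap;

public class CachedHashKeyDemo {

    /** Hypothetical immutable key that hashes its contents once and keeps the result. */
    static final class CachedKey {
        private final String[] parts;
        private final int hash; // cached hash: the extra 4 bytes per key

        CachedKey(String... parts) {
            this.parts = parts.clone();              // defensive copy keeps the key immutable
            this.hash = Arrays.hashCode(this.parts); // computed eagerly, exactly once
        }

        @Override
        public int hashCode() {
            return hash; // no recomputation on each get or put
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof CachedKey)) return false;
            CachedKey other = (CachedKey) o;
            // cheap int comparison first, full comparison only if the hashes agree
            return hash == other.hash && Arrays.equals(parts, other.parts);
        }
    }

    public static void main(String[] args) {
        ConcurrentHashMap<CachedKey, Integer> map = new ConcurrentHashMap<>();
        map.put(new CachedKey("user", "42"), 1);
        System.out.println(map.get(new CachedKey("user", "42")));
    }
}
```

This version pays the hashing cost up front even if the key is never used in a map; computing the value lazily on the first hashCode() call is the alternative when many keys are never hashed. Caching is only safe because the key is immutable.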
(Incidentally, the long term cost of using the identity hashcode as the hash value is also an extra 4 bytes of storage.)
Also, does a numeric key have any performance benefit over a non-numeric key (such as a String or POJO with a proper hashing function)?
The hash function is likely to be cheaper in the numeric case, but whether it is worth it depends on whether there are application-specific downsides of using a numeric key. And, as above, the relative costs are application specific. For instance, the cost of String.hashCode() is proportional to the length of the string being hashed.
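To make the difference in work concrete, the sketch below re-implements the formulas documented in the Javadoc for String.hashCode() and Long.hashCode(long). The class and method names (KeyHashCostDemo, stringHash, longHash) are made up for illustration, and the real JDK implementations may be optimized differently.

```java
public class KeyHashCostDemo {

    // Formula documented for String.hashCode(): one multiply-add per character,
    // so the work grows with the length of the string.
    static int stringHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Formula documented for Long.hashCode(long): a fixed amount of work
    // regardless of the value.
    static int longHash(long v) {
        return (int) (v ^ (v >>> 32));
    }

    public static void main(String[] args) {
        String key = "order-12345";
        System.out.println(stringHash(key) == key.hashCode());
        System.out.println(longHash(12345L) == Long.hashCode(12345L));
    }
}
```

Note that java.lang.String caches its own hash after the first call, so the per-character cost is paid once per String instance rather than once per lookup.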