2

This has probably been asked before, but I come across this situation time and time again: I want to store a very small number of properties that I am absolutely certain will never exceed, say, 20 keys. It seems a complete waste of CPU and memory to use a HashMap, both because of its overhead to begin with and because of the cost of calculating an advanced hash value for each key lookup. With fewer than 20 keys (probably more like 5 most of the time), I am absolutely certain that calculating a hash value takes a hundred times longer than just iterating and comparing... no?

There is this talk about premature optimization, but I don't totally agree here. I am mostly on Android, and any CPU/memory saved leaves more juice for other stuff. I am not necessarily talking about the consumer market here. The use case is very well-defined and doesn't change much. Furthermore, it would be trivial to replace a very cheap map with a HashMap in case (something that will never happen) a very large number of new keys suddenly appears.

So, my question is: which is the cheapest, most basic Map I can use in Java?

JohnyTex
  • 1
    How about just using an array? – hatchet - done with SOverflow Oct 31 '14 at 23:51
  • 1
    @hatchet It won't be more efficient than a `HashMap` (which, that said, holds an array) and will be unreadable, especially if the array is not completely filled – Dici Oct 31 '14 at 23:54
  • @Dici - for 5 elements? It seems like driving your car 30 feet down the driveway to get your mail. If OP is worried about both space and time cost, array will win for space, and will be indistinguishable for time, but quite possibly faster as well. – hatchet - done with SOverflow Oct 31 '14 at 23:56
  • 1
    @hatchet He can play with the initial capacity and the load factor. Such a small overhead is nothing compared to the ugliness of the code he would have to write if he uses an array manually – Dici Oct 31 '14 at 23:58
  • 2
    This is probably pointless, because truthfully, I have a hard time believing it will really matter in your app. And if it doesn't matter, readability/maintainability wins. On that I agree with @Dici. – hatchet - done with SOverflow Nov 01 '14 at 00:03
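
For reference, the "just use an array" idea debated in the comments above could look roughly like the sketch below. `TinyArrayMap` is a hypothetical name, not a JDK class, and the fixed capacity and the linear scan with `equals` are assumptions made for illustration. Whether it actually beats `HashMap` for 5 to 20 keys would have to be measured on the target device.

```java
import java.util.Objects;

/** Hypothetical fixed-capacity map that finds keys by linear scan (not a JDK class). */
final class TinyArrayMap<K, V> {
    private final Object[] keys;
    private final Object[] values;
    private int size;

    TinyArrayMap(int capacity) {
        keys = new Object[capacity];
        values = new Object[capacity];
    }

    /** Returns the value stored under key, or null if the key is absent. */
    @SuppressWarnings("unchecked")
    V get(Object key) {
        for (int i = 0; i < size; i++) {
            if (Objects.equals(keys[i], key)) {   // one equals call per stored key, worst case
                return (V) values[i];
            }
        }
        return null;
    }

    /** Stores value under key and returns the previous value, or null if there was none. */
    @SuppressWarnings("unchecked")
    V put(K key, V value) {
        for (int i = 0; i < size; i++) {
            if (Objects.equals(keys[i], key)) {
                V old = (V) values[i];
                values[i] = value;
                return old;
            }
        }
        if (size == keys.length) {
            throw new IllegalStateException("TinyArrayMap is full");
        }
        keys[size] = key;
        values[size] = value;
        size++;
        return null;
    }

    int size() {
        return size;
    }
}
```

Note that this sketch does not implement `java.util.Map`, so swapping it for a `HashMap` later would mean touching every call site unless it is hidden behind an interface.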

3 Answers

2

To everything in your first paragraph: no! There won't be a dramatic memory overhead since, as far as I know, a HashMap is initialized with 16 buckets and then doubles its size each time it rehashes. So in the worst case (20 keys in 32 buckets) you would have 12 unused buckets in your map, which is no big deal.
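
The bucket arithmetic can be checked with a small model. This is only a sketch that assumes the documented defaults of `java.util.HashMap` (initial capacity 16, load factor 0.75, doubling when the entry count exceeds capacity times load factor); `HashMapSizing` and `tableLengthFor` are hypothetical names, and the HashMap shipped on a given Android version may behave differently.

```java
/** Hypothetical helper modeling how a default-constructed HashMap grows. */
final class HashMapSizing {
    /**
     * Returns the table length expected after inserting the given number of
     * distinct keys, assuming capacity 16, load factor 0.75 and doubling.
     */
    static int tableLengthFor(int entries) {
        int capacity = 16;               // assumed default initial capacity
        final float loadFactor = 0.75f;  // assumed default load factor
        while (entries > capacity * loadFactor) {
            capacity *= 2;               // each resize doubles the table
        }
        return capacity;
    }
}
```

Under these assumptions, 5 keys stay in the initial 16-bucket table, the 13th key triggers the first resize to 32 buckets, and 20 keys still fit in those 32 buckets.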

Concerning the lookup time, it is constant and equivalent to the time of accessing an element of an array, which is always better than looping over O(n) elements (even if n < 20). The only drawback of HashMap is that it is unordered, but as far as I am concerned, I consider it the default Map implementation in Java when I have no particular requirement about the order.

To conclude: use HashMap!

Dici
  • 1
    I agree lookup-time is constant, but how can calculating a hash be equivalent to traversing through an Array? – JohnyTex Oct 31 '14 at 23:58
  • 1
    It depends on your hash function, but it is generally not so costly. My point is that you should not stop yourself from using a high-level data structure for the sake of peanuts optimisations. Code readability and functionality are your first priority. Plus, when traversing your array, you would have to call `equals` (several times), which can have a running time comparable with `hashCode` (which you call only once). – Dici Nov 01 '14 at 00:00
  • OK good answer! So I can safely continue my practice of always using HashMaps then. Good to know. – JohnyTex Nov 01 '14 at 00:04
  • 1
    If you don't care about ordering, yes. For an optimized use of `HashMap`, learn how the load factor and the initial capacity impact the number of rehashes (which is what really takes time when using a `HashMap`). However, to be honest, I almost never care about this and use the default constructor. It is just good to know for the day when you need to optimize your code to reach a performance objective – Dici Nov 01 '14 at 00:09
  • 3
    Not trying to question the conclusion but you can't use O notation to argue about speed when n is just 20. When you compare O(1) vs O(N) you're comparing O(1 x a) vs O(20 x b) without knowing how big each side's constant factors a and b really are. – zapl Nov 01 '14 at 01:32
  • I do know the mathematical definition of Big O notation; I just wanted to stress the fact that an array does not provide a good complexity for lookup by key. For those constants you are talking about, we do know their approximate value. Assuming `hashCode` and `equals` run in similar time, we can take the number of `hashCode/equals` calls as the unit. In that case, if there is no rehash, then `a = b = 1`, meaning that the worst-case cost with the `HashMap` is 20 times lower than with the array – Dici Nov 01 '14 at 11:57
1

If you worry about hashCode() computation time on your keys, consider caching the computed value, as, for example, java.lang.String does. See the question "How caching hashcode works in Java as suggested by Joshua Bloch in Effective Java?" for more on that.
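
A minimal sketch of that caching idea follows. `CachedKey` is a hypothetical class made up for illustration; it assumes the key is immutable, which is what makes caching safe, and it uses 0 as the "not computed yet" marker in the same way `java.lang.String` does.

```java
import java.util.Arrays;

/** Hypothetical immutable key that computes its hash code lazily and caches it. */
final class CachedKey {
    private final String[] parts;
    private int hash; // 0 means "not computed yet"

    CachedKey(String... parts) {
        this.parts = parts.clone(); // defensive copy keeps the key immutable
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            // Potentially expensive computation, done only on first use.
            // If the real hash happens to be 0 it is recomputed each time, which is harmless.
            h = Arrays.hashCode(parts);
            hash = h;
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof CachedKey && Arrays.equals(parts, ((CachedKey) o).parts);
    }
}
```

With a key like this, repeated `HashMap` lookups pay for the hash computation once per key object rather than once per lookup.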

leventov
1

Caveat: I suggest you take the cautions about premature optimization seriously. For most programmers in most apps, I seriously doubt you need to worry about the performance of your Map. More important is to consider your needs regarding concurrency, iteration order, and nulls. But since you asked, here is my specific answer.

EnumMap

If your keys are enums, then your very fastest Map implementation will be EnumMap.

Backed internally by a plain array indexed by each enum constant's ordinal, an EnumMap is very fast to execute while using very little memory.
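
As a small illustration, here is how a handful of properties might be stored in an `EnumMap`. The `Setting` enum and its constants are hypothetical, invented for this sketch.

```java
import java.util.EnumMap;
import java.util.Map;

final class EnumMapExample {
    /** Hypothetical set of property keys. */
    enum Setting { VOLUME, BRIGHTNESS, LANGUAGE }

    /** Builds a small property map keyed by enum constants. */
    static Map<Setting, String> defaults() {
        Map<Setting, String> settings = new EnumMap<>(Setting.class);
        settings.put(Setting.LANGUAGE, "en");
        settings.put(Setting.VOLUME, "7");
        return settings;
    }
}
```

An `EnumMap` iterates in the declaration order of the enum constants, so here `VOLUME` comes before `LANGUAGE` regardless of insertion order. Null keys are not permitted.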

IdentityHashMap

If you are really so concerned about performance, then consider using IdentityHashMap.

This implementation of Map uses reference-equality rather than object-equality. While there is still a hash value involved, it is the identity hash obtained from System.identityHashCode, loosely a hash of the object's address in memory (so to speak; we do not have direct memory access in Java). So the possibly lengthy call to each key object's own hashCode method is avoided entirely, and performance may be better than a HashMap. You will see constant-time performance for the basic operations (get and put), assuming the identity hash disperses the keys well.

Study the documentation carefully to see if you want to take this route. Note the discussion about linear-probe versus chaining for better performance. Be aware that this class intentionally violates the general Map contract, which mandates the use of the equals method when comparing objects. Also, this map is not synchronized, so it is not safe for concurrent modification without external locking.
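
The practical consequence of reference-equality is easy to miss, so here is a short sketch contrasting the two maps. `IdentityDemo` and its method are hypothetical names used only for this illustration.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

final class IdentityDemo {
    /** Returns {HashMap size, IdentityHashMap size} after inserting two equal but distinct keys. */
    static int[] sizesAfterTwoEqualKeys() {
        String a = new String("key");
        String b = new String("key"); // a.equals(b) is true, but a != b

        Map<String, Integer> byEquals = new HashMap<>();
        Map<String, Integer> byIdentity = new IdentityHashMap<>();

        byEquals.put(a, 1);
        byEquals.put(b, 2);   // overwrites: same key according to equals
        byIdentity.put(a, 1);
        byIdentity.put(b, 2); // second entry: different object reference

        return new int[] { byEquals.size(), byIdentity.size() };
    }
}
```

The HashMap ends up with one entry and the IdentityHashMap with two. A lookup in an IdentityHashMap therefore only succeeds with the very same key object that was used for the put, which suits keys such as shared constants or singletons.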


Here is a table I made to help compare the various Map implementations bundled with Java 11.

Table of map implementations in Java 11, comparing their features

Basil Bourque