11

I am reading the book Java Concurrecny in practice. On page 85 section 5.2.1 it talks about the ConcurrentHashMap and its advantages. However, in one part, the books claims that

the iterators returned by ConcurrentHashMap is weakly consistent. This means that this iterator can tolerate concurrent modification, traverses elements as they existed when iterator was constructed, and may (but not guaranteed to) reflect modifications to the collection after the construction of the iterator.

From why I understand the whole point of synchronization in concurrent programs is to allow thread accessing a shared resource in a consistent way, where as ConcurrentHashMap is not really fulfilling this. Then why using it at all?

Hossein
  • 40,161
  • 57
  • 141
  • 175

3 Answers3

19

The point is to avoid synchronization when you don't need it. If you don't mind seeing new elements in some cases and not seeing them in others, using ConcurrentHashMap's iterator can be significantly cheaper than either preventing other threads from adding items while you're iterating or taking a consistent snapshot when the iterator is created.

So yes, when you need synchronization and a consistent iterator, you'll need an alternative - but when you don't, you can take advantage of the more efficient iterator provided by ConcurrentHashMap.

Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194
  • Does ConcurrentHashMap have synchronizations itself in its implementations? – Hossein Dec 29 '12 at 17:26
  • 1
    @Hossein: I don't know offhand - but you can look at the source as easily as I can :) The docs suggest there may be synchronization in write access but not read: "However, even though all operations are thread-safe, retrieval operations do not entail locking, and there is not any support for locking the entire table in a way that prevents all access." – Jon Skeet Dec 29 '12 at 17:27
  • 1
    @Hossein: IIRC the `ConcurrentHashMap` is lock-free implementation so it does not have synchronization in traditional sense but use atomics and special algorithms to avoid mutexes. It makes it faster as mutexes can be too heavyweight in some applications (setting asside the cost of it itself it allows only 1 thread running at a time) but it is not fully syncronized. – Maciej Piechotka Dec 29 '12 at 17:32
  • 3
    @Maciej: `ConcurrentHashMap` is a lock striping implementation and the default concurrency level is 16. – Prince John Wesley Dec 29 '12 at 17:48
  • @PrinceJohnWesley: Ups. Sorry - my bad (I saw `ConcurrentHashMap` read `ConcurrentSkipListMap`). – Maciej Piechotka Dec 29 '12 at 17:50
2

ConcurrentHashMap is supposed to be thread-safe, by contract. However, it is not supposed to be consistent across threads.

When you iterate over a ConcurrentHashMap, the iterator grabs a copy of the hash map at the moment you asked for it (and this copy is made in a thread-safe manner) and you iterate over that copy. And yes, nothing guarantees you that while you are iterating over that copy, some map entries won't be removed. But the map entries which would be removed in such a manner woud still exist in your iterator.

fge
  • 119,121
  • 33
  • 254
  • 329
  • Such entries _might_ not _would_ exists in iterator. – Maciej Piechotka Dec 29 '12 at 17:33
  • No, "would". If it were "might" and it didn't, you'd get a `ConcurrentModificationException`. – fge Dec 29 '12 at 17:40
  • From Java SDK documentation "They do not throw ConcurrentModificationException`.". – Maciej Piechotka Dec 29 '12 at 17:48
  • Yes, because at the time of the copy, all entries for a given hash bucket are copied over -- but they may be removed from the map while you iterate. – fge Dec 29 '12 at 18:48
  • The `Iterator` of `ConcurrentHashMap` doesn't work against a copy. It's possible that it will return entries that were added after the `Iterator` was created, and even after you have already received some elements from it. I have tried it and *sometimes* that happens in reality (Java 8). Otherwise it wouldn't be *weakly* consistent. – ddekany Sep 21 '17 at 16:08
2

Everything depends on how strong consistency you need. If you need strong one - than mutexes or other subset of operations would provide what you need.

However sometimes you need much weaker requirements but for example you need say lock-free property as you need to guarantee the throughput. On the other hand you may not need iterators at all or have much weaker restrictions on them.

So if all you need is map shared across threads ConcurrentHashMap (say place in theatre to person booking it) would provide what you need - and it may be much faster if you have many working threads as you avoid mutex synchronization. On the other hand the speed comes at a cost - the view provided by iterator does not necessary corresponds to state of collection at it's creation and it might miss some elements created afterwords (in general in multiprocessor systems happens-before relationship is tricky).

Edit: As Prince John Wesley pointed out I thought about ConcurrentSkipListMap while seeing and writing. ConcurrentHashMap. Most of the points still stand except the parts about being lock-free (as well as everything that comes from it like guaranteed throughput etc.).

Maciej Piechotka
  • 7,028
  • 6
  • 39
  • 61