3

I am working on my first mutlithreaded program and got stuck about a couple of aspects of synchronization. I have gone over the multi-threading tutorial on oracle/sun homepage, as well as a number of questions here on SO, so I believe I have an idea of what synchronization is. However, as I mentioned there are a couple of aspects I am not quite sure how to figure out. I formulated them below in form of clear-cut question:

Question 1: I have a singleton class that holds methods for checking valid identifiers. It turns out this class needs to hold to collections to keep track of associations between 2 different identifier types. (If the word identifier sounds complicated; these are just strings). I chose to implement two MultiValueMap instances to implement this many-to-many relationship. I am not sure if these collections have to be thread-safe as the collection will be updated only at the creation of the instance of the singleton class but nevertheless I noticed that in the documentation it says:

Note that MultiValueMap is not synchronized and is not thread-safe. If you wish to use this map from multiple threads concurrently, you must use appropriate synchronization. This class may throw exceptions when accessed by concurrent threads without synchronization.

Could anyone elaborate on this "appropriate synchronization"? What exactly does it mean? I can't really use MultiValueMap.decorate() on a synchronized HashMap, or have I misunderstood something?

Question 2: I have another class that extends a HashMap to hold my experimental values, that are parsed in when the software starts. This class is meant to provide appropriate methods for my analysis, such as permutation(), randomization(), filtering(criteria) etc. Since I want to protect my data as much as possible, the class is created and updated once, and all the above mentioned methods return new collections. Again, I am not sure if this class needs to be thread-safe, as it's not supposed to be updated from multiple threads, but the methods will most certainly be called from a number of threads, and to be "safe" I have added synchronized modifier to all my methods. Can you foresee any problems with that? What kind of potential problems should I be aware of?

Thanks,

posdef
  • 6,498
  • 11
  • 46
  • 94

4 Answers4

2

It's hard to give general rules about synchronization, but your general understanding is right. A data-structure which is used in a read-only way, does not have to be synchronized. But, (1) you have to ensure that nobody (i.e. no other thread) can use this structure before it is properly initialized and (2) that the structure is indeed read-only. Remember, even iterators have a remove method.

To your second question: In order to ensure the immutability, i.e. that it is read-only, I would not inherit the HashMap but use it inside your class.

jmg
  • 7,308
  • 1
  • 18
  • 22
  • Avoiding inheritance is to avoid complications that might arise from overloading HashMap methods, is that right? – posdef Mar 02 '11 at 12:30
  • Avoiding inheritance is to avoid users of your class calling methods you did not overwrite and might change the hashmap. – jmg Mar 02 '11 at 13:57
2

Answer 1: Your singleton class should not expose the collections it uses internally to other objects. Instead it should provide appropriate methods to expose the behaviours you want. For example, if your object has a Map in it, don't have a public or protected method to return that Map. Instead have a method that takes a key and returns the corresponding value in the Map (and optionally one that sets the value for the key). These methods can then be made thread safe if required.

NB even for collections that you do not intend to write to, I don't think you should assume that reads are necessarily thread safe unless they are documented to be so. The collection object might maintain some internal state that you don't see, but might get modified on reads.

Answer 2: Firstly, I don't think that inheritance is necessarily the correct thing to use here. I would have a class that provides your methods and has a HashMap as a private member. As long as your methods don't change the internal state of the object or the HashMap, they won't have to be synchronised.

JeremyP
  • 84,577
  • 15
  • 123
  • 161
  • Regarding 1) That's exactly how I've set it up, the map is never returned or exposed, but rather methods are supplied to get values, based on keys. Regarding 2) I see your point, out of curiosity, is there a way to make an object (say hashmap) immutable AFTER it has been created and populated? I am wondering this mostly to avoid rewriting my class... – posdef Mar 02 '11 at 12:24
  • 1
    @posdef, you can wrap it with an unmodifiable map via [`Collections.unmodifiableMap()`](http://download.oracle.com/javase/6/docs/api/java/util/Collections.html#unmodifiableMap%28java.util.Map%29). – Péter Török Mar 02 '11 at 12:37
  • @Peter: Cheers! That is something I have been looking for a while.. :) Is there an equivalent way of generalizing this "read-only lock" to any object? – posdef Mar 02 '11 at 12:40
  • @posdef, similar methods exist for all collection interfaces, including `Collection` itself, but I don't know of other examples in the class library. The implementations simply throw a `NotImplementedException` from any write methods, so it is easy to implement such a wrapper for any interface you want. – Péter Török Mar 02 '11 at 12:54
1

If your maps are populated once, at the time the class is loaded (i.e. in a static initializer block), and are never modified afterwards (i.e. no elements or associations are added / removed), you are fine. Static initialization is guaranteed to be performed in a thread safe manner by the JVM, and its results are visible to all threads. So in this case you most probably don't need any further synchronization.

If the maps are instance members (this is not clear to me from your description), but not modified after creation, I would say again you are most probably safe if you declare your members final (unless you publish the this object reference prematurely, i.e. pass it to the outside world from the cunstructor somehow before the constructor is finished).

Péter Török
  • 114,404
  • 31
  • 268
  • 329
  • Thanks for your reply, as I have mentioned in an above comment, I implemented my code exactly as you mentioned in your first paragraph. I didn't quite get your second paragraph though; I understand what you mean by publishing the object prematurely, but I couldn't make sense out of the first part of the sentence.. Do you refer to my first or second question there? – posdef Mar 02 '11 at 12:28
  • @posdef, oops, I didn't notice "extends" in your second question. So my 2nd paragraph applies only to your 1st question. I agree with @Jeremy though that composition is probably a better solution than inheritance. – Péter Török Mar 02 '11 at 12:35
  • I see, I assume this is to avoid unnecessary/unintended artifacts of overridden methods? – posdef Mar 02 '11 at 12:38
  • @posdef, surely you can. Declaring it `final` only fixes the *reference* to the object, but you can still freely modify the state of the object. – Péter Török Mar 02 '11 at 12:40
1

Synchronization commonly is needed when you either could have concurrent modifications of the underlying data or one thread modifies the data while another reads and needs to see that modification.

In your case, if I understand it correctly, the MultiValueMap is filled once upon creation and the just read. So unless reading the map would modify some internals it should be safe to read it from multiple threads without synchronization. The creation process should be synchronized or you should at least prevent read access during initialization (a simple flag might be sufficient).

The class you descibe in question 2 might not need to be synchronized if you always return new collections and no internals of the base collection are modified during creation of those "copies".

One additional note: be aware of the fact that the values in the collections might need to be synchronized as well, since if you safely get an object from the collection in multiple thread but then concurrently modify that object you'll still get problems.

So as a general rule of thumb: read-only access does not necessarily need synchronization (if the objects are not modified during those reads or if that doesn't matter), write access should generally be synchronized.

Thomas
  • 87,414
  • 12
  • 119
  • 157
  • Uhm, let me try and get this straight, what do you mean by synchronizing the individual values? As far as I know there are two ways of synchronizing in java; blocks and methods. Am I missing something? – posdef Mar 02 '11 at 12:35