1

I have worker threads that generate objects and push them into a thread-safe Set. A processing thread periodically reads the Set and processes the elements.

While the object references themselves will be successfully retrieved from the set, the objects' variables are not thread-safe if accessed from the processing thread. Is there some pattern to do this, apart from making all the objects' internals volatile etc.? The objects may become more complex in the future, containing nested objects etc.

Assuming that no object will be externally modified once placed into the Set, is there some way to establish a "happens-before" relationship for whatever is currently in the Set before I begin processing it? The processing thread is already running; it is not created after the Set has been populated.

The objects themselves are just data containers and have no inherent thread-safety. I can't make all the fields final since they may be modified multiple times before being placed into the Set.

Monstieur
  • Some solutions I can propose: 1) Make the class whose objects you are adding to the set immutable. 2) Use CopyOnWriteArraySet, which ensures the happens-before relationship, but this is only suitable for small sets. – Puneet_Techie Jan 20 '15 at 10:41
  • Only the single processing thread will ever read from the `Set`. How can the fields be thread safe if they were populated from a worker thread? Primitive fields may be, but the fields could be changed to anything including complex objects. – Monstieur Jan 20 '15 at 10:42

2 Answers

2

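If you have a thread-safe set, adding an object to it establishes a happens-before relationship: every write the producer made before the add is visible to a thread that later reads the object from the set. So you don't have to worry about whether the object itself is thread-safe. This assumes that your producer doesn't alter or read the object after putting it in the collection.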

If you make the objects immutable, this will make the relationship clearer. However, I am assuming that once you pass the object to the shared storage, the writing thread no longer alters the object and only the consuming thread reads or alters it.

BTW I would pass the tasks via a queue using an ExecutorService as it is more efficient and written for you.
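The happens-before guarantee described in this answer can be sketched as follows. This is a minimal illustration, not the asker's actual code: the `Reading` class, its fields, and the `runOnce` helper are hypothetical names, and `ConcurrentHashMap.newKeySet()` stands in for whichever thread-safe Set is really in use. The workers fill plain, non-volatile fields, add the object to the set, and never touch it again; the already-running processing thread then reads the fields through the set.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class SafePublicationSketch {

    // Hypothetical plain data container: no final or volatile fields.
    static class Reading {
        int id;
        String label;
    }

    // Starts the workers, then acts as the processing thread.
    // Returns how many objects were seen with all fields populated.
    static int runOnce(int workers) {
        Set<Reading> shared = ConcurrentHashMap.newKeySet();

        for (int i = 0; i < workers; i++) {
            final int id = i;
            new Thread(() -> {
                Reading r = new Reading();
                r.id = id;                  // plain writes before publication
                r.label = "reading-" + id;
                shared.add(r);              // publication; r is not modified afterwards
            }).start();
        }

        // The processing thread is already running and simply polls the set.
        while (shared.size() < workers) {
            Thread.yield();
        }

        int consistent = 0;
        for (Reading r : shared) {
            // Reading the element from the concurrent set makes the
            // worker's earlier field writes visible here.
            if (r.label != null && r.label.equals("reading-" + r.id)) {
                consistent++;
            }
        }
        return consistent;
    }

    public static void main(String[] args) {
        int n = 4;
        System.out.println(runOnce(n) + " of " + n + " objects seen fully populated");
    }
}
```

The guarantee only covers writes made before `add`. If a worker kept mutating `r` after adding it, the processing thread would need some other form of synchronization to see those later writes.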

Peter Lawrey
  • Since only *references* to objects are pushed into the `Set`, I see no reason for worrying about thread safety. The read operation on the set will not get the updated value. But why make the class *immutable* (since nobody is altering the instances)? Perhaps a `ReadWriteLock` could help? – TheLostMind Jan 20 '15 at 10:45
  • @TheLostMind you want to be able to read the object referenced in a consistent state. If you construct an object without final fields and pass it to another thread in an unsafe manner you might see any or all the fields uninitialised. – Peter Lawrey Jan 20 '15 at 10:48
  • How are the objects' fields guaranteed to be thread safe? If a non-volatile field is assigned from a worker thread and the object is pushed into the set, how will the processing thread see the field assignment? – Monstieur Jan 20 '15 at 10:49
  • What if I can't make the fields final? I *am* using an ExecutorService to receive the objects and populate the `Set`, but the processing has to occur only periodically. There are actually a minimum of three threads (worker, executor to receive objects, processing). – Monstieur Jan 20 '15 at 10:51
  • @Locutus When you use a thread safe collection, even something as simple as AtomicReference, it has to ensure that *all* the previous writes which have occurred are established, and any read after that point will see a thread safe view. The CPU doesn't "know" you want the collection to be thread safe but nothing else. Instead the CPU makes *all* operations thread safe if you use any thread safe operation. – Peter Lawrey Jan 20 '15 at 10:53
  • @Locutus You can make the fields `final` but it won't make any difference in this case. – Peter Lawrey Jan 20 '15 at 10:53
  • @PeterLawrey - I don't get it :( One thread changes the state of an object and then pushes the *reference* of the changed object into the thread-safe set. Are you trying to say that even though the reference is pushed, the fields are not set correctly (by the worker thread), so the processing thread won't read the *updated/changed* values? – TheLostMind Jan 20 '15 at 10:54
  • @TheLostMind the only problem is if you don't use a thread safe collection. If you use final fields, you don't have to use a thread safe collection. This is something I use for object pools of immutable objects (a strategy never used by most developers I suspect) – Peter Lawrey Jan 20 '15 at 10:56
  • As far as I understand, the worker thread is not modifying the objects, it is just pushing the objects into the set (as per the question). After retrieval, however, the objects can be manipulated, which @Locutus wants to avoid. Correct me if I am wrong. – Puneet_Techie Jan 20 '15 at 10:57
  • @Puneet_Techie Not the way I read "Assuming that no object will be externally modified by once placed into the Set" – Peter Lawrey Jan 20 '15 at 10:59
  • @PeterLawrey - Agreed. If you use *final* fields *transitively*, it will make the class *immutable*, so you don't have to use a thread-safe collection. So, if the OP uses a thread-safe collection, then the problem will be that the references are pushed correctly (whether the reference will be visible instantaneously is a different thing) but the worker thread might not have flushed the values of all the fields properly. So a *partially constructed object* might be visible to the processing thread. Right? – TheLostMind Jan 20 '15 at 11:13
  • @TheLostMind you only see a partially constructed object if a field is not final and you use a collection without memory barriers (or anything else with a memory barrier) but even then it is not guaranteed. – Peter Lawrey Jan 20 '15 at 11:16
0

Volatile isn't quite the magic bullet in this case. Look at the possibility of switching to immutable objects for those passed between threads. Also, a thread-safe, queue-based data structure will give you better performance than most set implementations.
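A queue-based handoff along these lines might look like the sketch below. It assumes a `LinkedBlockingQueue` as the thread-safe queue and a small fixed thread pool as the workers. The `Item` class and the `produceAndDrain` helper are hypothetical names used only for illustration. The processing thread empties the queue in batches with `drainTo`, which matches the periodic processing described in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

class QueueHandoffSketch {

    // Hypothetical mutable data container with a plain field.
    static class Item {
        int value;
    }

    // Workers enqueue items with values 1..count; the caller acts as the
    // processing thread and returns the sum of all values it drained.
    static int produceAndDrain(int count) {
        BlockingQueue<Item> queue = new LinkedBlockingQueue<>();
        ExecutorService workers = Executors.newFixedThreadPool(2);

        for (int i = 1; i <= count; i++) {
            final int v = i;
            workers.execute(() -> {
                Item item = new Item();
                item.value = v;   // plain write before the handoff
                queue.add(item);  // handoff; the worker does not touch item again
            });
        }
        workers.shutdown();       // already submitted tasks still run

        int sum = 0;
        int seen = 0;
        List<Item> batch = new ArrayList<>();
        while (seen < count) {
            batch.clear();
            queue.drainTo(batch); // take everything currently queued in one call
            for (Item item : batch) {
                sum += item.value;
            }
            seen += batch.size();
            if (batch.isEmpty()) {
                Thread.yield();   // nothing queued yet; a real loop would wait for its next period
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("sum of drained values: " + produceAndDrain(10));
    }
}
```

Unlike a Set, a queue removes each element as it is handed over, so the processing thread does not need to track which objects it has already handled.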