16

If I decide to use a non-thread-safe collection and synchronize its access, do I need to synchronize any mutation in the constructor? For example in the following code, I understand the reference to the list will be visible to all threads post-construction because it is final. But I don't know if this constitutes safe publication because the add in the constructor is not synchronized and it is adding a reference in ArrayList's elementData array, which is non-final.

private final List<Object> list;

public ListInConstructor()
{
    list = new ArrayList<>();
    // synchronize here?
    list.add(new Object());
}

public void mutate()
{
    synchronized (list)
    {
        if (list.checkSomething())
        {
            list.mutateSomething();
        }
    }
}
John Kugelman
  • 349,597
  • 67
  • 533
  • 578
Adam R
  • 565
  • 4
  • 11
  • Are you synchronizing initialization/access of the `ListInConstructor` instance in any way? – Snild Dolkow Nov 18 '15 at 16:39
  • 2
    No, I'm not synchronizing the ListInConstructor object in any way, but it is stored in a final field if that makes a difference. – Adam R Nov 18 '15 at 16:45

4 Answers4

13

Update: The Java Language Specification states that the freeze making the changes visible must be at the end of the constructor, which means your code is correctly synchronized, see the answers from John Vint and Voo.

However you can also do this, which definitely works:

public ListInConstructor()
{
    List<Object> tmp = new ArrayList<>();
    tmp.add(new Object());
    this.list = tmp;
}

Here we mutate the list object before assigning it to the final field, and so the assignment will guarantee that any changes made to the list will also be visible.

17.5. final Field Semantics

The usage model for final fields is a simple one: Set the final fields for an object in that object's constructor; and do not write a reference to the object being constructed in a place where another thread can see it before the object's constructor is finished. If this is followed, then when the object is seen by another thread, that thread will always see the correctly constructed version of that object's final fields. It will also see versions of any object or array referenced by those final fields that are at least as up-to-date as the final fields are.

The highlighted sentence gives you a guarantee that this solution will work. Although, as pointed out at the start of the answer, the original has to work too, but I'll leave this answer here as the specification is slightly confusing. And because this "trick" also works when setting non-final but volatile fields (from any context, not just constructors).

Community
  • 1
  • 1
biziclop
  • 48,926
  • 12
  • 77
  • 104
  • Can you clarify why you bolded that part of the quote? – Sotirios Delimanolis Nov 18 '15 at 17:23
  • @SotiriosDelimanolis At this exact moment, I don't really know myself. This sentence to me means that they will be as up-to-date as when the fields were set but the next section talks about freezing at the end of the constructor instead. – biziclop Nov 18 '15 at 17:35
  • @biziclop That entire page is confusing wrt to this question. I like you answer more than mine anyways. – John Vint Nov 18 '15 at 18:14
  • 1
    The bolded section does not apply here. That is about say the constructor of the final field object referencing a global field somewhere. If John could undelete his question and add the explanation given in 17.5.1? I'll probably forget to write an answer tonight and I'm certainly not going to write a good answer on my phone :-) – Voo Nov 18 '15 at 18:32
  • @Voo `That is about say the constructor of the final field object referencing a global field somewhere.` No, it definitely isn't. It's about what happens when the final field of the currently constructed object is seen by another thread. – biziclop Nov 18 '15 at 18:35
  • @biziclop Mhm for me it says if the constructor writes or reads a global field, when another thread sees the final field that was created by that constructor it will also see at least as new values on the fields accessed during constructor creation. – Voo Nov 18 '15 at 19:16
  • @Voo Hmm, could be, it isn't how it reads to me. That would be guaranteed by the simple fact that happens-before relationships are transitive. Having said that, the same can be said about my interpretation. I'm tempted to delete my answer and ask a new question instead :) – biziclop Nov 18 '15 at 19:30
  • 1
    I'll undelete my answer considering we aren't quite sure – John Vint Nov 18 '15 at 21:28
  • 2
    @biziclop Question is who could answer that one then, if we don't get lucky and Brian or someone comes along :-) I did add my interpretation to the question in any case, but it's certainly a tricky case! – Voo Nov 18 '15 at 22:20
  • @Voo It is embarrassing. I updated my answer to point at your answers too. I just wish I could get rid of the undeserved upvotes without deleting my answer completely. – biziclop Nov 18 '15 at 22:42
  • Does Java (yet) have any way of allowing "effectively immutable" objects to lazily generate things like arrays safely without having to add an extra level of indirection to every access? – supercat Nov 19 '15 at 00:34
  • @supercat What do you mean by "extra level of indirection"? A getter? Or a layer beyond the getter? – biziclop Nov 19 '15 at 10:19
  • @biziclop: If one creates an array and populates it, then uses `myLazyArray = new UnchangingArrayHolder(arr);` and h stores it in a public final field `arr`, Java will guarantee that if any thread does `UnchangingArrayHolder temp = myThing.myLazyArray;` and `temp` is not null, `temp.arr` will hold the data that was written to `arr` before the reference to it was stored [assuming nothing was written to it since] but all references on foreign threads must go through an `UnchangingArrayHolder` instance. – supercat Nov 19 '15 at 16:41
  • @supercat Since you have to initialize the construct safely anyhow when it is first accessed, as long as you're accessing the array through another class I don't see a real problem there. `int[] myArr = myThing.getMyArr()` could certainly fulfill your requirements if I don't misunderstand them (as long as `getMyArr` is correctly implemented) – Voo Nov 19 '15 at 18:12
  • @Voo: If code receives a `myThing` which hasn't been used in so long that no information related to it is in any layer of cache, then accessing one element of the array from scratch will incur three slowest-level cache misses--one to access the `myThing`, one to access the `UnchangingArrayHolder`, and one to access the array itself. If `myThing` could hold a direct reference to the array, the cost would be only two cache misses. Each of those cache misses is likely to take longer than 100 normal instructions, so going from 2 to 3 can be a major speed penalty. – supercat Nov 19 '15 at 19:25
  • @supercat I still don't see why you'd need an `UnchangingArrayHolder` to implement `getMyArr`, sure you couldn't make the array final then, but since you want to lazily initialize it anyhow you need some synchronization when accessing it in any case - but that doesn't involve the need for an additional indirection. – Voo Nov 19 '15 at 19:31
  • @supercat `If code receives a myThing which hasn't been used in so long that no information related to it is in any layer of cache, then accessing one element of the array from scratch will incur three slowest-level cache misses--one to access the myThing, one to access the UnchangingArrayHolder, and one to access the array itself` That is, unless Hotspot optimizes the whole thing away. :) – biziclop Nov 19 '15 at 19:42
  • @biziclop: If there were just one object which got dropped from cache, there would be no issue. The performance cost may be huge, however, if if one needs to access one array element from each of a million instances of `myThing`, then access another element from each, etc. The extra layer of indirection could impose a nearly 50% extra time penalty on code which is already slow. – supercat Nov 19 '15 at 19:47
  • @supercat That's why I mentioned Hotspot. Aggressive inlining can potentially get rid of any extra layers of indirections. You don't know what will happen exactly until you code it and test it. – biziclop Nov 19 '15 at 20:08
  • @biziclop HotSpot could inline the call, but it couldn't (well it could theoretically but it doesn't at the moment) change the layout of the class. – Voo Nov 19 '15 at 20:18
  • In any case, though I think this is an entirely different (though very interesting) question, I don't think this comment stream is the best platform to discuss it. – biziclop Nov 19 '15 at 20:20
5

According to the JLS

The usage model for final fields is a simple one: Set the final fields for an object in that object's constructor; and do not write a reference to the object being constructed in a place where another thread can see it before the object's constructor is finished.

Since the write to the List occurs prior to the constructor completing you are safe in mutating the list without additional synchronization.

edit: Based on Voo's comment I will make edit with the inclusion of final field freezing.

So reading more into 17.5.1 there is this entry

Given a write w, a freeze f, an action a (that is not a read of a final field), a read r1 of the final field frozen by f, and a read r2 such that hb(w, f), hb(f, a), mc(a, r1), and dereferences(r1, r2),

I interpret this as the action to modify the array happens-before the later derefencing of r2 which is the non-synchronized read after the freeze completes (the constructor exists).

John Vint
  • 39,695
  • 7
  • 78
  • 108
2

Because the object is not inherently immutable, you must safely publish the object. As long as you do this, there is no need to do the mutations in the constructor in a synchronized block though.

Objects can be "safely published" in a number of ways. An example is passing them to another thread via a properly synchronized queue. For more details see Java Concurrency in Practice section 3.5.3 "Safe publication idioms" and 3.5.4 "Effectively Immutable Objects"

Enno Shioji
  • 26,542
  • 13
  • 70
  • 109
  • And by "safely publish the object", we're talking about synchronizing access/initialization of the `ListInConstructor` instance, right? – Snild Dolkow Nov 18 '15 at 16:41
  • 1
    @SnildDolkow Or setting it as a value of a `volatile` or `final` field, which guarantee that the object will be seen at least as up-to-date as it was at the time of the write to the variable. – biziclop Nov 18 '15 at 16:42
  • 1
    This is perfectly safe in any case (yes also when assigning the instance to a non final, non volatile field). He'll either see the previous object or if he sees the newly created one he's guaranteed to see the fully constructed value since it's final (the only way he wouldn't is if he leaked the this reference before the constructor finishes) – Voo Nov 18 '15 at 16:52
  • @Voo It is guaranteed you'll see a correctly constructed `ArrayList`, but whether it will contain the extra element is not. – biziclop Nov 18 '15 at 16:55
  • 1
    @biziclop not how I read 17.5.1: "A freeze action on final field f of o takes place when c exits, either normally or abruptly." The freeze action does not happen when writing to f. The hb relationship is then defined in terms of f, so this seems safe to me. – Voo Nov 18 '15 at 17:11
  • @Voo Yes, this seems to suggest that the objects will be as up-to-date as they were at the end of the constructor and not as at the time of setting the field. Even though the previous section explicitly says that. – biziclop Nov 18 '15 at 17:16
  • @biziclop you mean the introduction section 17.5? I don't read anything in there as disagreeing with this interpretation, but it is somewhat of an edge case so it's a bit vague. – Voo Nov 18 '15 at 18:28
2

Ok so this is what JLS §17.5.1 has to say on the topic.

First of all:

Let o be an object, and c be a constructor for o in which a final field f is written. A freeze action on final field f of o takes place when c exits, either normally or abruptly.

So we know that in our code:

public ListInConstructor() {
    list = new ArrayList<>();
    list.add(new Object());
} // the freeze action happens here!

So now the interesting part:

Given a write w, a freeze f, an action a (that is not a read of a final field), a read r1 of the final field frozen by f, and a read r2 such that hb(w, f), hb(f, a), mc(a, r1), and dereferences(r1, r2), then when determining which values can be seen by r2, we consider hb(w, r2).

So let's do this one piece at a time:

We have hb(w,f), which means we write to the final field before leaving the constructor.

r1 is the read of the final field and dereferences(r1, r2). This means that r1 reads the final field and r2 then reads some value of this final field.

We further have an action (a read or write, but not a read of the final field) which has hb(f,a) and mc(a, r1). This means that the action happens after the constructor but can be seen by the read r1 afterwards.

And consequently it states that "we consider hb(w, r2)", which means that the write has to happen before the read to a value of the final field that was read with r1.

So the way I see it, it is clear that the object added to the list will have to be visible by any thread that can read list.

On a sidenote: HotSpot implements final field semantics by putting a memory barrier at the end of any constructor that contains a final field thereby guaranteeing this property in any case. Whether that's just an optimisation (Better to do to only a single barrier and that as far away from the write as possible) is another question.

Voo
  • 29,040
  • 11
  • 82
  • 156