
Everybody says that immutable objects are thread safe, but why is this?

Take the following scenario running on a multi-core CPU:

  • Core 1 reads an object at memory location 0x100 and it is cached in the L1/L2 cache of Core 1;
  • The GC collects this object at that memory location because it has become eligible and 0x100 becomes available for new objects;
  • Core 2 allocates an (immutable) object which is located at address 0x100;
  • Core 1 gets a reference to this new object and reads it at memory location 0x100.

In this situation, when Core 1 asks for the value at location 0x100, is it possible that it reads stale data from its L1/L2 cache? My intuition says that a memory gate is still needed here to ensure that Core 1 reads the correct data.

Is the above analysis correct and is a memory gate required, or am I missing something?
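
To make the scenario concrete, here is a minimal sketch of the kind of cross-thread publication I have in mind (the type and member names are just an illustration, not part of the scenario above). The `volatile` field, or an equivalent fence or `Interlocked.Exchange`, represents the "memory gate" in question:

```csharp
using System;
using System.Threading;

// An immutable object: all state is assigned in the constructor and is read-only afterwards.
sealed class Payload
{
    public readonly int Value;
    public Payload(int value) { Value = value; }
}

static class Program
{
    // The 'volatile' modifier stands in for the "memory gate": it orders the writes to the
    // object's fields before the write of the reference, and the read of the reference
    // before the reads of the fields.
    static volatile Payload shared;

    static void Main()
    {
        var writer = new Thread(() =>
        {
            // "Core 2": allocate a new immutable object and publish the reference.
            shared = new Payload(42);
        });

        var reader = new Thread(() =>
        {
            // "Core 1": spin until the reference becomes visible, then read the object.
            Payload p;
            while ((p = shared) == null) { /* spin */ }
            Console.WriteLine(p.Value); // expected to print 42, never stale/uninitialized data
        });

        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
    }
}
```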

UPDATE:

The situation I describe here is a more complex version of what happens every time the GC does a collection. When the GC collects, memory is reordered: objects are moved, so the physical location an object sits at changes, and the L1/L2 caches must be invalidated. Roughly the same applies to the example above.

Since it is reasonable to expect that .NET ensures that, after reordering memory, different cores see the correct memory state, the above situation will not be a problem either.

Pieter van Ginkel
  • There's a difference between an object's immutability and the immutability (or lack of) of the reference to it. – spender Mar 25 '11 at 14:14
  • @Pieter: Not sure how the GC works in .NET, but why would it free the memory at 0x100 since there is already a reference to the object? – Cratylus Mar 25 '11 at 14:16
  • @spender - That's true, but for the sake of argument, let's say that we do guarantee that the address of the new (immutable) object is correct for Core 1. That should not change the scenario. The whole idea is that Core 1 wants to read the new object, but gets stale results from its L1/L2 cache. – Pieter van Ginkel Mar 25 '11 at 14:17
  • @user384706 - because the object that originally occupied `0x100` becomes eligible for collection. (updated the question) – Pieter van Ginkel Mar 25 '11 at 14:18
  • *Personally* I suspect the issue here lies in bullet 2's assumption about eligibility; re the cross-core L1/L2 conflict, that may depend on architecture; how *exactly* L1/L2 are shared between cores/HTs depends on the chip (assuming they are in the same CPU package). A deceptively complex question... – Marc Gravell Mar 25 '11 at 14:28
  • I agree with some of the other commenters. In this case, the actual object was immutable, but the _reference_ was not -- core 2 didn't mutate an immutable object, it created one and put it at 0x100. Core 1's reference is just that -- a reference, and it is not really part of the immutable object -- it's much more like a pointer. So yah -- I think you would want a memory gate, and I would probably opt for an Interlocked.Exchange where core 2 swaps in the object. – JMarsch Mar 25 '11 at 14:29
  • @JMarsch - oh, for sure - the entire question about immutability is irrelevant, IMO. In my mind this is primarily about the reference and GC's behaviour with L1/L2 caching. Indeed, the chances of the next use of 0x100 being the same object type and correctly aligned are ... slim. – Marc Gravell Mar 25 '11 at 14:30
  • @Marc Gravell -- I see where you're going. Yah. That's a deep subject. Joe Duffy writes about that sort of thing a lot. I've always assumed that when the GC performs a collection, there just has to be a memory gate. And that would clear up this scenario, wouldn't it -- the GC uses what, a release gate (or maybe a full fence??), and that means that core 1 will go back to main memory when it reads that reference. Does that sound plausible? – JMarsch Mar 25 '11 at 14:35
  • Maybe I'm misunderstanding something, but this sentence seems bogus to me: "As far as I understand, it is totally possible for Core 1 in this situation to read the memory at location 0x100 from its L1/L2 cache, so it could read stale data." If I just changed what is at location 0x100, wouldn't that invalidate the cache line that contains that address? – R. Martinho Fernandes Mar 25 '11 at 14:36
  • @JMarsch as you say; it sounds not just plausible but an essential demand. – Marc Gravell Mar 25 '11 at 14:37
  • @Martinho Fernandes - **That is exactly the question I have**: is this guaranteed and, if so, how? – Pieter van Ginkel Mar 25 '11 at 14:37
  • @Pieter: if that was your question, I believe that mixing it with .NET, GC, and immutability harmed it :(. I think it is solely a question about cache consistency in a multi-core environment. I'm not sure about my assertion about invalidation above, so could you please edit the question to make that clear, so someone that knows can say something on that matter? – R. Martinho Fernandes Mar 25 '11 at 14:43
  • @Martinho Fernandes - I wasn't sure how I could make the question clear. The reason I ask about immutable objects is just as an example. Do you have a specific suggestion because I'm at a loss. – Pieter van Ginkel Mar 25 '11 at 14:49
  • @Martinho and Pieter: The way the cache gets invalidated is with a gate. All the lock/compare-exchange operations cause a gate. The GC would pretty much have to (though I haven't read the code). Creating the immutable object and putting it at 0x100 is not inherently thread-safe (again, I'd probably use an Exchange for that), but after it is set (and with appropriate gating around the set), then reads from it should be thread-safe (assuming that the members you are reading from it are also immutable). – JMarsch Mar 25 '11 at 16:22
  • @JMarsch - It hit me when I realized that a GC collect also reorders memory (see my updated question) and the same situation applies there. Your analysis sounds very plausible. Thank you for your answer. – Pieter van Ginkel Mar 25 '11 at 17:02
  • @Pieter yah, I think the memory fence takes care of that. By the way, I'm pretty sure that the only time memory reordering occurs is on a Gen 2 collection. I thought that the reordering was another reason that Gen 2 collections are so much more expensive than Gen 0 or 1. Either way, the GC probably uses a fence on all collections, thus signalling that the caches need to be handled. – JMarsch Mar 25 '11 at 21:18

4 Answers


The object's immutability isn't the real question in your scenario. Rather, the issue in your description revolves around the reference, list, or other system that points to the object. That system would of course need some technique to make sure the old object is no longer available to a thread that may have tried to access it.

The real point of immutable objects' thread safety is that you don't need to write a bunch of code to achieve it. Rather, the framework, OS, CPU (and whatever else) do the work for you.
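
As a rough sketch of what that buys you (my example, not from the question): once an instance of an immutable type has been published, readers never need a lock, because nothing about the instance can change after construction; only the publication of the reference itself needs synchronization, and the runtime and the constructs you publish it through take care of that.

```csharp
// Illustrative only: an immutable type whose readers need no locking.
public sealed class Point
{
    public readonly double X;
    public readonly double Y;

    public Point(double x, double y)
    {
        X = x;
        Y = y;
    }

    // "Mutation" produces a new instance; threads holding the old one are unaffected.
    public Point WithX(double x)
    {
        return new Point(x, Y);
    }
}
```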

John Fisher
  • I think I wasn't clear in my question. I've made a few updates to it which should explain my question better. – Pieter van Ginkel Mar 25 '11 at 14:24
  • @Pieter: Maybe you misunderstood my answer? Any required memory gate is already in place -- and yes it is likely required. However, it's deep within the system you're programming on top of, so you can write your code as if no memory gate is needed. – John Fisher Mar 25 '11 at 14:29
  • @John Fisher - How do you mean the memory gate is already in place? – Pieter van Ginkel Mar 25 '11 at 14:35
  • @Pieter: The garbage collector, Operating System, Framework, and/or class that you're using to manage the objects have the thread synchronization in them already. As a developer, you don't need to care about it. – John Fisher Mar 25 '11 at 15:18
  • @John Fisher - I missed an important point and thus asked the wrong question. The same thing happens when memory is reordered by a GC collection: the data at the old memory location becomes invalid and, of course, there must be a mechanism to invalidate the L1/L2 cache in that situation too. The same applies to my example, albeit my example doesn't pose the question clearly. Thank you for sending me in the right direction. – Pieter van Ginkel Mar 25 '11 at 15:37

I think what you're asking is whether, after an object is created, the constructor returns, and a reference to it is stored somewhere, there is any possibility that a thread on another processor will still see the old data. You offer as a scenario the possibility that a cache line holding instance data for the object was previously used for some other purpose.

Under an exceptionally weak memory model such a thing might be possible, but I would expect any useful memory model, even a relatively weak one, to ensure that dereferencing an immutable object is safe, even if that safety required padding objects enough that no cache line is shared between object instances. (The GC will almost certainly invalidate all caches when it's done; but without such padding, an immutable object created by core #2 might share a cache line with an object that core #1 had previously read.) Without at least that level of safety, writing robust code would require so many locks and memory barriers that it would be hard to write multi-processor code that wasn't slower than single-processor code.

The popular x86 and x64 memory models provide the guarantee you seek, and go much further. Processors coordinate 'ownership' of cache lines; if multiple processors want to read the same cache line, they can do so without impediment. When a processor wants to write a cache line, it negotiates with other processors for ownership. Once ownership is acquired, the processor will perform the write. Other processors will not be able to read or write the cache line until the processor that owns the cache line gives it up. Note that if multiple processors want to write the same cache line simultaneously, they will likely spend most of their time negotiating cache-line ownership rather than performing actual work, but semantic correctness will be preserved.
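
To see that ownership negotiation cost in action, here is a rough micro-benchmark sketch (mine, not part of the original answer; the exact numbers depend heavily on the CPU): two threads incrementing counters that sit on the same cache line spend much of their time trading ownership of that line, while counters padded onto separate lines do not.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class FalseSharingDemo
{
    const int Iterations = 50000000;

    static long Run(long[] data, int indexA, int indexB)
    {
        var sw = Stopwatch.StartNew();
        Parallel.Invoke(
            () => { for (int i = 0; i < Iterations; i++) data[indexA]++; },
            () => { for (int i = 0; i < Iterations; i++) data[indexB]++; });
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Adjacent slots almost certainly share a 64-byte cache line.
        var sameLine = new long[2];
        // Slots 16 longs (128 bytes) apart should land on different cache lines.
        var padded = new long[32];

        Console.WriteLine("same cache line: " + Run(sameLine, 0, 1) + " ms");
        Console.WriteLine("padded apart:    " + Run(padded, 0, 16) + " ms");
    }
}
```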

supercat

You're missing that it would be a bad garbage collector indeed that let such a thing happen. The reference on core 1 should have prevented the object from being GCd.
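
A small sketch of that point (my example, not from the answer): as long as some root still holds a strong reference to the object, it cannot be collected, so step 2 of the question's scenario cannot happen to an object that Core 1 can still reach.

```csharp
using System;

static class ReachabilityDemo
{
    static void Main()
    {
        var strong = new object();
        var weak = new WeakReference(strong);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine(weak.IsAlive); // True: 'strong' still roots the object

        GC.KeepAlive(strong);            // keep the reference considered live up to here
        strong = null;

        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine(weak.IsAlive); // typically False now; Debug builds may extend
                                         // the local's lifetime and still print True
    }
}
```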

Ernest Friedman-Hill
  • I think I wasn't clear in my question. The idea here is that Core 1 wants to read the new object, but the data in L1/L2 is from an object that Core 1 previously had access to, but which was collected in the meantime. – Pieter van Ginkel Mar 25 '11 at 14:20
  • That's not the way the garbage collector works. Giving a thread a reference to an immutable object requires keeping that reference stored somewhere so you can hand it to the other thread, which automatically ensures it cannot be collected. Stale data is of course quite possible; that's never not a problem. – Hans Passant Mar 25 '11 at 15:08

I'm not sure that a memory gate would change this scenario, as that would surely only affect subsequent reads... and then the question becomes: reads from where? If it is from a field (which must at a minimum be static, or an instance field of some instance still on the stack or otherwise reachable) or a local variable, then by definition the object isn't available for collection.

Re the scenario where that reference is only now in the registers... that is far trickier. Intuitively I want to say "no that isn't a problem", but it would take a detailed look at the memory model to prove it. But handling references is such a common scenario that simply: this has to work.

Marc Gravell
  • Sorry, I wasn't clear in my question. The idea here is that the object Core 1 reads at the last step is the correct object and the issue is not whether the reference is correct (see comments and updated question). Core 1 actually wants to read the object created by Core 2. My question is whether it is guaranteed that it actually reads the fresh data it expects to read. – Pieter van Ginkel Mar 25 '11 at 14:23