C# Garbage Collector, Threading and Compiler/Jitter optimization

Question

Let's assume that our program has a central point (an instance of a Document class), where all kinds of information are referenced. Now we have two threads. Both threads have access to our "document" and "document" contains a reference to let's say "params" (an object that holds some kind of information). So if we have a reference to "document" we can use "document.params" to get our params object.

Thread 1 does the following:

Params tempParams = document.params; // get a local reference to document.params
int a = tempParams.a; // read data from params
// thread 1 (this thread) gets interrupted by thread 2
int b = tempParams.b; // read data from params
int c = tempParams.c; // read data from params

Thread 2 does the following:

Params newParams = new Params();
... // fill newParams with new parameters
lock(obj) {
    document.params = newParams; // update params in document
}

So the content of "params" is never changed; if a change is needed, a new copy is created and the reference "document.params" gets updated to point to the new Params block, which is an atomic operation.

Now the big question is:

Is it possible that the jitter might optimize the code of thread 1 in such a way that tempParams is not a memory address but a CPU register? If Thread 2 then updates the document.params reference, there is no reference in memory that points to the old "Params" block, since the reference in Thread 1 is only in a CPU register. If the garbage collector starts at just that moment, how can it see that the old "Params" block is still in use?

The other question would be: might it happen that the jitter optimizes away the tempParams variable and uses document.params.a/b/c directly? In this case Thread 1 would see the swap of the Params object, which is not intended. The use of tempParams is meant to ensure that Thread 1 accesses a/b/c from the same Params object that was in document.params when the reference was copied.

The worst that could happen is that Thread 1 will still read the *old* `document.params` the next time, even though you've changed it from Thread 2 in the meantime. That's why you want to use locks or memory barriers around every access to a shared field. Or avoid using shared fields altogether :D — Luaan, Aug 05 '15 at 12:18
Just updated the lock for updating the params field of document. The missing lock in Thread 1 is intended and it is ok that there might be a delay before Thread 1 sees the changes done by Thread 2. — bebo, Aug 05 '15 at 12:22
There is exceedingly little point in writing programs that are guaranteed to have a bug when you debug them. — Hans Passant, Aug 05 '15 at 12:25
As long as you're content with eventual consistency, it's *probably* safe. Make sure you don't rely on the order of operations outside of that lock, though (e.g. setting a flag along with the assignment or something). It's theoretically possible that Thread 1 will *never* see the updated value, but it probably isn't an issue in realistic code. It is the kind of thing that could plausibly happen in future versions of C#/JIT/CLR, though. — Luaan, Aug 05 '15 at 12:25
@Hans Passant: Where exactly do you see a bug? It's working. But it might be working by accident rather than by design. — bebo, Aug 05 '15 at 20:55

Jon Skeet · Accepted Answer · 2015-08-05T13:53:38.650

Is it possible that the jitter might optimize the code of thread 1 in such a way that tempParams is not a memory address but a CPU register?

I suspect that's possible - but that won't stop the garbage collector from treating it as a use of the reference. If it did, that would be a GC bug.
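A minimal sketch of that point, using hypothetical `Document` and `Params` types as stand-ins for the ones in the question (the field is named `Parameters` here because `params` is a reserved word in C#). It makes no assumption about whether the JIT keeps `temp` in a register or on the stack: either way the local is still used after the collection, so the old object should remain reachable.

```csharp
using System;

// Hypothetical stand-ins for the types described in the question.
sealed class Params
{
    public readonly int A, B, C;
    public Params(int a, int b, int c) { A = a; B = b; C = c; }
}

sealed class Document
{
    public Params Parameters = new Params(1, 2, 3);
}

static class GcDemo
{
    // Returns true if the old Params object survived a collection
    // while only a local variable still referenced it.
    public static bool OldParamsSurvives()
    {
        var document = new Document();
        Params temp = document.Parameters;          // local copy of the reference
        var weak = new WeakReference(temp);         // lets us observe whether it was collected

        document.Parameters = new Params(4, 5, 6);  // the field no longer points at the old object
        GC.Collect();
        GC.WaitForPendingFinalizers();

        bool alive = weak.IsAlive;
        GC.KeepAlive(temp);                         // temp counts as in use up to this call
        return alive;
    }

    static void Main()
    {
        Console.WriteLine(OldParamsSurvives());
    }
}
```

The `GC.KeepAlive(temp)` call stands in for the later reads of `tempParams.b` and `tempParams.c` in the question: the local is considered live because code after the collection still uses it.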

The other question would be: might it happen that the jitter optimizes away the tempParams variable and uses document.params.a/b/c directly.

That would be a JIT bug, IMO. There's no guarantee that thread 1 will see a change to document.params (so that's still a risk to consider in different scenarios), but given that it has copied the reference into a local variable (tempParams) and that variable never changes its value, all accesses via tempParams will address the same object. (There's no risk that tempParams.a will read from one object but tempParams.b will read from a different one.)

Just to bring some of the commentary below into this answer - there's some discussion around whether it's valid for a JIT to "optimize" code such that it appears to change the value of a local variable. This MSDN article certainly suggests it would be valid, for example. I saw something similar and blogged about it a long time ago. I'm 99% sure I talked to someone (possibly Joe Duffy) around whether that effective read introduction was valid by ECMA-335, and their impression was that it wasn't. However, I can't find any definite documentation for that, and ECMA-335 is at least unclear on the matter.

The ECMA-335 (CLI) specification is definitely laxer than the CLR 2.0 model that MS has implemented for some time now, but I don't believe it's quite that lax. If you can't rely on local variables being isolated from change, it's very hard to write any valid code, IMO.
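For anyone who wants something stronger than a reasonable assumption, one commonly suggested approach is to make the read of the shared field a volatile read, on the understanding that the read-introduction discussion concerns ordinary (non-volatile) reads. A minimal sketch, assuming .NET 4.5 or later for `System.Threading.Volatile`; `Document`, `Params`, `Publish`, `Snapshot` and `SumOfOneSnapshot` are hypothetical names, not anything from the question's real code.

```csharp
using System;
using System.Threading;

// Hypothetical immutable parameter block: never modified after construction.
sealed class Params
{
    public readonly int A, B, C;
    public Params(int a, int b, int c) { A = a; B = b; C = c; }
}

sealed class Document
{
    private Params _parameters = new Params(1, 2, 3);

    // Writer side (Thread 2): publish a fully built replacement object.
    public void Publish(Params next)
    {
        Volatile.Write(ref _parameters, next);
    }

    // Reader side (Thread 1): one volatile read of the shared reference.
    public Params Snapshot()
    {
        return Volatile.Read(ref _parameters);
    }
}

static class SnapshotDemo
{
    // Reads a, b and c through the same local reference.
    public static int SumOfOneSnapshot(Document document)
    {
        Params temp = document.Snapshot();
        return temp.A + temp.B + temp.C;
    }

    static void Main()
    {
        var document = new Document();
        var writer = new Thread(() => document.Publish(new Params(10, 20, 30)));
        writer.Start();
        int sum = SumOfOneSnapshot(document); // sees the old block or the new one, not a mix
        writer.Join();
        Console.WriteLine(sum);
    }
}
```

Because `Params` is immutable and the reference is fetched exactly once, the reader works on either the old or the new block as a whole. The sketch does not make the reader see an update any sooner than the plain field read would in practice.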

edited Aug 05 '15 at 13:53

answered Aug 05 '15 at 11:52

Jon Skeet


1) According to this MSDN page (https://msdn.microsoft.com/en-US/us-en/library/Ee787088%28v=VS.110%29.aspx) there is no counter. The same is said, for example, here: http://www.levibotelho.com/development/how-does-the-garbage-collector-work/. --- 2) Is there a way to force the jitter to make a local copy and not optimize it away? – bebo Aug 05 '15 at 12:01
@bebo: I didn't suggest there *was* a counter. I said that the garbage collector will still count it as a use of the reference - i.e. it will still notice that it's in use. I'll update the answer to say "treat it as a use of the reference" to avoid ambiguity. Basically you don't need to worry about it being GC'd early. For 2) my point is that it *will* make a local copy - or at least act as if it has done so. – Jon Skeet Aug 05 '15 at 12:06
For thread 1, the compiler will copy the params value into temporary storage (whatever the optimization process does), and it will "release" document.params for the GC at thread termination or as soon as the instruction "tempParams = null" is executed. You don't have to worry about changes made by Thread 2 unless they are made before Thread 1 starts execution. – Graffito Aug 05 '15 at 12:10
@Jon Skeet: Is there any official documentation about this? The program is working as expected, but I always have a bad feeling without knowing where this behaviour is defined. If it is not documented, it might be a potential bug. – bebo Aug 05 '15 at 12:11
@bebo: It's documented that the object won't be garbage collected while your code is still using it... and you clearly *are* still using it, so that's the first part. As for the second part, you're basically asking if a local variable can be observed to change its value with no basis... I don't know that it's explicitly documented that that *won't* happen, but it seems like a reasonable assumption to make... – Jon Skeet Aug 05 '15 at 12:13
For the second question, I think it is possible to read a from one object and b from another. Joe Duffy, in his book [Concurrent Programming on Windows], said that it is perfectly legal for an optimizer (compiler, JIT, processor or whatever) to alter a program as long as the meaning of the program doesn't change when it is executed by a single thread. So this contradicts what you say in the second part of the answer. – Sriram Sakthivel Aug 05 '15 at 12:27
@SriramSakthivel: That would involve the value of a *local variable* being changed from a different thread. I don't believe that's possible. I'm not sure what you mean by "meaning of the program doesn't change if executed by a single thread". – Jon Skeet Aug 05 '15 at 12:29
@JonSkeet I see what you mean, but [that's the way it works](http://stackoverflow.com/questions/7664046/allowed-c-sharp-compiler-optimization-on-local-variables-and-refetching-value-fr). Maybe Microsoft's implementation of the CLR doesn't do that, but it is still legal. By that I mean that when your program is executed entirely on a single thread, there is no difference in the behavior. – Sriram Sakthivel Aug 05 '15 at 12:45
@SriramSakthivel: I've seen that discussed before in response to the MSDN article talked about, and refuted by those who know about these things. (I'm pretty sure I asked Joe Duffy himself about it, but I can't remember for sure.) Basically, I disagree with the assertion that the ECMA memory model allows local variables to be observed changing value in this situation. – Jon Skeet Aug 05 '15 at 12:50
@JonSkeet There are some discrepancies there. Joe Duffy in his book talks about the very same optimization (p. 517), but he says that the VC++ compiler can do this optimization while the .NET memory model doesn't allow it. I'm not sure what he meant by ".NET memory model". It is still an open question whether only the MSFT implementation prohibits that, or all implementations. – Sriram Sakthivel Aug 05 '15 at 13:31
@SriramSakthivel: I'll look at my copy when I'm at home... but I'd have to check *exactly* what he's saying, because it could be just a very small difference which would be crucial here. But looking at https://msdn.microsoft.com/en-us/magazine/JJ883956.aspx does suggest that, yes. Hmm. (Ah, but that's *not* Joe...) – Jon Skeet Aug 05 '15 at 13:34
@JonSkeet The hard part is that the article was reviewed by Joe Duffy too (see the bottom of the page) :) I believe it's true then; but it suggests that Joe's book is wrong :( – Sriram Sakthivel Aug 05 '15 at 13:51
@SriramSakthivel: Joe may well have meant the MS .NET memory model, which is stricter than the ECMA memory model... although both are unfortunately unclear. I find it very hard to see a world in which read introductions like this can be viewed as sane... for example, suppose I'm reading via a property which is inlined by the JIT - how could I (as an API consumer) know whether or not it's actually backed by a volatile field? I very much doubt that the ECMA authors *intended* that degree of laxness, even if it's unclear. I'll stir a bit to make sure that any ECMA update is clearer :) – Jon Skeet Aug 05 '15 at 13:56
