C/C++ volatile has a very narrow range of guaranteed uses: interacting with the outside world directly (signal handlers written in C/C++ are "outside" when they are called asynchronously). That's why volatile object accesses are defined as observables, just like console I/O and the exit value of the program (the return value of main).
A way to see it is to imagine that any volatile access is actually translated into I/O on a special console, terminal, or pair of FIFO devices named Accesses and Values, where:
- a volatile write x = v; to an object x of type T is translated to writing to the FIFO Accesses a write order specified as a 4-tuple ("write", T, &x, v)
- a volatile read (lvalue-to-rvalue conversion) of x is translated to writing to Accesses a 3-tuple ("read", T, &x) and waiting for the value on Values.
This way, volatile is exactly like an interactive console.
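The FIFO picture above can be sketched as a toy model. This is only an illustration of the mental model, not how any compiler implements volatile: ModeledVolatile, AccessOrder and demo_fifo_model are hypothetical names, and the model is restricted to integral types so that a value fits in a long long.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <typeinfo>

// One order written to the hypothetical "Accesses" FIFO.
struct AccessOrder {
    std::string kind;     // "write" or "read"
    std::string type;     // implementation-defined name of T, from typeid
    const void* address;  // &x
    long long value;      // v for a write; unused (0) for a read
};

std::deque<AccessOrder> Accesses;  // orders emitted by the program
std::deque<long long> Values;      // answers supplied by the outside world

// Hypothetical stand-in for a volatile object of integral type T.
template <typename T>
class ModeledVolatile {
public:
    // Models the volatile write `x = v;`: emit ("write", T, &x, v).
    ModeledVolatile& operator=(T v) {
        Accesses.push_back({"write", typeid(T).name(), this,
                            static_cast<long long>(v)});
        return *this;
    }

    // Models the volatile read of `x`: emit ("read", T, &x),
    // then take the answer from Values.
    operator T() const {
        Accesses.push_back({"read", typeid(T).name(), this, 0});
        assert(!Values.empty() && "the outside world must answer every read");
        T answer = static_cast<T>(Values.front());
        Values.pop_front();
        return answer;
    }
};

// The program writes 7, but the outside world answers the read with 42.
int demo_fifo_model() {
    ModeledVolatile<int> x;
    x = 7;
    Values.push_back(42);  // answer queued by the "outside world"
    int seen = x;
    return seen;
}
```

Note that the model stores nothing in x itself: a read returns whatever the outside world puts on Values, which is exactly why the compiler cannot assume that a volatile read returns the last value the program wrote.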
A nice specification of volatile is the ptrace semantics (which nobody but me uses, but it's still the nicest volatile specification ever):
- a volatile variable can be examined by the debugger/ptrace after the program has been stopped at a well defined point;
- any volatile object access corresponds to a set of well defined PC (program counter) points such that a breakpoint can be set there(**): an expression doing a volatile access translates to a set of addresses in the code where breaking causes a break at a defined C/C++ expression;
- the state of any volatile object can be modified in arbitrary ways(*) with ptrace when the program is stopped, limited only to the legal values of the object in C/C++; changing the bit pattern of a volatile object with ptrace is equivalent to adding an assignment expression in the C/C++ code at the well defined C/C++ breakpoint, so it's equivalent to changing the C/C++ source code at run time.
It means that you have a well defined ptrace observable state of the volatile objects at these points, period.
(*) But you may not set a volatile object to an invalid bit pattern with ptrace: the compiler can assume that any object has a legal bit pattern as defined by the ABI. All uses of ptrace to access volatile state must follow the ABI specification of objects shared with separately compiled code. For example a compiler can assume that a volatile number object doesn't have a negative zero value if the ABI doesn't allow it. (Obviously negative zero is a valid state, semantically distinct from positive zero, for IEEE floats.)
(**) Inlining and loop unrolling can generate many points in assembly/binary code corresponding to a unique C/C++ point; debuggers handle that by setting many PC level breakpoints for one source level breakpoint.
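To make the ptrace reading concrete, here is a minimal sketch, assuming a debugger that can stop at the volatile read. The names choose_path and force_slow_path are hypothetical. The program never stores a nonzero value, yet because the object is volatile the compiler has to perform the load and keep both branches.

```cpp
#include <cassert>

// `force_slow_path` is never set to nonzero by the program itself.
// Being volatile, it may be changed from a debugger while the program
// is stopped at the read, so the branch below cannot be folded away.
int choose_path() {
    volatile int force_slow_path = 0;  // initialisation: a volatile write
    if (force_slow_path) {             // volatile read: a breakpoint location
        return 2;  // reachable only if the object was modified from outside
    }
    return 1;
}

// Without outside intervention the object still holds 0.
void demo_choose_path() {
    assert(choose_path() == 1);
}
```

Dropping the volatile qualifier would allow the compiler to reduce the whole function to return 1; and changing the variable in the debugger would then have no supported effect.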
The ptrace semantics doesn't even imply that a volatile local variable is stored on the stack and not in a register; it implies that the location of the variable, as described in the debugging data, is modifiable either in addressable memory via its stable address on the stack (stable for the duration of the function call, obviously) or in the representation of the saved registers of a paused program, which is a temporary complete copy of the registers as saved by the scheduler when a thread of execution is paused.
[In practice all compilers provide a stronger guarantee than the ptrace semantics: that all volatile objects have a stable address even if their address is never taken in C/C++ code; this guarantee is sometimes not useful and strictly pessimistic. The lighter ptrace semantics guarantee is extremely useful in itself for automatic variables held in registers in "high level assembly".]
You can't examine a running program (or thread) without stopping it; you cannot observe from any CPU without synchronization (ptrace provides such synchronization).
These guarantees hold at any optimization level. At minimum optimization, all variables are in fact practically volatile and the program can be stopped at any expression.
At higher optimization levels, computations are reduced and variables can even be optimized out if they hold no useful information for any legal run; the most obvious case is a "quasi const" variable, which isn't declared const but is used as if const: set once and never changed. Such a variable carries no information at runtime if the expression that was used to set it can be recomputed later.
Many variables that carry useful information still have a limited range: if there is no expression in a program that can set an object of signed integer type to a mathematically negative result (a result that is truly negative, not negative because of overflow in a two's complement system), the compiler can assume that the object never has a negative value. Any attempt to set it to a negative value in the debugger or via ptrace would be unsupported, as the compiler can generate code that integrates the assumption; making the object volatile would force the compiler to allow any possible legal value for the object, even if only assignments of positive values are present in the complete code (the code in all paths that can access that object, in every TU (translation unit) that can access the object).
Note that for any object that is shared beyond the set of collectively translated code (all TUs that are compiled and optimized together), nothing about the possible values of the object can be assumed besides the applicable ABI.
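The range assumption described above can be illustrated with a small sketch. All names here are hypothetical, and whether a given compiler actually performs the folding depends on its version and optimization level; the point is only what it is allowed to assume.

```cpp
#include <cassert>

namespace {
// Internal linkage and the address never escapes, so the compiler can see
// every store in the program: only 1 and 2 are ever written.
int level = 1;
// Same stores, but volatile: any legal int value must be expected.
volatile int vlevel = 1;
}

void raise_levels() {
    level = 2;
    vlevel = 2;
}

// May legitimately be compiled down to `return false;`.
bool level_is_negative() { return level < 0; }

// Must perform the load and the comparison every time.
bool vlevel_is_negative() { return vlevel < 0; }

void demo_levels() {
    raise_levels();
    assert(!level_is_negative());
    assert(!vlevel_is_negative());
}
```

Setting level to -1 from a debugger is unsupported: level_is_negative() may still return false. Setting vlevel to -1 at a breakpoint on its read is supported, and vlevel_is_negative() would then return true.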
The trap (not trap as in computing) is to expect Java volatile-like semantics, at least in single-CPU, linear, ordered programming (where there is by definition no out-of-order execution, as there is only one POV on the state: that of the one and only CPU):
int *volatile p = 0;
p = new int(1);
There is no volatile guarantee that p can only be null or point to an object with value 1: there is no volatile ordering implied between the initialization of the int and the setting of the volatile object, so an async signal handler or a breakpoint on the volatile assignment may not see the int initialized.
But the volatile pointer may not be modified speculatively: until the compiler obtains the guarantee that the rhs (right hand side) expression will not throw an exception (and thus leave p untouched), it cannot modify the volatile object (as a volatile access is an observable by definition).
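For contrast, if you actually want the "null or pointing to a fully initialized int" guarantee, the usual tool is std::atomic rather than volatile. The following is a minimal sketch of that different technique, with hypothetical names (published, publish, observe); for use from an async signal handler it additionally assumes that std::atomic<int*> is lock-free on the target.

```cpp
#include <atomic>
#include <cassert>

std::atomic<int*> published{nullptr};

void publish() {
    int* fresh = new int(1);  // intentionally never freed in this sketch
    // Release store: the initialisation of *fresh is ordered before the
    // pointer becomes visible to an acquire load.
    published.store(fresh, std::memory_order_release);
}

// What an observer could do; returns -1 while nothing is published.
int observe() {
    int* seen = published.load(std::memory_order_acquire);
    return seen ? *seen : -1;
}

void demo_publish() {
    assert(observe() == -1);
    publish();
    assert(observe() == 1);
}
```

Here the ordering comes from the release/acquire pair, which is precisely the guarantee that the volatile qualifier on p does not give.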
Going back to your code:
INTENABLE = 0; // volatile write (A)
my_var += 5; // normal write
INTENABLE = 1; // volatile write (B)
Here INTENABLE is volatile, so all accesses to it are observable; the compiler must produce exactly those side effects. The normal writes are internal to the abstract machine, and the compiler needs only to preserve them with respect to producing the correct result, without accounting for any signals, which are outside the abstract semantics of C/C++.
In terms of ptrace semantics, you can set a breakpoint at points (A) and (B) and observe or change the value of INTENABLE, but that's all. Although my_var may not be optimized out completely, as it is accessible by outside code (the signal handling code), there is nothing else in that function that can access it, so the concrete representation of my_var doesn't have to match its value according to the abstract machine at that point.
It's different if you have a call to a truly external (not analyzable by the compiler, outside the "collectively translated code") do-nothing function in between:
INTENABLE = 0; // volatile write (A)
external_func_1(); // actual NOP, but can access my_var
my_var += 5; // normal write
external_func_2(); // actual NOP, but can access my_var
INTENABLE = 1; // volatile write (B)
Note that both of these calls to do-nothing-possibly-do-anything external functions are needed:
- external_func_1() possibly observes the previous value of my_var
- external_func_2() possibly observes the new value of my_var
These calls are to external, separately compiled NOP functions and have to be made according to the ABI; thus all globally accessible objects must carry the ABI representation of their abstract machine value: the objects must reach their canonical state, unlike the optimized state where the optimizer knows that the concrete memory representations of some objects have not yet reached the value of the abstract machine.
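In GCC such a do-nothing external function can be spelled either asm("" : : : "memory"); or just asm(""); (the GCC manual documents only the "memory" clobber form as a compiler-level memory barrier, so that is the safer spelling). The "memory" clobber is vaguely specified but clearly means "accesses anything in memory whose address has been leaked globally".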
[Note that here I'm relying on the transparent intent of the specification and not on its words, as the words are very often badly chosen(#) and not used by anyone to build an implementation anyway; only the opinion of people counts, the words never do.
(#) at least in the world of common programming languages, where people don't have the qualifications to write formal or even correct specifications. ]
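Putting the pieces together, the critical section with the two do-nothing barriers could look like the sketch below. It assumes GCC or Clang (the asm statement is an extension), uses an ordinary volatile int as a stand-in for the real INTENABLE register, and compiler_barrier is a hypothetical helper name. This is a compiler-level barrier only; it emits no CPU fence instruction, which matches the single-CPU interrupt scenario discussed here.

```cpp
#include <cassert>

volatile int INTENABLE = 1;  // stand-in for the interrupt-enable register
int my_var = 0;              // ordinary object shared with the handler

// GCC/Clang extension: an empty asm statement that claims to read and
// write memory, playing the role of a separately compiled NOP function.
inline void compiler_barrier() { asm volatile("" : : : "memory"); }

void add_five_with_interrupts_masked() {
    INTENABLE = 0;       // volatile write (A)
    compiler_barrier();  // my_var must be in its canonical state (old value)
    my_var += 5;         // normal write
    compiler_barrier();  // my_var must be in its canonical state (new value)
    INTENABLE = 1;       // volatile write (B)
}

void demo_critical_section() {
    add_five_with_interrupts_masked();
    assert(my_var == 5);
    assert(INTENABLE == 1);
}
```

Without the two barriers, the compiler would be free to perform the update of my_var before (A) or after (B), since nothing inside the function could tell the difference.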