From my understanding of persistent programming models, it is up to the programmer to correctly distinguish volatile (transient) variables from persistent variables. Persistent variables require some kind of atomic update so that, if a power failure occurs, the recovery program can restore the program's data to a consistent state.
For example, if I insert a node at the beginning of a linked list in a program backed by persistent memory, I would have to do the following:
- Create the new node and fill it with the new data.
- Link the new node's "next" pointer to the current head of the linked list.
- Update the head pointer to point to the new node.
Each step has to be covered by either an undo log or a redo log so that the recovery program can bring the data back to a consistent state. Also, each step must be made durable through cache-line flushes and fences (a rough sketch of this is below).
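Here is a minimal sketch of that insert, assuming x86-style flush/fence intrinsics (`_mm_clflush`, `_mm_sfence`) and a hypothetical single-entry undo log. It is illustrative only; real code would use a persistent-memory library and handle details such as cache-line alignment and log recovery that this sketch glosses over.

```c
#include <immintrin.h>
#include <stddef.h>

struct node {
    int          data;
    struct node *next;
};

static void flush_and_fence(void *addr)
{
    _mm_clflush(addr);   /* write the cache line back toward the persistence domain */
    _mm_sfence();        /* order the flush before any later stores */
}

/* Hypothetical single-entry undo log, for illustration only: a real log
 * would itself live in persistent memory and hold many entries plus a
 * valid flag that recovery checks before rolling back. */
static struct {
    struct node **addr;
    struct node  *old;
} undo_entry;

static void undo_log_store(struct node **addr)
{
    undo_entry.addr = addr;
    undo_entry.old  = *addr;
    flush_and_fence(&undo_entry);    /* log entry must be durable before the update */
}

void list_push_front(struct node **head, struct node *n, int value)
{
    /* Step 1: create the new node with the new data. */
    n->data = value;

    /* Step 2: link its "next" pointer to the current head. */
    n->next = *head;
    flush_and_fence(n);              /* node contents must be durable before step 3 */

    /* Step 3: log the old head, then update and persist the head pointer. */
    undo_log_store(head);
    *head = n;
    flush_and_fence(head);
}
```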
Programmers therefore want to identify which variables must survive a failure and which do not matter, so that the latter can skip the overhead of logging and flush-and-fence. This is much harder than it sounds, and there is a lot of research and tooling around it.
Now here is my question. Assuming that my understanding of the persistent programming model is correct, would treating all volatile variables (e.g., a loop counter) as persistent variables ever be incorrect? I understand that it would cause significant overhead and would be classified as a "performance bug". I would appreciate any cases where persisting a volatile variable would affect the correctness of the recovery program (a rough illustration of what I mean is below).
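To make the scenario concrete, this is roughly what I mean by treating a loop counter as persistent: the counter is flushed and fenced on every update even though it is not needed after a crash. The `flush_and_fence` helper is the same hypothetical one as in the sketch above.

```c
#include <immintrin.h>
#include <stddef.h>

/* Same hypothetical flush-and-fence helper as in the earlier sketch. */
static void flush_and_fence(void *addr)
{
    _mm_clflush(addr);
    _mm_sfence();
}

/* The loop counter `i` is ordinarily a volatile (transient) variable;
 * here it is treated as persistent, adding a flush+fence per iteration. */
long sum_array(const int *a, size_t n, long *persistent_sum)
{
    for (size_t i = 0; i < n; i++) {
        *persistent_sum += a[i];
        flush_and_fence(persistent_sum);  /* the running sum is persistent state */
        flush_and_fence(&i);              /* persisting the counter: pure overhead,
                                             but is it ever incorrect? */
    }
    return *persistent_sum;
}
```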
Thank you.