A memory fence (also called a memory barrier) is a type of instruction that enforces an ordering on the memory operations issued before and after it.
Questions tagged [memory-fences]
106 questions
76
votes
1 answer
When are x86 LFENCE, SFENCE and MFENCE instructions required?
OK, I have been reading the following SO questions about x86 CPU fences (LFENCE, SFENCE and MFENCE):
Does it make any sense instruction LFENCE in processors x86/x86_64?
What is the impact SFENCE and LFENCE to caches of neighboring cores?
Is the…

user997112
51
votes
4 answers
Java 8 Unsafe: xxxFence() instructions
In Java 8 three memory barrier instructions were added to Unsafe class (source):
/**
 * Ensures lack of reordering of loads before the fence
 * with loads or stores after the fence.
 */
void loadFence();
/**
 * Ensures lack of reordering of stores…

Alexey Malev
36
votes
2 answers
When is a compiler-only memory barrier (such as std::atomic_signal_fence) useful?
The notion of a compiler fence often comes up when I'm reading about memory models, barriers, ordering, atomics, etc., but normally it's in the context of also being paired with a CPU fence, as one would expect.
Occasionally, however, I read about…

etherice
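One case often cited for a compiler-only fence is a thread communicating with a signal handler that runs on that same thread. The following is a minimal sketch under that assumption; the names `payload`, `ready`, `seen`, `on_signal` and `run_signal_fence_demo` are hypothetical, and `SIGINT` is raised synchronously only to keep the demo deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Hypothetical demo state shared between normal code and the signal handler.
int payload = 0;             // plain data published to the handler
std::atomic<int> ready{0};   // flag the handler checks
std::atomic<int> seen{-1};   // value the handler observed

extern "C" void on_signal(int) {
    if (ready.load(std::memory_order_relaxed) == 1) {
        // Compiler-only barrier: keeps the read of payload after the flag check.
        std::atomic_signal_fence(std::memory_order_acquire);
        seen.store(payload, std::memory_order_relaxed);
    }
}

int run_signal_fence_demo() {
    // Only lock-free atomics are safe to touch from a signal handler.
    assert(ready.is_lock_free() && seen.is_lock_free());
    std::signal(SIGINT, on_signal);
    payload = 42;
    // Compiler-only barrier: keeps the write to payload before the flag store.
    // No CPU fence instruction is needed because the handler runs on this thread.
    std::atomic_signal_fence(std::memory_order_release);
    ready.store(1, std::memory_order_relaxed);
    std::raise(SIGINT);            // handler runs before raise() returns
    std::signal(SIGINT, SIG_DFL);  // restore the default disposition
    return seen.load(std::memory_order_relaxed);
}
```

Calling `run_signal_fence_demo()` should return the published value. With a handler running on another thread, `std::atomic_thread_fence` would be needed instead.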
34
votes
5 answers
What is the difference between using explicit fences and std::atomic?
Assuming that aligned pointer loads and stores are naturally atomic on the target platform, what is the difference between this:
// Case 1: Dumb pointer, manual fence
int* ptr;
// ...
std::atomic_thread_fence(std::memory_order_release);
ptr = new…

Cameron
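To make the comparison concrete, here is a hedged sketch of the two publication styles side by side. It is not the asker's code: the pointer slots are `std::atomic<int*>` rather than plain pointers (a plain pointer shared this way would be a data race under the C++ memory model), and all names (`slot_fenced`, `slot_release`, `publish_with_fence`, `publish_with_release_store`, `consume`, `run_publish_demo`) are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int*> slot_fenced{nullptr};   // published via fence + relaxed store
std::atomic<int*> slot_release{nullptr};  // published via a release store

void publish_with_fence(int* p) {
    assert(p != nullptr);
    *p = 42;  // initialise the pointee first
    // The release fence orders all earlier writes before the relaxed store below.
    std::atomic_thread_fence(std::memory_order_release);
    slot_fenced.store(p, std::memory_order_relaxed);
}

void publish_with_release_store(int* p) {
    assert(p != nullptr);
    *p = 42;
    // The ordering is attached to this one store only.
    slot_release.store(p, std::memory_order_release);
}

int consume(std::atomic<int*>& slot) {
    int* p = nullptr;
    // The acquire load pairs with either publication style.
    while ((p = slot.load(std::memory_order_acquire)) == nullptr) {
        std::this_thread::yield();
    }
    return *p;
}

int run_publish_demo() {
    int a = 0, b = 0;
    int ra = 0, rb = 0;
    std::thread reader([&] {
        ra = consume(slot_fenced);
        rb = consume(slot_release);
    });
    publish_with_fence(&a);
    publish_with_release_store(&b);
    reader.join();
    return ra + rb;
}
```

In both cases the reader is guaranteed to see the initialised pointee. The fence version orders every later relaxed atomic store after the earlier writes, whereas the release store constrains only that one store.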
22
votes
4 answers
Is a memory barrier an instruction that the CPU executes, or is it just a marker?
I am trying to understand what a memory barrier is exactly.
Based on what I know so far, a memory barrier (for example: mfence) is used to prevent instructions from being reordered across it, in either direction.
This…

Christopher
20
votes
1 answer
Cost of using final fields
We know that making fields final is usually a good idea, as we gain thread safety and immutability, which make the code easier to reason about. I'm curious whether there's an associated performance cost.
The Java Memory Model guarantees this final Field…

maaartinus
18
votes
3 answers
Is atomic decrementing more expensive than incrementing?
In his blog, Herb Sutter writes:
[...] because incrementing the smart pointer reference count
can usually be optimized to be the same as an ordinary increment
in an optimized shared_ptr implementation — just an ordinary increment instruction,
…

towi
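The asymmetry in the quote can be illustrated with a reference-count sketch. This is an assumption-laden illustration, not the `shared_ptr` implementation: `RefCounted`, `add_ref` and `drop_ref` are hypothetical names, and the `destroyed` flag stands in for actually deleting the object.

```cpp
#include <atomic>
#include <cassert>

struct RefCounted {              // hypothetical stand-in for a control block
    std::atomic<int> refs{1};    // starts with one owner
    bool destroyed = false;      // stands in for running the destructor
};

// Incrementing needs atomicity but no ordering: the caller already holds a
// reference, so the object cannot disappear underneath it.
void add_ref(RefCounted& obj) {
    obj.refs.fetch_add(1, std::memory_order_relaxed);
}

// Decrementing must publish this owner's writes (release) and, if it was the
// last owner, must observe every other owner's writes (acquire) before cleanup.
bool drop_ref(RefCounted& obj) {
    int previous = obj.refs.fetch_sub(1, std::memory_order_release);
    assert(previous > 0);
    if (previous == 1) {
        std::atomic_thread_fence(std::memory_order_acquire);
        obj.destroyed = true;
        return true;
    }
    return false;
}
```

Only the thread that drops the final reference pays for the acquire fence. The decrement also has to inspect the returned value, which the increment never does.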
17
votes
3 answers
Why is (or isn't?) SFENCE + LFENCE equivalent to MFENCE?
As we know from a previous answer to "Does it make any sense instruction LFENCE in processors x86/x86_64?", we cannot use SFENCE instead of MFENCE for sequential consistency.
An answer there suggests that MFENCE = SFENCE+LFENCE, i.e. that LFENCE…

Alex
16
votes
3 answers
In OpenCL, what does mem_fence() do, as opposed to barrier()?
Unlike barrier() (which I think I understand), mem_fence() does not affect all items in the work group. The OpenCL spec says (section 6.11.10), for mem_fence():
Orders loads and stores of a work-item executing a kernel.
(so it applies to a single…

andrew cooke
16
votes
2 answers
Why isn't a C++11 acquire_release fence enough for Dekker synchronization?
The failure of Dekker-style synchronization is typically explained with reordering of instructions. I.e., if we write
atomic_int X;
atomic_int Y;
int r1, r2;
static void t1() {
X.store(1, std::memory_order_relaxed);
r1 =…

Jason Ptacek
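For contrast with the acquire_release case asked about, the sketch below shows the store-buffering pattern with `std::memory_order_seq_cst` fences, the variant the C++ memory model does guarantee to rule out both loads reading 0. It is a self-contained illustration rather than the asker's code; `both_saw_zero_once` and `count_forbidden_outcomes` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the Dekker-style (store-buffering) test once.
// Returns true if both threads read 0, which seq_cst fences forbid.
bool both_saw_zero_once() {
    std::atomic<int> X{0}, Y{0};
    int r1 = -1, r2 = -1;
    std::thread t1([&] {
        X.store(1, std::memory_order_relaxed);
        // Full fence: the store above cannot be reordered with the load below.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r1 = Y.load(std::memory_order_relaxed);
    });
    std::thread t2([&] {
        Y.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);
        r2 = X.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the test and counts how often the forbidden outcome appeared.
int count_forbidden_outcomes(int iterations) {
    assert(iterations > 0);
    int forbidden = 0;
    for (int i = 0; i < iterations; ++i) {
        if (both_saw_zero_once()) ++forbidden;
    }
    return forbidden;
}
```

With the fences weakened to `std::memory_order_acq_rel`, the standard no longer forbids the `r1 == 0 && r2 == 0` outcome, because an acquire-release fence does not order an earlier store against a later load.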
16
votes
4 answers
The cost of atomic counters and spinlocks on x86(_64)
Preface
I recently came across some synchronization problems, which led me to spinlocks and atomic counters. I then looked further into how these work and found std::memory_order and memory barriers (mfence, lfence and sfence).
So now, it…

firda
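As a point of reference for the two primitives mentioned, here is a minimal sketch of a spinlock built on `std::atomic_flag` next to a plain atomic counter. It makes no claim about relative cost on any particular CPU; `SpinLock`, `count_with_spinlock` and `count_with_atomic` are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

class SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
public:
    void lock() {
        // Acquire: accesses inside the critical section cannot move above this.
        while (flag.test_and_set(std::memory_order_acquire)) {
            std::this_thread::yield();  // be polite while spinning
        }
    }
    void unlock() {
        // Release: accesses inside the critical section cannot move below this.
        flag.clear(std::memory_order_release);
    }
};

// Increments a plain counter under the spinlock from several threads.
long count_with_spinlock(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    SpinLock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}

// Same workload with a single atomic read-modify-write per increment.
long count_with_atomic(int threads, int per_thread) {
    assert(threads > 0 && per_thread >= 0);
    std::atomic<long> counter{0};
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter.load();
}
```

Both functions should return `threads * per_thread`. Neither uses an explicit fence: the ordering comes from the acquire and release operations on the flag, or is not needed at all for the relaxed counter.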
13
votes
3 answers
Do memory fences slow down all CPU cores?
I once read somewhere about memory fences (barriers). It said that a memory fence causes cache synchronisation between several CPU cores.
So my questions are:
How does the OS (or CPU itself) know which cores need to be synchronised?
Does it…

GreenScape
13
votes
2 answers
Where to place fences/memory barriers to guarantee a fresh read/committed writes?
Like many other people, I've always been confused by volatile reads/writes and fences. So now I'm trying to fully understand what these do.
So, a volatile read is supposed to (1) exhibit acquire-semantics and (2) guarantee that the value read is…

dcastro
12
votes
5 answers
Are volatile reads and writes atomic on Windows+VisualC?
There are a couple of questions on this site asking whether using a volatile variable for atomic / multithreaded access is possible: See here, here, or here for example.
Now, the C(++) standard-conformant answer is obviously no.
However, on Windows…

Martin Ba