3

Consider the following function.

void incr(_Atomic int *restrict ptr) {
    *ptr += 1;
}

I'll consider x86, but my question is about the language, not the semantics of any particular implementation of atomics. GCC and Clang both emit the following:

incr:
        lock add        DWORD PTR [rdi], 1
        ret

Is it possible for a conforming implementation to emit simply

incr:
        add     DWORD PTR [rdi], 1
        ret

(The same thing you get if you remove _Atomic.)

Without restrict, this would be a miscompilation because add is not atomic, so (for instance) calling incr at the same time in two threads would race. However, since the pointer is restrict-qualified, I think it's not possible for a race to occur between incr and any other access to *ptr (without causing undefined behavior anyway).

I would feel confident making this optimization manually, but no compiler I know of does so automatically. Is it correct? Or have I misunderstood restrict?

Lundin
trent
  • What does restrict have to do with this? It merely states that all access to the pointed-at object inside the scope where the pointer was declared is done through that pointer alone. There are no guarantees of what happens _outside_ that scope. I don't understand the question. – Lundin Mar 18 '21 at 17:31
  • I was playing with a related example in Rust, which has different (mostly stricter) aliasing rules around references. You can "downgrade" atomics to non-atomic if you have a unique reference, but the compiler doesn't do this optimization automatically even though it theoretically could. I just wondered if a C compiler would be allowed to do it, but obviously I misremembered the details of `restrict`, which I've never actually used in real life. – trent Mar 18 '21 at 17:52

2 Answers

3

The optimization cannot be made by the compiler, because the behavior of restrict is defined only with respect to accesses to the same object via other pointers within incr.

From this reference, with my own emphasis:

During each execution of a block in which a restricted pointer P is declared (typically each execution of a function body in which P is a function parameter), if some object that is accessible through P (directly or indirectly) is modified, by any means, then all accesses to that object (both reads and writes) in that block must occur through P (directly or indirectly), otherwise the behavior is undefined...

The restriction on the pointer only applies to accesses made from within incr, so a conforming compiler cannot conclude anything about accesses made by other threads, or by other functions that have their own pointers to the same integer (which is perfectly fine: a caller of incr might only have non-restricted pointers to it).
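To illustrate, here is a minimal single-threaded sketch. The caller code and the names demo_caller and alias are hypothetical, not from the question. The caller keeps an ordinary, non-restricted pointer to the same integer and uses it before and after calling incr. No access through alias happens while the body of incr is executing, so the restrict contract is not violated.

```c
#include <assert.h>
#include <stdatomic.h>

/* The function from the question. */
void incr(_Atomic int *restrict ptr) {
    *ptr += 1;
}

/* Hypothetical caller: it only holds non-restricted pointers to the counter. */
int demo_caller(void) {
    _Atomic int counter = 0;
    _Atomic int *alias = &counter;   /* ordinary pointer, no restrict */

    atomic_store(alias, 10);         /* access outside incr: allowed */
    incr(&counter);                  /* the restrict promise covers only this call's body */
    incr(alias);                     /* passing the alias itself is also allowed */

    assert(atomic_load(alias) == 12);
    return atomic_load(&counter);
}
```

Since incr cannot see what its callers (or other threads) hold, the compiler has nothing to base a "nobody else touches this object" conclusion on.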

The C standard, section 6.7.3.1, also defines the formal behavior of restrict with respect to the block being executed, making no claim about the lifetime of any pointees. However, I think the standard's definition is too mathematically unwieldy to be worth presenting in full here.

nanofarad
  • I suspect the OP may have been reading tutorials. I checked one and it referred to the lifetime of the pointer, not the lexical scope of the block. [Wikipedia](https://en.wikipedia.org/wiki/Restrict) also gets it wrong, maybe someone should edit it. – Barmar Mar 18 '21 at 17:24
  • @Barmar I will be happy to take a look at editing Wikipedia, but I probably won't be able to make a solid edit until tomorrow or perhaps the weekend. In particular, I'd like to familiarize myself with the expectations for citing and sourcing content there, and also read the whole article carefully to look for any other errors. – nanofarad Mar 18 '21 at 17:26
  • Unfortunately, fixing sites like [GeeksforGeeks](https://www.geeksforgeeks.org/restrict-keyword-c/) is not as easy. :) – Barmar Mar 18 '21 at 17:30
  • @Barmar Nor is it easy to fix 6.7.3.1... one of the worst written parts of the standard. They let some math muppet loose and turned it unreadable. Given how badly the standard is written, I'm not the slightest bit surprised if nobody gets it. – Lundin Mar 18 '21 at 17:42
2

Or have I misunderstood restrict?

Yeah - it makes no guarantees of what goes on outside the function where you used it. So it cannot be used for synchronization or thread-safety purposes.

The TL;DR about restrict is that *restrict ptr guarantees that no other pointer access to the object that ptr points at will be done from the block where ptr was declared. If there are other pointers or object references present ("lvalue accesses"), the compiler may assume that they aren't modifying that pointed-at restricted object.


This is pretty much just a contract between the programmer and the compiler: "Dear compiler, I promise I won't do stupid things behind your back with the object that this pointer points at". But the compiler can't really check if the programmer broke this contract, because in order to do so it may have to go check across several different translation units at once.

Example code:

#include <stdio.h>

int a=0;
int b=1;

void copy_stuff (int* restrict pa, int* restrict pb)
{
  *pa = *pb;      // assign the value 1 to a
  a=2;            // undefined behavior, the programmer broke the restrict contract
  printf("%d\n", *pa); 
}

int main()
{
  copy_stuff(&a, &b);
  printf("%d\n", a);
}

With optimization on, this prints 1 then 2 on all mainstream x86 compilers, because in the printf inside the function they are free to assume that *pa has not been modified since *pa = *pb;. Drop restrict and they will print 2 then 2, because the compiler then has to add an extra read instruction in the machine code.
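For contrast, here is a minimal sketch of a variant that keeps the contract. The names copy_stuff_ok and demo_restrict_ok are made up for this illustration. Every access to the pointed-at object inside the block goes through pa, so the behavior is defined and the result does not depend on the optimization level.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical variant of copy_stuff that honors the restrict contract:
   inside the block, the object *pa is only ever accessed through pa. */
int copy_stuff_ok(int *restrict pa, int *restrict pb)
{
  *pa = *pb;            /* copy the value of *pb into *pa */
  *pa = *pa + 1;        /* modify the object through pa itself */
  printf("%d\n", *pa);  /* the compiler may reuse the value it just stored */
  return *pa;
}

/* Hypothetical driver: x and y are distinct objects, so pa and pb do not alias. */
int demo_restrict_ok(void)
{
  int x = 0;
  int y = 1;
  int r = copy_stuff_ok(&x, &y);
  assert(x == 2 && y == 1);
  return r;
}
```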


So for restrict to make sense, there has to be some other reference to an object for it to be compared against. That's not the case in your example, so restrict serves no apparent purpose there.

As for _Atomic, it will always have to guarantee that the machine instructions in themselves are atomic, regardless of what else goes on in the program. It's not some semi-high-level "nobody else seems to be using this variable anyway". But for the keyword to make sense for multi-threading and multi-core both, it also has to act as a low-level memory barrier that keeps other memory accesses from being re-ordered around it - which, if I remember correctly, is what the x86 lock prefix does, on top of making the read-modify-write itself atomic.
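As a sketch of what the language asks for here: C11 specifies that compound assignment on an _Atomic object is a single read-modify-write with sequentially consistent ordering, so *ptr += 1 in incr can be spelled out roughly as the explicit call below. The names incr_explicit and demo_atomic are hypothetical, and the check is single-threaded, so it only shows that both spellings add 1.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical spelled-out form of `*ptr += 1` on an _Atomic int:
   one atomic read-modify-write with sequentially consistent ordering. */
void incr_explicit(_Atomic int *ptr)
{
    atomic_fetch_add_explicit(ptr, 1, memory_order_seq_cst);
}

/* Hypothetical single-threaded check that both spellings add 1. */
int demo_atomic(void)
{
    _Atomic int counter = 0;
    counter += 1;               /* compound assignment: atomic read-modify-write */
    incr_explicit(&counter);    /* explicit spelling of the same operation */
    assert(atomic_load(&counter) == 2);
    return atomic_load(&counter);
}
```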

Lundin