Here are a couple of C and POSIX functions that need to fetch some data from or put some data into a buffer and tell the caller how much, so they take a pointer to the starting buffer address and write the adjusted pointer there on return:

size_t mbstowcs(wchar_t *restrict pwcs, const char *restrict s,
                size_t n); /* for comparison */
size_t mbsrtowcs(wchar_t *restrict dst, const char **restrict src,
                 size_t len, mbstate_t *restrict ps);
size_t mbsnrtowcs(wchar_t *restrict dst, const char **restrict src,
                  size_t nmc, size_t len, mbstate_t *restrict ps);
size_t iconv(iconv_t cd,
             char **restrict inbuf, size_t *restrict inbytesleft,
             char **restrict outbuf, size_t *restrict outbytesleft);

I expect the (principal) meaning of the restrict qualifiers on the buffers is that those functions assume the input and output buffers do not overlap; and for mbstowcs (which just takes the buffer addresses by value) that indeed seems to be the case.

But why do the rest of the functions take ELEMENT **restrict pointers and not ELEMENT *restrict * or ELEMENT *restrict *restrict pointers? To me these declarations as written would imply that it is the buffer addresses, the ELEMENT *s themselves as they are stored in memory, that must not alias, which is... probably a bit helpful, yes, but not as important? These declarations make it look to me like the code

iconv_t cd = /* ... */;
char data[] = /* ... */, *inp = data, *outp = data;
size_t inn = sizeof data, outn = sizeof data;
iconv(cd, &inp, &inn, &outp, &outn);

is valid, which is surely not the intention?
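To make the three spellings concrete, here is a minimal sketch with hypothetical function names (none of them come from the C library). Each helper skips leading spaces and advances the caller's cursor; only the parameter declarations differ. The comments describe only where the qualifier sits in the declarator, not what a compiler does with it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers: each skips leading spaces in a NUL-terminated
   string and advances the caller's cursor past them. */

/* `restrict` qualifies the parameter `cur` itself, which points to a
   plain `char *`. This is the spelling used by mbsrtowcs and iconv. */
size_t skip_outer(char **restrict cur)
{
    size_t n = 0;
    assert(cur != NULL && *cur != NULL);
    while ((*cur)[n] == ' ')
        n++;
    *cur += n;
    return n;
}

/* `restrict` qualifies the `char *` that `cur` points to;
   `cur` itself is unqualified. */
size_t skip_inner(char *restrict *cur)
{
    size_t n = 0;
    assert(cur != NULL && *cur != NULL);
    while ((*cur)[n] == ' ')
        n++;
    *cur += n;
    return n;
}

/* Both levels carry the qualifier. */
size_t skip_both(char *restrict *restrict cur)
{
    size_t n = 0;
    assert(cur != NULL && *cur != NULL);
    while ((*cur)[n] == ' ')
        n++;
    *cur += n;
    return n;
}
```

All three accept `&p` for an ordinary `char *p`, since passing a `char **` where a `char *restrict *` is expected only adds a qualifier to the pointed-to type.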


In C17 subclause 6.7.3.1 (Formal definition of restrict), paragraph 4, we can find a passage that seems vaguely relevant:

[... L]et L be any lvalue that has &L based on P [a restrict-qualified pointer]. If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: T shall not be const-qualified. Every other lvalue used to access the value of X shall also have its address based on P [as defined in a preceding paragraph]. *Every access that modifies X shall be considered also to modify P* [italics mine], for the purposes of this subclause. [...]

but I cannot for the life of me figure out what the importance of that italicized sentence is.

Alex Shpilkin
  • There are no "levels" of `restrict`; it applies only to the identifier in the declarator. The words "based on" allow for using different pointer depths of a restricted pointer to get to the object `L` – M.M Jan 18 '22 at 04:18
  • @M.M Huh, you’re right, that answers my question for the most part. I didn’t recognize that the notion of *based on* is infectious this way. At the same time, Clang (in C mode) warns about discarded qualifiers if I do `int *restrict p; { int **restrict q = &p; }` and is probably right for some reason. Weird. – Alex Shpilkin Jan 18 '22 at 08:42

1 Answer

The semantics of applying a restrict qualifier to anything other than standalone objects of automatic duration whose addresses aren't taken are murky at best. The restrict qualifier offers the most value in cases where its semantics are clearest, and compilers are allowed to effectively ignore the qualifier in cases where the effort required to perform optimizations based upon it would exceed the benefit of such optimizations. Using the qualifier in cases with murky semantics and low payoff is apt to at best be a waste of time, since compilers are likely to simply ignore it, and if they don't ignore it they may interpret it in a manner contrary to programmer expectation.
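For contrast, a minimal sketch of the clear case: the restrict-qualified pointers are ordinary parameters whose addresses are never taken. The function name is hypothetical. By calling it, the caller promises that the two arrays do not overlap; a compiler may use that promise (for example to reorder or vectorize the loads and stores) but is not required to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: adds src[i] into dst[i] for i in [0, n).
   `dst` and `src` are plain automatic objects (parameters), and
   neither `&dst` nor `&src` appears anywhere in the function. */
void add_into(int *restrict dst, const int *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

Calling `add_into(a, a, n)` would break the promise, because the same objects would be modified through `dst` and read through `src`.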

supercat
  • Even more strongly, isn't it true that generally, a compiler is *not required* to do anything in particular based on the fact that a pointer P pointing to an object X has the `restrict` attribute? Because restrictedness is simply an assertion (promise) made by a programmer that object X is accessed only via pointer P? Information that the compiler *may* put to good use when optimizing (e.g. scheduling loads), if it so chooses. – njuffa Feb 07 '22 at 21:40
  • @njuffa: Compilers are free to ignore the qualifier even in cases where the effort required to perform optimizations based upon it would be minuscule compared to the payoff, but it's still generally worthwhile for programmers to include the qualifier in such cases because many compilers will process it usefully. In cases where the semantics are murky and benefits are slight, including such qualifiers in code is unlikely to have any effects, and may be just as likely to have bad effects as good. – supercat Feb 07 '22 at 22:45
  • Your overall take seems to closely match mine, +1. Now, I have closely worked with compiler engineers and discussed `restrict` and the "murkiness", as you call it, but I am **not** a compiler engineer myself. – njuffa Feb 07 '22 at 23:46
  • @njuffa: The problem with `restrict` is that for any particular base pointer P, it tries to unambiguously partition all pointers into two categories: those based on P and those not based on P, rather than including a category for pointers that cannot be reliably classified into either of the above. It's possible to have simple rules identify most pointers that are definitely based on P, and simple rules identify most pointers that are not based on P, and most useful optimizations will involve pointers of those types. The way clang and gcc interpret the rules, however, ... – supercat Feb 08 '22 at 15:42
  • ...if there exists some pointer `Q` that is not based on `P` but might be coincidentally equal to it, then within the statement ``if (P==Q) P[0]=1;``, the pointer expression used in the lvalue `P[0]` would not be based upon the `restrict` pointer `P` as it exists outside the conditional block. On the flip side, a pointer value like `R+(P==Q)` would, per the Standard's definition, be based upon `P`, though I doubt any compiler would recognize it as such. – supercat Feb 08 '22 at 15:45
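The statement from the last comment can be placed in a small function, sketched here with hypothetical names. In this sketch `q` is only compared against `p` and never dereferenced, so every access to the pointed-to `int` is written in terms of `p`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function built around `if (P==Q) P[0]=1;`. */
int store_then_compare(int *restrict p, int *q)
{
    assert(p != NULL);
    p[0] = 0;
    if (p == q)
        p[0] = 1;   /* the lvalue whose "based on" status is in question */
    return p[0];
}
```

Called with pointers to two distinct objects, the comparison is false and the function returns 0. The case the comment is concerned with is the call where `q` happens to compare equal to `p`.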