
Two months ago, I reported, as a clang++ bug, that the C++ program below sets z to 4294967295 when compiled with clang++ -O2 -fno-strict-enums.

enum e { e1, e2 } e;

long long x, y, z;

char *p;

void f(void) {
    e = (enum e) 4294967295;
    x = (long long) e;
    y = e > e1;
    z = &p[e] - p;
}

My bug report was closed as invalid because the program is undefined. My feeling was that using the option -fno-strict-enums made it defined.

As far as I know, Clang does not have documentation worthy of the name, because it aims at being compatible with GCC with respect to the options it accepts and their meaning. I read GCC's documentation of the option -fno-strict-enums as saying that the program should set the value of z to -1:

-fstrict-enums

Allow the compiler to optimize using the assumption that a value of enumerated type can only be one of the values of the enumeration (as defined in the C++ standard; basically, a value that can be represented in the minimum number of bits needed to represent all the enumerators). This assumption may not be valid if the program uses a cast to convert an arbitrary integer value to the enumerated type.

Note that only the option -fstrict-enums is documented, but it seems clear enough that -fno-strict-enums disables the compiler behavior that -fstrict-enums enables. I cannot file a bug against GCC's documentation, because generating a binary that sets z to -1, which is what I understand -fno-strict-enums to mandate, is exactly what g++ -O2 -fno-strict-enums does.

Could anyone tell me what -fno-strict-enums does in Clang (and in GCC if I have misunderstood what it does in GCC), and whether the value of the option has any effect at all anywhere in Clang?

For reference, my bug report is here and the Compiler Explorer link showing what I mean is here. The versions used as reference are Clang 10.0.1 and GCC 10.2 targeting an I32LP64 architecture.

Pascal Cuoq
  • @JaMiT I'm sorry, which uninitialized variable? As for the second potential UB, overflow in conversions is implementation-defined, so in `(enum e) 4294967295` I expect the implementation-defined behavior to be applied when converting `4294967295` to the underlying type of `enum e`. The implementation-defined behavior is wrap-around. – Pascal Cuoq Nov 07 '20 at 15:09
  • @JaMiT Variables in namespace scope (like e.g. global variables) without an explicit initializer will be "zero" initialized. – Some programmer dude Nov 07 '20 at 15:10
  • @JaMiT Well the compiler has to generate code for `f` that behaves correctly for all initial values of `p` that make the source code of `f` defined. If you insist on seeing a calling context for `f` that does not make the question meaningless, it can be `p = malloc(5000000000U); if (!p) abort(); f();` – Pascal Cuoq Nov 07 '20 at 15:14
  • @JaMiT If you have an argument explaining that the function `f` is undefined for all possible calling contexts, that would also answer my question. But I don't think that “p would have to point to an array of 4294967294 elements” is that argument, because `p` can do just that and the code generated by the compiler has to behave correctly when `p` does that. – Pascal Cuoq Nov 07 '20 at 15:24
  • @Eljay My question is “what does the clang++ option `-fno-strict-enums` do? Does it do anything?”. If you are convinced of this, perhaps you know the answer to that question? – Pascal Cuoq Nov 07 '20 at 15:25
  • `-fno-strict-enums` disables optimizations based on the strict definition of an enum’s value range. But violating ISO 14882 on the valid range of the enum is still undefined behavior. – Eljay Nov 07 '20 at 15:36
  • I've realized that my earlier comments were just echoes of the noise in the question, so I've retracted (deleted) them. I now see that the question is "what does `-fno-strict-enums` do in Clang?" and that the rant about the bug report is just background noise that should be ignored. – JaMiT Nov 07 '20 at 15:47
  • Your program is undefined because `sizeof(e)` is not required to be big enough to hold the value you're storing into it. That's why your bug was closed. – Nicol Bolas Nov 07 '20 at 15:58
  • @NicolBolas You think that [conv.integral] does not apply to a conversion to an enum type? – Pascal Cuoq Nov 07 '20 at 17:59
  • @PascalCuoq: Enumerations are not [integer types](https://timsong-cpp.github.io/cppwp/n4659/basic.fundamental#7). They can be promoted to integer types, but by themselves, they aren't integer types. And yes, [the conversion is undefined](https://timsong-cpp.github.io/cppwp/n4659/expr.static.cast#10). – Nicol Bolas Nov 07 '20 at 19:01
  • @NicolBolas Thank you for this reference, I am no good at navigating the C++ standard. – Pascal Cuoq Nov 07 '20 at 21:14
  • I erred when I provided an example of context in which the function `f` would be defined. I expect `e` to evaluate to `-1` in the expression `&p[e]`, and for this reason an example of valid context in which to call `f` is `char c; p = &c + 1; f();`. – Pascal Cuoq Nov 09 '20 at 09:51

1 Answer


The effect of -fno-strict-enums is to cancel -fstrict-enums. That is, the compiler is not allowed to optimize using the assumption that a value of enumerated type can only be one of the values of the enumeration. I would like to emphasize that the word choice is "allowed", not "required". It can be difficult to see the impact of no longer allowing something that was not done in the first place. Still, I think I've found an example where this can be seen.

First, I would like to clarify "the values of the enumeration" in the context of the question. The enumeration e has two enumerators, with the values 0 and 1. The smallest number of bits required to represent these values is 1. Thus, the values of the enumeration are all values that can be represented by 1 bit. This happens to coincide with the values of the enumerators in this case, but is not guaranteed in other examples.
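The bit-counting rule described above can be sketched in code. This is only an illustration under stated assumptions: `bits_needed` and `max_enum_value` are hypothetical helper names, not anything provided by Clang or GCC, and the sketch covers only enumerations whose enumerators are all non-negative and whose underlying type is not fixed.

```cpp
#include <cstdint>

// Hypothetical helper: smallest number of bits that can represent the
// largest enumerator (assumes all enumerators are non-negative).
constexpr unsigned bits_needed(std::uint64_t max_enumerator) {
    unsigned bits = 0;
    // Stop at 64 so the shift count never reaches the width of the type.
    while (bits < 64 && (max_enumerator >> bits) != 0) {
        ++bits;
    }
    return bits;
}

// Hypothetical helper: largest value of the enumeration, i.e. the value
// with all `bits_needed` bits set.
constexpr std::uint64_t max_enum_value(std::uint64_t max_enumerator) {
    return bits_needed(max_enumerator) == 64
               ? ~std::uint64_t{0}
               : (std::uint64_t{1} << bits_needed(max_enumerator)) - 1;
}

// enum e { e1, e2 }: the largest enumerator is 1, so 1 bit, values 0..1.
static_assert(bits_needed(1) == 1, "one bit for enumerators 0 and 1");
static_assert(max_enum_value(1) == 1, "values of the enumeration are 0..1");

// An enumeration whose largest enumerator is 2 needs 2 bits, so 3 is also
// a value of the enumeration even though no enumerator has that value.
static_assert(bits_needed(2) == 2, "two bits for a largest enumerator of 2");
static_assert(max_enum_value(2) == 3, "values of the enumeration are 0..3");
```

The second pair of assertions shows the case where the values of the enumeration and the values of the enumerators do not coincide.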

Next, let's remove one line from the question's code.

enum e { e1, e2 } e;

long long x, y, z;

char *p;

void f(void) {
    //e = (enum e) 4294967295;
    x = (long long) e;
    y = e > e1;
    z = &p[e] - p;
}

The line I removed interferes with the strict-enum flag. That flag allows the compiler to make an assumption that is not necessary when the compiler knows exactly what the value of e is. The compiler can reasonably choose to not assume that e can hold only 0 or 1 when quite clearly it was just given a different value. (This interference is not dependent upon 4294967295 being too large for a 32-bit signed integer, but merely on 4294967295 being a compile-time value. As another example, assigning (enum e) 2 to e would also cause this interference.)

Focus on the assignment y = e > e1. If -fno-strict-enums is in effect, the only optimization available is to replace e1 with 0. However, if we can assume that e can be only 0 or 1 (the values of the enumeration, which happen to also be the values of the enumerators), another optimization becomes available.

If e is 0, the following have the same value:

  • (long long) (e > e1)
  • (long long) (0 > 0)
  • (long long) false
  • (long long) e

If e is 1, the following have the same value:

  • (long long) (e > e1)
  • (long long) (1 > 0)
  • (long long) true
  • (long long) e

In either case, we can skip the comparison and simply cast e to a long long. This is reflected in the assembly generated by clang 10 for the line y = e > e1.

With -fstrict-enums

movq    %rax, y(%rip)

With -fno-strict-enums

xorl    %ecx, %ecx
testl   %eax, %eax
setg    %cl
movq    %rcx, y(%rip)

An optimization has been made with -fstrict-enums that was not allowed with -fno-strict-enums.
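The equivalence that this optimization relies on can be checked at compile time. The following is a minimal sketch, not what Clang does internally: `with_comparison` and `without_comparison` are hypothetical names standing for the two instruction sequences shown above.

```cpp
enum e { e1, e2 };

// What the source line `y = e > e1` computes: a comparison, then a widening.
constexpr long long with_comparison(e v) {
    return static_cast<long long>(v > e1);
}

// What the -fstrict-enums assembly computes: a plain widening of the value.
constexpr long long without_comparison(e v) {
    return static_cast<long long>(v);
}

// For the two values of the enumeration, both forms agree.
static_assert(with_comparison(e1) == without_comparison(e1), "agree for 0");
static_assert(with_comparison(e2) == without_comparison(e2), "agree for 1");
```

The two forms agree only for 0 and 1. For a stored value of 2, the comparison would yield 1 while the plain widening would yield 2, which is why the rewrite is valid only under the assumption that -fstrict-enums permits.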

JaMiT
  • I understand your answer to be “-fno-strict-enums does not allow programs that do `(enum e) 4294967295`, it only disables [one] optimization that relies on the program not doing this. The program is still invalid if it does this”, and this may be an accurate description of what is happening in Clang, but this is not how GCC's documentation works for `-fno-strict-aliasing`, for instance. If you take that option: again, only -fstrict-aliasing is documented, again, the documentation starts “Allow the compiler to assume the strictest aliasing rules applicable…”, but what this means is… – Pascal Cuoq Nov 07 '20 at 21:21
  • …that, at least in the case of GCC, programs that violate strict aliasing but no other rules should be translated according to the intentions of the programmer. That option would be completely unusable if it just disabled one optimization based on strict aliasing but not others. – Pascal Cuoq Nov 07 '20 at 21:22
  • @PascalCuoq No, that does not look like my answer. I wrote nothing about what programs are allowed. One reason for that is that neither `-fno-strict-enums` nor `-fstrict-enums` (nor `-fno-strict-aliasing` nor any other flag controlling which optimizations are allowed) will cause an invalid program to become valid. An optimization might cause an invalid program to behave as intended, but that is as much a matter of luck as when undefined behavior behaves as intended. – JaMiT Nov 07 '20 at 23:27
  • @PascalCuoq You earlier insisted that your question is about what `-fno-strict-enums` does. Not why your example is invalid, but what `-fno-strict-enums` does (and I complied by dropping discussion of why your code is or is not invalid). Please try to approach answers with an open mind, dropping your preconceived idea that somehow it relates to the validity of a program. – JaMiT Nov 07 '20 at 23:31
  • I can assure you that the secondary explanations of `-fno-strict-aliasing` (for that option there are plenty, unlike `-fno-strict-enums`) describe it as an option that changes the dialect accepted by the compiler, making things that are ordinarily UB defined (which a compiler is allowed to do). The developers of GCC have certainly seen these secondary sources and would have had ample time to correct the misunderstanding if it wasn't how one is supposed to understand and use `-fno-strict-aliasing`. Apart from this, I am approaching answers with an open mind and I am keen to understand. – Pascal Cuoq Nov 08 '20 at 09:04
  • What if I declare `enum e { e1 = 100, e2 = 200 };`? Will this take 1 bit or 8 bit? Can I safely use `-fstrict-enums` with this kind of enum? – Sourav Kannantha B Jan 28 '23 at 12:25
  • @SouravKannanthaB Apply your case to what I wrote: *The enumeration `e` has two enumerators, with the values **100** and **200**. The smallest number of bits required to represent these values is **8**. Thus, the values of the enumeration are all values that can be represented by **8 bits**.* (Note: "to represent" the values, not "to count" the values.) As for safety, that depends on the rest of your program. The values of your enumeration are 0 to 255; whether or not your program respects this is not something that can be known from just the enumeration's definition. – JaMiT Jan 28 '23 at 22:16