43

Say, we have

enum E
{
  Foo = 0,
  Bar = 1
};

Now, we do

enum E v = ( enum E ) 2;

And then

switch ( v )
{
  case Foo:
    doFoo();
  break;
  case Bar:
    doBar();
  break;
  default:
    // Is the compiler required to honor this?
    doOther();
  break;
}

Since the switch above handles every listed value of the enum, is the compiler allowed to optimize away the default branch, or is the behavior otherwise unspecified or undefined when the value of the enum is not in the list?

As I am expecting that the behavior should be similar for C and C++, the question is about both languages. However, if there's a difference between C and C++ for that case, it would be nice to know about it, too.

dragonroot
  • 3
    What language do you want this answered in? Enums work differently in C and C++ as far as I know. – fuz Nov 19 '15 at 19:55
  • I cannot compile `E v = ( E ) 2;` in C by MSVC. I get "error C2065: 'E' : undeclared identifier". I can only refer to `Foo` and `Bar`. – Weather Vane Nov 19 '15 at 19:55
  • 2
    @WeatherVane this is because E is not defined as a type. Either you `typedef` it or write `enum E v = 2;`. – ddz Nov 19 '15 at 19:58
  • Sorry, fixed it to compile both in C and in C++ – dragonroot Nov 19 '15 at 19:59
  • @dragonroot's amendment does now compile, thank you. – Weather Vane Nov 19 '15 at 20:02
  • 1
    If `v` is known at compile time to be 2, the compiler may even optimize the entire `switch` statement to only a `doOther()` call... – user12205 Nov 19 '15 at 20:02
  • 3
    @ace: Yeah, the interesting part would be if the `switch` was in a function compiled to a `.lib` (without link-time optimization enabled), then used from code that linked the `.lib`, so the compiler has no knowledge of the real value that will be used, just that it's provided as an argument. Anything where the compiler has complete information leads to weird optimizations that invalidate general cases. – ShadowRanger Nov 19 '15 at 20:05
  • I think at this point the question can be turned into, "is the exit status of `enum X { VALUE = 1 }; int main(void) { int result; enum X x = (enum X)2; switch (x) { case VALUE: result = 0; break; default: result = 1; break; } return result; }` defined, and if so, what is it?" –  Nov 19 '15 at 20:09
  • @dragonroot I admire your ability to make polyglots, but please specify which of C and C++ you want this question to be answered in. – fuz Nov 19 '15 at 20:11
  • Edited to be more explicit that the question is about both C and C++ since it's reasonable to expect they should be quite similar in that particular case – dragonroot Nov 19 '15 at 20:24
  • @dyp It's not a duplicate because the question you refer to uses scoped enums, and here the question uses unscoped enums, and the rules are different ! – Christophe Nov 19 '15 at 21:41
  • Similar question from this month: http://stackoverflow.com/questions/33607809/can-an-out-of-range-enum-conversion-produce-a-value-outside-the-underlying-type – M.M Nov 20 '15 at 02:23
  • @Christophe In my answer, I've tried to cover both. Anyway, I've abstained from voting to close. – dyp Nov 20 '15 at 20:08
  • Similar question: [`Is enum { a } e = 1;` valid?](https://stackoverflow.com/q/70988698/1778275). – pmor Sep 11 '22 at 20:42

6 Answers

29

C++ situation

In C++, each enum has an underlying integral type. It is fixed if it is explicitly specified (e.g. enum test2 : long { a, b };) or if the enum is scoped, in which case it defaults to int (e.g. enum class test { a, b };):

[dcl.enum]/5: Each enumeration defines a type that is different from all other types. Each enumeration also has an underlying type. (...) if not explicitly specified, the underlying type of a scoped enumeration type is int. In these cases, the underlying type is said to be fixed.

In the case of an unscoped enum whose underlying type is not explicitly fixed (your example), the standard gives your compiler more flexibility:

[dcl.enum]/7: For an enumeration whose underlying type is not fixed, the underlying type is an integral type that can represent all the enumerator values defined in the enumeration. (...) It is implementation-defined which integral type is used as the underlying type except that the underlying type shall not be larger than int unless the value of an enumerator cannot fit in an int or unsigned int.
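The two quoted paragraphs can be checked at compile time. The following is only a sketch: the enum names `Plain`, `Wide`, and `Scoped` and the helper `plainUnderlyingSize` are made up for illustration, and it assumes a C++11 (or later) compiler.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical enums, one per case described above.
enum Plain { PlainA, PlainB };        // unscoped, underlying type not fixed
enum Wide : long { WideA, WideB };    // unscoped, underlying type fixed by the enum-base
enum class Scoped { A, B };           // scoped, underlying type fixed to int by default

// Fixed underlying types are guaranteed by the standard.
static_assert(std::is_same<std::underlying_type<Wide>::type, long>::value,
              "an enum-base fixes the underlying type");
static_assert(std::is_same<std::underlying_type<Scoped>::type, int>::value,
              "a scoped enum defaults to int");

// For Plain, the exact type is implementation-defined; only its
// integral nature and the size bound can be relied upon.
static_assert(std::is_integral<std::underlying_type<Plain>::type>::value,
              "the underlying type is an integral type");

// Hypothetical helper: size in bytes of the type the compiler picked for Plain.
constexpr std::size_t plainUnderlyingSize() {
    return sizeof(std::underlying_type<Plain>::type);
}
static_assert(plainUnderlyingSize() <= sizeof(int),
              "not larger than int when all enumerators fit in an int");
```

The value returned by `plainUnderlyingSize()` may differ between compilers and compiler flags, which is exactly the flexibility the quote describes.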

Now a very tricky thing: the values that can be held by an enum variable depend on whether or not the underlying type is fixed:

  • if it's fixed, "the values of the enumeration are the values of the underlying type."

  • otherwise, they are the integral values between the minimum and the maximum of the smallest bit-field that can hold both the smallest and the largest enumerator.

You are in the second case. Although your code will work on most compilers, the smallest such bit-field has a size of 1 bit, so the only values that are guaranteed to be representable on all conforming C++ compilers are 0 and 1.
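To make the bit-field rule concrete, here is a rough sketch that computes the upper end of the range for an enum without a fixed underlying type. It is a simplification that only covers the case where all enumerators are non-negative, the function name `enumRangeMax` is hypothetical, and it assumes C++14 or later because of the loop inside a `constexpr` function.

```cpp
// Hypothetical helper: given the largest enumerator value emax (with all
// enumerators non-negative), return the largest value of the enumeration's
// range, i.e. the smallest number of the form 2^M - 1 that is >= emax.
constexpr unsigned long long enumRangeMax(unsigned long long emax) {
    unsigned long long bmax = 0;      // 2^0 - 1
    while (bmax < emax) {
        bmax = bmax * 2 + 1;          // next value of the form 2^M - 1
    }
    return bmax;
}

// enum E { Foo = 0, Bar = 1 }: the range is 0..1, so 2 is outside it.
static_assert(enumRangeMax(1) == 1, "one bit holds 0 and 1");

// An enum whose largest enumerator is 5 needs three bits: range 0..7.
static_assert(enumRangeMax(5) == 7, "three bits hold 0 through 7");
```

With enumerators 0 and 1 the result is 1, which is why the value 2 is out of range for the enum in the question.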

Conclusion: If you want to ensure that the value can be set to 2, you either have to make your enum a scoped enum or explicitly specify an underlying type.
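A minimal sketch of that conclusion, assuming C++14 or later (for the `switch` inside a `constexpr` function). The function `classify` and its return codes are hypothetical stand-ins for doFoo(), doBar(), and doOther() from the question.

```cpp
// Underlying type fixed to int: every int value is a valid value of E.
enum E : int { Foo = 0, Bar = 1 };

// Hypothetical classifier: 0 for Foo, 1 for Bar, -1 for anything else.
constexpr int classify(E v) {
    switch (v) {
        case Foo: return 0;
        case Bar: return 1;
        default:  return -1;   // reachable, e.g. for static_cast<E>(2)
    }
}

// The cast is well-defined here because 2 is within the range of int.
static_assert(classify(static_cast<E>(2)) == -1, "default branch is taken");
```

Because the conversion is evaluated inside a constant expression, a compiler would have to reject it if it were undefined; with the fixed underlying type it is accepted and the default branch is taken.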

C situation

The C situation is much simpler (C11):

6.2.5/16: An enumeration comprises a set of named integer constant values. Each distinct enumeration constitutes a different enumerated type.

So basically, it is an int:

6.7.2.2/2: The expression that defines the value of an enumeration constant shall be an integer constant expression that has a value representable as an int.

With the following restriction:

6.7.2.2/4: Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration.

Sammy Taylor
Christophe
  • 2
    Your explanation of the C situation may be a little misleading. 6.7.2.2/2 doesn't require that it *be* an int, merely that the values all be representable as ints. That is to say, `INT_MIN <= n <= INT_MAX` for all values `n` in the enum. 6.7.2.2/4 isn't a further restriction; it rather permits the implementation to use any of the integer types ( (signed or unsigned) char, int, long, or long int (6.2.5/4-7) ) as long as they're able to represent all the enum values. – Ray Nov 20 '15 at 02:23
  • 1
    Also, your answer doesn't mention what the behaviour of OP's code is: `( enum E)2` in fact causes undefined behaviour in C++, as resolved by [defect 1766](http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1766). The C++14 standard had different wording but that wording didn't make sense, so the committee resolved that it is actually undefined (and defect reports apply retroactively). – M.M Nov 20 '15 at 02:34
  • @M.M thanks for highlighting the ambiguous wording. I've reworded the sentence. – Christophe Nov 20 '15 at 03:40
  • @Ray you are of course right: the wording of the last quote leaves the compiler free to choose a smaller integral type such as char, provided it can hold all the enumerated values (that is what I meant by "restriction"). It can also allow larger types such as long, but according to 6.7.2.2 the constant expression must be representable as an int – Christophe Nov 20 '15 at 03:54
4

In C an enum type is an integer type large enough to hold all the enum constants:

(C11, 6.7.2.2p4) "Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration".

Let's say the selected type for enum E is _Bool. A _Bool object can only store the values 0 and 1. It's not possible to have a _Bool object storing a value different than 0 or 1 without invoking undefined behavior.

In that case the compiler is allowed to assume that an object of type enum E can only hold 0 or 1 in a strictly conforming program, and is therefore allowed to optimize out the default switch case.

ouah
  • @Deduplicator C11 specifies the meaning of type compatibility in 6.2.7 but basically type compatibility implies the types can be used interchangeably (e.g, `signed int` and `int`) – ouah Nov 19 '15 at 20:28
  • Missing link: `_Bool` is an "unsigned integer type" in C. (`bool` isn't, in C++.) – T.C. Nov 20 '15 at 01:52
  • The idea behind the guarantee (in C, and for non-fixed C++) is that any combination of enumerators with `|` produces a valid value – M.M Nov 20 '15 at 02:29
0

In C, enumerators have type int. Thus any integer value can be assigned to an object of the enumeration type.

From the C Standard (6.7.2.2 Enumeration specifiers)

3 The identifiers in an enumerator list are declared as constants that have type int and may appear wherever such are permitted.

In C++, each enumerator has the type of the enumeration that defines it. In C++ you either specify the underlying type explicitly, or the compiler itself determines the maximum allowed value.

From the C++ Standard (7.2 Enumeration declarations)

5 Each enumeration defines a type that is different from all other types. Each enumeration also has an underlying type. The underlying type can be explicitly specified using enum-base; if not explicitly specified, the underlying type of a scoped enumeration type is int. In these cases, the underlying type is said to be fixed. Following the closing brace of an enum-specifier, each enumerator has the type of its enumeration.

Thus in C, any integer value is a possible value of an enum. The compiler may not optimize a switch by removing the default label.

Vlad from Moscow
  • 3
    The identifiers are some form of int, as they are constants, but the enum type itself is not an `int`. I think that's the crux. – MicroVirus Nov 19 '15 at 20:07
0

C++Std 7.2.7 [dcl.enum]:

It is possible to define an enumeration that has values not defined by any of its enumerators.

So, you can have enumeration values which are not listed in enumerator list.

But in your specific case, the 'underlying type' is not 'fixed' (7.2.5). The specification doesn't say which is the underlying type in that case, but it must be integral. Since char is the smallest such type, we can conclude that there are other values of the enum which are not specified in the enumerator list.

Btw, I think that the compiler can optimize your case when it can determine that there are no other values ever assigned to v, which is safe, but I think there are no compilers which are that smart yet.

Deduplicator
Kevin C
  • I thought this was a question about the standard, and I just quickly glanced to see that no one else is quoting it. And without the C++ standard you cannot really answer this question. – Kevin C Nov 19 '15 at 20:21
  • 2
    @Deduplicator [cppreference on enums](http://en.cppreference.com/w/cpp/language/enum) says: *Values of integer (..) can be converted, such as by static_cast, to any enumeration type. The result is unspecified (until C++17) or undefined behavior (since C++17) if the value, converted to the enumeration's underlying type, is out of this enumeration's range. If the underlying type is fixed, the range is the range of the underlying type. If the underlying type is not fixed, the range is all values possible for the smallest bit field large enough to hold all enumerators of the target enumeration.* – MicroVirus Nov 19 '15 at 20:21
  • Is this correct, the part about the bit field being the formal range? – MicroVirus Nov 19 '15 at 20:21
  • @MicroVirus: Not sure whether unspecified or undefined, but the full paragraph supports that. – Deduplicator Nov 19 '15 at 20:25
  • @Deduplicator Well if it's correct what's written then assigning `2` in this example invokes unspecified/undefined behaviour, as far as I can tell. – MicroVirus Nov 19 '15 at 20:29
  • From C++ standard draft N3291 (C++11), footnote 93: "This set of values is used to define promotion and conversion semantics for the enumeration type. It does not preclude an expression of enumeration type from having a value that falls outside this range." – Kevin C Nov 19 '15 at 20:41
  • I think it's best answered here, courtesy of dyp, [What happens if you static_cast invalid value to enum class?](http://stackoverflow.com/questions/18195312/what-happens-if-you-static-cast-invalid-value-to-enum-class) for C++ and that would mean that assigning `2` would be unspecified/undefined behaviour. – MicroVirus Nov 19 '15 at 20:53
  • I think that static cast has nothing to do with it. What if you cast a pointer to int into pointer to enum type? – Kevin C Nov 19 '15 at 20:55
  • @KevinC If you deref that pointer, you're in UBland in C++. There's no exception for aliasing between an enumeration type and its underlying type in the strict aliasing rule. – dyp Nov 19 '15 at 20:58
  • Ok, that might be right, but now I'm confused. In one place the standard (in footnote 93) explicitly explains that enums can have other values, but in reality I can't figure out a way to give them those values. So now I'm not sure anymore. – Kevin C Nov 19 '15 at 21:01
  • Does simply incrementing an enum work? – Kevin C Nov 19 '15 at 21:02
  • I'm not sure either what that footnote is referring to. It might well refer to *the enumerators* being *the set of values*. – dyp Nov 19 '15 at 21:03
  • @KevinC No. Increment/decrement require its operand to be a modifiable lvalue of arithmetic or pointer ("to completely-defined object"-)type. While enumeration values can be converted to an arithmetic type (like `int`), the result is a prvalue, not an lvalue. – dyp Nov 19 '15 at 21:05
  • @KevinC http://wg21.cmeerw.net/cwg/issue1094 introduced that footnote. I think it supports my interpretation. See also: http://stackoverflow.com/a/27632572/ which points to http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_defects.html#1766 – dyp Nov 19 '15 at 21:12
  • Ok, but in [expr.static.cast]/10 it says that the conversion is unspecified if the value does not fall within the range. So, that unspecified value does not have to be from enumerator list. Note: I'm still on C++11 . – Kevin C Nov 19 '15 at 21:13
  • @KevinC Generally `switch` statements are required to bounds check their inputs, so getting rid of the default case wouldn't be much of an optimization. This is one of the reasons GCC has a computed `goto` extension. – Jason Nov 19 '15 at 21:15
  • @KevinC Yes, a conversion from a value outside the representable range of the enum to the enum type can produce a value that is different from any enumerator. If I'm interpreting this rule correctly, it can (also) produce a value that is outside the *range of the enumeration*, and this might invoke UB via [expr]p4. In any case, since CWG 1766 is a defect, compiler implementers might retroactively implement it in their "C++11" modes. – dyp Nov 19 '15 at 21:20
  • Ok, so perhaps the best answer is: it depends on the version of the standard. In original C++11, the original answer is correct, but if the defect 1766 is repaired, then the answer might not be correct (and frankly, I'm lost on any further implications). – Kevin C Nov 19 '15 at 21:27
  • This is not correct. Firstly, in this specific case the underlying type could be `bool`. But, more generally, even if the underlying type is `int`, the standard still specifies that if the underlying type is not fixed, the valid values for the enum may be a subset of the range of the underlying type. – M.M Nov 20 '15 at 02:31
0

Also, 7.2/10:

An expression of arithmetic or enumeration type can be converted to an enumeration type explicitly. The value is unchanged if it is in the range of enumeration values of the enumeration type; otherwise the resulting enumeration value is unspecified.
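One way to stay clear of the unspecified case is to validate the integer before converting it. This is only an illustrative sketch: the helper `toE` and its fallback parameter are hypothetical, and it simply treats the listed enumerators as the only acceptable values, which is stricter than the standard's notion of range.

```cpp
enum E { Foo = 0, Bar = 1 };

// Hypothetical helper: convert an int to E only if it matches a listed
// enumerator; otherwise return the caller-supplied fallback. The cast is
// never evaluated for a value that is not a listed enumerator.
constexpr E toE(int value, E fallback) {
    return (value == Foo || value == Bar) ? static_cast<E>(value) : fallback;
}

static_assert(toE(1, Foo) == Bar, "1 maps to Bar");
static_assert(toE(2, Foo) == Foo, "2 is rejected and the fallback is used");
```

With a check like this, code that switches on the result never sees a value outside the enumerator list, so the question of what the default branch does for such values does not arise.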

Paul
  • Then the question is: what is the range of an enumeration type? – MicroVirus Nov 19 '15 at 20:16
  • 7.2/2: An enumerator-definition with = gives the associated enumerator the value indicated by the constant-expression. The constant-expression shall be an integral constant expression (5.19). If the first enumerator has no initializer, the value of the corresponding constant is zero. An enumerator-definition without an initializer gives the enumerator the value obtained by increasing the value of the previous enumerator by one. So in case of `enum E { Foo = 0, Bar = 1 };` 2 is not in the range of enumeration values. – Paul Nov 19 '15 at 20:21
-1

In C and C++, this can work.

Same code for both:

#include <stdio.h>

enum E
{
  Foo = 0,
  Bar = 1
};

int main()
{
    enum E v = (enum E)2;    // the cast is required for C++, but not for C
    printf("v = %d\n", v);
    switch (v) {
    case Foo:
        printf("got foo\n");
        break;
    case Bar:
        printf("got bar\n");
        break;
    default:
        printf("got default\n");
        break;
    }
}

Same output for both:

v = 2
got default

In C, an enum is an integral type, so you can assign an integer value to it without casting. In C++, an enum is its own type.

dbush
  • 4
    Is this a language guarantee (in either C or C++)? Seems reasonable for the compiler to drop the `default` case if a `switch` handles all legal `case`s for an `enum`. The fact that any given compiler doesn't (or doesn't with specific flags) isn't useful for portability; I suspect this is undefined behavior, so the compiler can do whatever it wants. Might the use of C++11 strongly typed `enum`s affect this by giving better guarantees (assuming the weakly typed `enum` of C and C++ is allowed to hold undeclared values)? – ShadowRanger Nov 19 '15 at 19:59
  • As some of the other answers point out, but not very clearly, the compiler can optimize memory (more or less) by using the minimum byte width needed to represent the enumerated values. I ran into this 10 years ago when cross-compiling PalmOS code using GCC. For example, enumerated values in the range 0-255 only needed one byte of storage. Assigning a value of 256 to a variable of that enumerated type would obviously not work; I never tried it, so I don't know what the result would be. (My memory's hazy, but I think I had to disable the variable-width enumerations for the PalmOS code.) – Alex Measday Nov 20 '15 at 04:43