9

In this Code Golf post, there is the claim that “the second variable in a definition is always set to 1”, making this a well-formed line:

int i=-1,c,o,w,b,e=b=w=o=c;

And supposedly everything except i is set to 1 because c is automatically 1.

I thought I knew some C, and I thought this was illegal (undefined behavior that, if anything, yields random stack contents).

Does C really set c to 1?

Felix Dombek
  • 1
    I really don't get this, `c` is assigned uninitialized to the rest of variables, I can be wrong but it seems undefined behavior. – David Ranieri Jan 30 '17 at 21:52
  • 3
    According to [this](http://meta.codegolf.stackexchange.com/questions/5486/is-an-answer-allowed-to-use-undefined-but-consistent-behaviour), consistent, undefined behavior is fine for code golfing, so maybe the creator of the program exploited that. – cadaniluk Jan 30 '17 at 21:54
  • If `i` has automatic storage duration, this is well-formed. If it's static storage duration, it's malformed because constant expressions are required to initialize the variables. – cadaniluk Jan 30 '17 at 21:55
  • 2
    Doesn't make sense to me either. [printing out c](https://ideone.com/HMjdnJ) shows it isn't 1 (happens to be 0 when I run it). Unless the coder was relying on some stack trickery that was disturbed by the `printf`. Best if you post a comment on the original answer to ask. And come back to answer your own question :-) – kaylum Jan 30 '17 at 21:55
  • 1
    The golfed program runs fine for me, even with `c` being `0`, not `1`. I can only conclude that, without delving too deep into the semantics of the program, the important thing is that the variables are all equal, not that they are all `1`. The creator happened to experience `c` being `1` on his machine, assumed that to be either consistent with the standard or to be consistent undefined behavior on ideone.com at least, and wrote the thing in good conscience. – cadaniluk Jan 30 '17 at 22:08
  • @Downvoter that's what I thought too. – Felix Dombek Jan 30 '17 at 22:09
  • @Downvoter OP here, it is consistent it seems, I just looped it on many IDEs and it always outputs 0. Unfortunately I need it outputting 1. It seems both work for now but my logic technically was created around them all being set at 1. I didn't catch the bug because my IDE seemed to consistently output that. Anyways I've edited my codegolf post to include the =1 now which is what I was actually originally trying to type (wasn't trying to exploit undefined behavior haha) Thanks everyone – Albert Renshaw Jan 30 '17 at 22:51

3 Answers

5

This code exhibits undefined behavior.

The variables c, o, b, and w are uninitialized. That means their contents are indeterminate.

From section 6.7.9 of the C standard:

10 If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

The indeterminate value of c is then assigned to several other variables. By reading the value of an uninitialized variable, the code invokes undefined behavior.

The initial value of c could be 1, but if so it's not a predictable value.

Also note that the above declaration contains both initialization (for i and e) and assignment (for o, w, and b, inside the initializer of e), so it won't compile at file scope, where initializers must be constant expressions.

I attempted to run the function in the linked post and it didn't pass the first test input. Undefined behavior.

dbush
5

I'm the OP from CodeGolf. It seems I simply had a typo; I meant to write int i=-1,c,o,w,b,e=b=w=o=c=1; so that the second defined int is always set to 1 and the others can be set to it. The confusion is that I originally had the variable that comes next (L=3) as just l (undefined) and I was setting all of the other variables with e=b=w=o=c=(L=3); which in my mind was going to set L equal to 3, return true for that (1), then set the rest to 1.

A few tests later I realized this was just setting them all to 3 and only worked with the specific string I was using to test my code. So I deleted them and changed it to just be L=3 hard coded and the others to be e=b=w=o=c=1;L=3. At some point I must have pressed cmd+z one too many times and removed the "=" and the "1", so I was just left with e=b=w=o=c;. Due to the consistent undefined nature of this (at least on my IDE) it was always defining them as 0, and therefore the bug went unnoticed.

Now that I've corrected it back, thanks to this post, the byte lengths are the same and there was no need for any of this tricky e=b=w=o=c=1 code anyway. I only thought the byte length was different because when I copy-pasted my function into a byte counter it showed it was 2 bytes smaller (I didn't know I just had a typo and was missing 2 bytes).

My IDE is always defining those variables as 0. My code is designed to work with all of the variables being defined as 1, the fact that it works w/ 0 is coincidence. Also just because it happens on my IDE doesn't mean it will on others, though I have tested it on a few IDEs now online and run many loops and it does seem to always return 0. In any event, I've still updated my original code to set them to 1 as it should be (adding 2 bytes to my program).

Thanks for everyone's input

Albert Renshaw
  • 3
    With this kind of questions you expect an answer along the lines of "This is UB, burn the one who wrote this code on the altar of standard C++" or "This is an amazing trick, revealing a dark corner of C++". And then you came and were like "Hi guys, so... yeah, I made a CTRL-Z mistake..." which is... pretty anticlimactic... oh well. it happens :) – bolov Jan 30 '17 at 23:59
2

The C standard has no magic rule that sets the second int object to 1. In fact, its value is indeterminate, so the code unconditionally invokes UB.

C11 § 6.3.2.1/2 Lvalues, arrays, and function designators

If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

But let's assume otherwise for a short moment. Here is just one example of the assembly generated by GCC 6.3 for the x86-64 architecture, with optimization turned off and the SysV ABI calling convention:

    mov     DWORD PTR [rbp-4], -1
    mov     eax, DWORD PTR [rbp-8]     ; ???
    mov     DWORD PTR [rbp-12], eax 
    mov     eax, DWORD PTR [rbp-12]
    mov     DWORD PTR [rbp-16], eax
    mov     eax, DWORD PTR [rbp-16]
    mov     DWORD PTR [rbp-20], eax
    mov     eax, DWORD PTR [rbp-20]
    mov     DWORD PTR [rbp-24], eax

As far as the compiler is concerned, there are no guarantees either. The variable c is located in the current stack frame at offset RBP-8. Its initial value is whatever was previously stored at that location on the stack.

Grzegorz Szpetkowski
  • 1
    That's _one_ possibility of undefined behaviour. But it's undefined, so your program can crash, or all variables are set to different values, or all variables stay undefined with more fun later in your program. – gnasher729 Jan 30 '17 at 22:56