0

Could someone please tell me whether such a construction is valid (i.e. not UB) in C++? I get segfaults because of it and have spent a couple of days trying to figure out what is going on.

// Synthetic example
#include <iostream>

int main(int argc, char** argv)
{
    int array[2] = {99, 99};
    /*
      The point is here. Is it legal? Does it have defined behaviour?
      Will it increment first and then access the element, or vice versa?
    */
    std::cout << array[argc += 7]; // Use argc just to avoid some optimisations
}

So, of course, I did some analysis: both GCC (5/7) and Clang (3.8) generate the same code. First add, then access.

Clang (3.8): clang++ -O3 -S test.cpp

    leal    7(%rdi), %ebx
    movl    .L_ZZ4mainE5array+28(,%rax,4), %esi
    movl    $_ZSt4cout, %edi
    callq   _ZNSolsEi
    movl    $.L.str, %esi
    movl    $1, %edx
    movq    %rax, %rdi
    callq   _ZSt16__ostream_insertIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_PKS3_l

GCC (5/7): g++-7 -O3 -S test.cpp

    leal    7(%rdi), %ebx
    movl    $_ZSt4cout, %edi
    subq    $16, %rsp
    .cfi_def_cfa_offset 32
    movq    %fs:40, %rax
    movq    %rax, 8(%rsp)
    xorl    %eax, %eax
    movabsq $425201762403, %rax
    movq    %rax, (%rsp)
    movslq  %ebx, %rax
    movl    (%rsp,%rax,4), %esi
    call    _ZNSolsEi
    movl    $.LC0, %esi
    movq    %rax, %rdi
    call    _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
    movl    %ebx, %esi

So, can I assume that such behaviour is standard?

Some programmer dude
Dmitry
  • If you ask a question about C++ and UB, then don't tag other languages. C is a totally different language with other semantic rules and other points of UB. – Some programmer dude Jan 15 '19 at 13:02
  • As for your question: unless `argc` is `-6` or `-7`, you will use an out-of-bounds index, and that is of course UB. The expression used for the index must be fully evaluated first, and in C++ all variants of assignment are simple expressions. – Some programmer dude Jan 15 '19 at 13:04
  • My question is whether a[i+=N] will always increment i first. – Dmitry Jan 15 '19 at 13:05
  • @Dmitry Yes, of course; problems occur only if you use argc elsewhere in the same expression, because the order of evaluation of sub-expressions within an expression (and hence the propagation of side effects) is usually not defined. – Peter - Reinstate Monica Jan 15 '19 at 13:07
  • @Dmitry Did you perhaps think of the semantics of `argc++`? That expression would have the original, *pre-*increment value as an index into argv; the new value would only be visible in the next statement. But the expression `argc += 1` (or `+= 7`) has the value of `argc` *after* the increment. – Peter - Reinstate Monica Jan 15 '19 at 13:16

4 Answers

4

In the case of a[i+=N], the expression i += N is always evaluated before the element is accessed. But the example that you provided invokes UB: your array contains only two elements, so you are accessing it out of bounds.
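
As a rough sketch of that ordering with an index that stays in bounds (the helper name `read_after_add` and the ten-element array are made up for illustration, they are not from the question):

```cpp
#include <cassert>

// Hypothetical helper: adds n to i, then reads a[i].
// The value of (i += n) is the updated i, so the subscript sees the new value.
inline int read_after_add(const int (&a)[10], int& i, int n) {
    assert(i + n >= 0 && i + n < 10);  // the caller must keep the index in bounds
    return a[i += n];
}
```

Calling it with `i == 1` and `n == 7` on an array holding 0, 10, 20, ... reads element 8 and leaves `i` at 8. Note that the `assert` is only a debugging aid; it disappears when `NDEBUG` is defined.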

taskinoor
  • Provided argc is smaller than -7 or larger than -6 (but yes, that is likely the problem). – Peter - Reinstate Monica Jan 15 '19 at 13:08
  • @PeterA.Schneider `argc` shouldn't be negative unless it is modified elsewhere inside `main`. – taskinoor Jan 15 '19 at 13:10
  • You've got a point -- it is set by the runtime and I wouldn't know offhand how to make it appear negative to `main`. Perhaps by supplying INT_MAX+7 command line arguments, but I am afraid there is a limit to the command line length somewhere before that. – Peter - Reinstate Monica Jan 15 '19 at 13:18
4

By itself, array[argc += 7] is OK: the value of argc after adding 7 is used as the index into array.

However, in your example array has just 2 elements, and argc is never negative, so your code will always result in UB due to an out-of-bounds array access.
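
If the goal is to catch this kind of mistake instead of reading past the end, one option is a bounds-checked container. This is only a sketch under the assumption that switching to `std::array` is acceptable; the helper name `checked_read` is hypothetical:

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: same "add, then index" pattern as array[argc += 7],
// but std::array::at throws std::out_of_range for an invalid index.
// A negative result converts to a huge std::size_t, so it is rejected too.
inline int checked_read(const std::array<int, 2>& a, int& i, int n) {
    return a.at(static_cast<std::size_t>(i += n));
}
```

With a two-element array, `checked_read(a, i, 7)` only succeeds when `i` is -7 or -6 beforehand; for any non-negative `i` it throws, which is easier to debug than a segfault.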

rustyx
2

Your case is clearly undefined behaviour, since you will exceed array bounds for the following reasons:

First, the expression array[argc += 7] is equivalent to *((array) + (argc += 7)), and the values of the operands are computed before + is evaluated (cf. here). Operator += is an assignment, and the value of an assignment expression is the value of argc (in this case) after the assignment (cf. here). Hence, the += 7 takes effect before subscripting.

Second, argc is defined in C++ to be never negative (cf. here), so argc += 7 will always be >= 7 (or a signed integer overflow in very unrealistic scenarios, which is still UB).

Hence, UB.
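
To illustrate the point about the value of an assignment, here is a small compile-time sketch, assuming C++14 or later (needed to modify a parameter inside a `constexpr` function). The function names are made up for this example:

```cpp
// Hypothetical helpers that return the value of each index expression.
constexpr int value_of_post_increment(int i) { return i++; }            // old value of i
constexpr int value_of_compound_add(int i, int n) { return i += n; }    // new value of i

static_assert(value_of_post_increment(1) == 1,
              "i++ evaluates to the value before the increment");
static_assert(value_of_compound_add(1, 7) == 8,
              "i += 7 evaluates to the value after the assignment");
```

So with argc == 1, array[argc++] would use index 1, while array[argc += 7] uses index 8.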

Stephan Lechner
1

It's normal behavior. In an expression like this, the name of an array is converted to a pointer to its first element, and array[n] is the same as *(array + n).
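
A short sketch of that equivalence, with the caveat that the array itself still has array type and is only converted to a pointer when used in such an expression. The array `sample` is hypothetical data for illustration:

```cpp
#include <type_traits>

constexpr int sample[3] = {10, 20, 30};  // hypothetical data

// The object itself is an array, not a pointer ...
static_assert(std::is_array<decltype(sample)>::value, "sample has array type");
static_assert(sizeof(sample) == 3 * sizeof(int), "sizeof sees the whole array");

// ... but subscripting works on the converted pointer: a[n] is *(a + n).
static_assert(sample[2] == *(sample + 2), "a[n] is the same as *(a + n)");
static_assert(&sample[1] == sample + 1, "both name the same element");
```

None of this makes an out-of-range n valid: *(array + n) is only defined when n refers to an existing element.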