-2

In C, given

char arr [10];
uint32_t i = -1;
char * p = &arr [1];

does (p + i) overflow / invoke undefined behaviour, or does it equal &arr[0]? Why are the pointer arithmetic rules in the C standard (6.5.6.8) so confusing?

The language allows a pointer to be combined with any integer type via the +, -, += and -= operators. What happens when a negative integer value is added to a pointer? What if the representation of the pointer is 4 bytes but the integer operand is int64_t?

The C99 standard defines pointer arithmetic in terms of array subscripts (6.5.6.8). According to my understanding, it states that for:

char * ptr = …;
char * new_ptr = ptr + int_expr;

assert( (new_ptr - ptr) == (int_expr) );

what's the reason for such an obscure, indirect definition?

alk
  • 69,737
  • 10
  • 105
  • 255
zhaorufei
  • 2,045
  • 19
  • 18
  • 4
    Are you sure you want `i` to be `unsigned`? It would not be able to hold negative numbers though. – alk Feb 24 '18 at 12:05
  • 1
    What exactly is confusing in the rules? – Gerhardh Feb 24 '18 at 12:05
  • 1
    Yes, it wraps, and the value will be much greater than 10, so p+i will refer to memory outside the array. Accessing it is undefined behavior. I can't elaborate more now because of my device. It is not equal to &arr[0]. – user2736738 Feb 24 '18 at 12:10
  • Very much related to https://stackoverflow.com/questions/3473675/are-negative-array-indexes-allowed-in-c – Ilja Everilä Feb 24 '18 at 12:16
  • Thanks very much. I also have a fuzzy feeling about it, even after reading the C99 standard, the new C standard, Pointers on C, Deep C Secrets, etc., but I have not found a resource that clearly explains it. To confuse me further, clang -Weverything -std=c99 -pedantic does NOT produce any warning about the above code. I'm looking for more details. – zhaorufei Feb 24 '18 at 12:28
  • @Gerhardh Question updated, thanks – zhaorufei Feb 24 '18 at 12:28
  • @alk I just want index to be very large, that's another way of 'uint32_t i = 0xFFFFFFFF' – zhaorufei Feb 24 '18 at 12:28
  • 2
    @zhaorufei But it is not mandatory for the compiler to prevent undefined behavior (this is almost impossible or would at least produce very defensive code that would lead to very poor performance). Some advanced code verifying tools may analyse such problems, but in general compilers do not. This is the same for integer overflow for example. – Jean-Baptiste Yunès Feb 24 '18 at 12:51
  • 1
    "*obscured, indirect definition*" what do you feel is "*obscured*" and/or "*indirect*"? – alk Feb 24 '18 at 12:59
  • If `a = b + c` then also "usual" arithmetic mandates `a - c == b`. – alk Feb 24 '18 at 13:00
  • "What if the representation of the pointer is 4 bytes but the integer operand is int64_t?" changes nothing. `p + some_big_value` points well outside the range of `char arr [10];` regardless of the _type_ of `some_big_value` or representation of `p`. This is _undefined behavior_. – chux - Reinstate Monica Feb 24 '18 at 13:40

4 Answers

5

Assigning -1 to a uint32_t converts it to UINT32_MAX (which is 4294967295) by modulo reduction, per 6.2.5p9.

So your code is equivalent to:

char arr [10];
uint32_t i = UINT32_MAX;
char * p = &arr [1];

p points to the second element in the array arr. So p+i, i.e., p + 4294967295, yields a pointer that is certainly not within the array object. So it'd be undefined behaviour.

If you change the type of i to int32_t for example, then it can hold the negative value (as you might have intended in the first place). p + i, i.e., p - 1, would yield a pointer to the first element in the array arr (equivalent to &arr[0]). There's no undefined behaviour because the resulting pointer p + i (== &arr[0]) is still pointing within the array object and is perfectly valid.

P.P
  • 117,907
  • 20
  • 175
  • 238
0

Yes it will overflow and no it won't be equal to &arr[0].

Because the variable i is of type uint32_t, it doesn't actually hold the value -1 but the very large number 4294967295, which is 11111111 11111111 11111111 11111111 in binary or 0xFFFFFFFF in hexadecimal.

If you change the type of i to something like int, then i will hold the value -1 and (p+i) will refer to arr[0].

nabil.douss
  • 634
  • 4
  • 10
0

Given your example with i being an unsigned data type, you will definitely point outside your array arr, as -1 is treated as 0xFFFFFFFF. The confusing part here is probably not the pointer arithmetic but the wrap-around during the type conversion of your index variable.


On the other hand, using a signed data type for i, you would be on the safe side:

Pointer arithmetic is safe as long as you are within the bounds of one data object. You may also point one element after the last element of an array.

In C it is completely the same whether you write *(arr+i) or arr[i]. This means your example

char * p = &arr [1];

is the same as

char * p = arr+1;

And from this you can derive that p + (-1) is equal to arr+1-1 == arr, which is equal to &arr[0] and points perfectly well within the bounds of that array.

Gerhardh
  • 11,688
  • 4
  • 17
  • 39
0

You are referring to:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression.

This is not confusing; it is just the definition of pointer arithmetic. It should be read as:

E array[N];
assert( (&array[X]+D) == &array[X+D] );

provided that X and X+D are both in [0,N] (you can point one past the last element).

D can be any integer expression. In your case it has an unsigned integer type (-1 as uint32_t is UINT32_MAX), so the behavior is undefined, as the result is out of the bounds of the array (1 + UINT32_MAX > 10).

If you had used int32_t, then the result would have pointed to the first element of the array:

char array[10];
assert( (&array[1]-1) == &array[0] );
Jean-Baptiste Yunès
  • 34,548
  • 4
  • 48
  • 69