In other words, when doing
index = &array[x] - &array[0];
is it always guaranteed (per the C standard) that &array[0] <= &array[x], or does it depend on the compiler? Which chapters of the C standard are relevant to this topic?
The address ordering is guaranteed. The behaviour of relational operators is defined in C11 6.5.8p5:
[...] pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. [...]
Thus &array[x] >= &array[0] is always true if x is the index of an element, or one greater than the maximum index. (And if x is neither the index of an element nor one past the end of the actual array, then the behaviour is undefined.)
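As a minimal sketch of that guarantee (the helper name order_matches_subscripts is made up for illustration and appears neither in the standard nor in the question), the relational operators applied to two element pointers of the same array should agree with the same comparison applied to their subscripts:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Precondition: i and j are valid subscripts of the array that starts
 * at base, or one past its last element; otherwise merely forming
 * base + i is already undefined behaviour. */
bool order_matches_subscripts(const int *base, size_t i, size_t j)
{
    const int *p = base + i;
    const int *q = base + j;

    /* C11 6.5.8p5: the pointer with the larger subscript compares greater. */
    return (p < q) == (i < j)
        && (p > q) == (i > j)
        && (p <= q) == (i <= j)
        && (p >= q) == (i >= j);
}
```

For any array a of 8 ints, a call such as order_matches_subscripts(a, 0, 8) is expected to return true, since the one-past-the-end position takes part in the same ordering.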
But surprisingly, the difference &array[x] - &array[0] is defined only when x is an actual index of an element or one greater than the maximum index in the array, and x is not greater than PTRDIFF_MAX, as there is a peculiar corner case. C11 6.5.6p9 says that
9 When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. If the result is not representable in an object of that type, the behavior is undefined. In other words, if the expressions P and Q point to, respectively, the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i-j provided the value fits in an object of type ptrdiff_t. [...]
If the signed ptrdiff_t is of the same width as the unsigned size_t, it is possible to have an array for which there exists an index x greater than PTRDIFF_MAX; then &array[x] >= &array[0] still holds, but &array[x] - &array[0] has completely undefined behaviour.
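One way to stay clear of this corner case is to refuse the subtraction whenever the array could be large enough to trigger it. The following is only a sketch under assumptions: the names diff_is_representable and checked_index are hypothetical, and the caller is assumed to know the element count of the array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when every subscript difference inside an
 * array of `count` elements (one past the end included) fits in ptrdiff_t. */
bool diff_is_representable(size_t count)
{
    return (uintmax_t)count <= (uintmax_t)PTRDIFF_MAX;
}

/* Hypothetical helper: stores elem - array in *out only when that is safe.
 * Precondition: elem points into array[0..count], one past the end allowed. */
bool checked_index(const char *array, size_t count,
                   const char *elem, ptrdiff_t *out)
{
    if (!diff_is_representable(count))
        return false;      /* the subtraction could overflow ptrdiff_t */
    *out = elem - array;   /* well defined: the result lies in [0, count] */
    return true;
}
```

With an array no larger than PTRDIFF_MAX elements, the check passes and the plain subtraction is used; for a larger array, the caller has to track the index as a size_t instead.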
Here is a demonstration. My computer is an x86-64 machine running 64-bit Ubuntu Linux, but it can also run 32-bit programs. On 32-bit x86 Linux with GCC, ptrdiff_t is a 32-bit signed integer and size_t is a 32-bit unsigned integer. A 32-bit program running on 64-bit Linux can easily allocate over 2G of memory with malloc, as the entire 4G address space is reserved for user mode.
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <stddef.h>
int main(void) {
    size_t size = (size_t)PTRDIFF_MAX + 2;
    size_t x = (size_t)PTRDIFF_MAX + 1;
    char *array = malloc(size);
    if (! array) {
        perror("malloc");
        exit(1);
    }
    array[0] = 42;
    array[x] = 84;
    printf("&array[0]: %p\n", (void *)&array[0]);
    printf("&array[x]: %p\n", (void *)&array[x]);
    printf("&array[x] >= &array[0]: %d\n", &array[x] >= &array[0]);
    printf("&array[x] - &array[1]: %td\n", &array[x] - &array[1]);
    printf("&array[x] - &array[0]: %td\n", &array[x] - &array[0]);
    printf("(&array[x] - &array[0]) < 0: %d\n", (&array[x] - &array[0]) < 0);
}
Then compiled for 32-bit mode and run:
% gcc huge.c -m32 -Wall && ./a.out
&array[0]: 0x77567008
&array[x]: 0xf7567008
&array[x] >= &array[0]: 1
&array[x] - &array[1]: 2147483647
&array[x] - &array[0]: -2147483648
(&array[x] - &array[0]) < 0: 1
The memory was allocated successfully: the starting address is 0x77567008, &array[x] is at 0xf7567008, and &array[x] is greater than &array[0]. The difference &array[x] - &array[1] produced a positive result, whereas &array[x] - &array[0], with its undefined behaviour, produced a negative result!
First of all, FWIW, quoting C11, chapter §6.5.6/P9 (emphasis mine):
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. [...]
So you don't need to be bothered about the individual pointer values (positioning) themselves. It's the difference of the subscripts that matters (i.e., the signed value i - j).
That said, when it comes to comparison (use of the relational operators <, >, <=, >=), the standard says:
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. [....] If the objects pointed to are members of the same aggregate object, [...] and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. [....]
So a statement like &array[x] <= &array[0] will evaluate to 0 (false) when x > 0.
Yes, because &array[x] is defined to be equivalent to array+x.
A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).
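The identity quoted above can be checked directly. This is a small sketch with a made-up helper name (subscript_is_pointer_sum), assuming x is a valid subscript of an initialized array:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Precondition: x is a valid subscript of array and array[x] is initialized. */
bool subscript_is_pointer_sum(const int *array, size_t x)
{
    return &array[x] == array + x     /* same address */
        && array[x] == *(array + x)   /* same element */
        && array[x] == x[array];      /* + commutes, so this spelling is legal too */
}
```

Because E1[E2] is defined in terms of the + operator, any ordering guarantee for array + x carries over unchanged to &array[x].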
The C11 standard defines the address difference between elements of an array as a number that depends on the relative (logical) order of the elements. As specified in the description of the additive operators:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined, and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header. If the result is not representable in an object of that type, the behavior is undefined. In other words, if the expressions P and Q point to, respectively, the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i-j provided the value fits in an object of type ptrdiff_t. Moreover, if the expression P points either to an element of an array object or one past the last element of an array object, and the expression Q points to the last element of the same array object, the expression ((Q)+1)-(P) has the same value as ((Q)-(P))+1 and as -((P)-((Q)+1)), and has the value zero if the expression P points one past the last element of the array object, even though the expression (Q)+1 does not point to an element of the array object.
So the difference in your example is defined as x - 0.
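To illustrate the "difference of the subscripts" wording, here is a short sketch; the function name subscript_diff is hypothetical, and both pointers are assumed to point into the same, reasonably small array:

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only.
 * Precondition: p and q point into the same array (one past the end
 * allowed) and the result fits in ptrdiff_t. */
ptrdiff_t subscript_diff(const int *p, const int *q)
{
    return p - q;   /* counts elements, not bytes, and is signed */
}
```

For an int array a, subscript_diff(&a[5], &a[0]) yields 5 and subscript_diff(&a[2], &a[5]) yields -3, regardless of sizeof(int).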
From the C11 specification (ISO/IEC 9899:2011 (E)) §6.5.8/5:
When two pointers are compared, ... If the objects pointed to are members of the same aggregate object, ... and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values.
That means that &array[x] <= &array[0] will be false unless x is equal to zero.
Given that traversing an array can also be achieved by incrementing a pointer, it would appear to be fairly fundamental that the absolute addresses of subsequent indexes increase.
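char foobar[] = "ab";            /* the array must have a size; `char[] foobar;` is not valid C */
char *foobarPtr = foobar;        /* points at foobar[0] */
foobar[0] == *foobarPtr++;       /* true; foobarPtr now points at foobar[1] */
foobar[1] == *foobarPtr++;       /* true; foobarPtr now points at foobar[2] */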
https://www.tutorialspoint.com/cprogramming/c_pointer_to_an_array.htm
index = &array[x] - &array[0];
is equivalent to
index = (array+x) - (array+0)
because E1[E2] is defined as (*((E1)+(E2))), and in most expressions an array is converted to a pointer to its first element.
By the rules of pointer arithmetic, this reduces to index = x, provided x fits in ptrdiff_t.
The relevant topics to search for, online or inside ISO 9899, are pointer arithmetic and the conversion of arrays to pointers (often called "array decay").