14

So far I thought it was, but after learning that the compiler may pad data to satisfy the architecture's alignment requirements, for example, I'm in doubt. So I wonder whether a char[4][3] has the same memory layout as char[12]. Can the compiler put padding after each char[3] part to align it, so that the whole array actually takes 16 bytes?

The background is that a library function takes a bunch of fixed-length strings in a char* parameter, so it expects a contiguous buffer without padding, and the string length can be odd. So I thought I would declare a char[N_STRINGS][STRING_LENGTH] array, conveniently populate it, and pass it to the function by casting it to char*. So far it seems to work, but I'm not sure whether this solution is portable.

Calmarius
  • 4
    An array is contiguously allocated. There can't be any padding between array elements IMHO. – ameyCU Dec 13 '15 at 12:15
  • 4
    C arrays are required to be contiguous, with no padding between array elements. So neither `char [4][3]` nor `char [12]` may contain padding, and `sizeof` will be `12 * sizeof(char)` for both. – Tom Karzes Dec 13 '15 at 12:15
  • 3
    Also see [Can C arrays contain padding in between elements?](http://stackoverflow.com/questions/1066681/can-c-arrays-contain-padding-in-between-elements) – Tom Karzes Dec 13 '15 at 12:20
  • And why do you want to cast a `char [][]` to `char *` ? – ameyCU Dec 13 '15 at 12:21
  • @TomKarzes: i.e. `sizeof` will be `12`. – dreamlax Dec 13 '15 at 13:19

4 Answers

11

An array of M elements of type A has all its elements in contiguous positions in memory, with no padding bytes at all. This does not depend on the nature of A.

Now, if A is the type "array of N elements having type T", then each element of the outer array is itself a block of N objects of type T in contiguous positions in memory. All these blocks of N objects of type T are, in turn, stored in contiguous positions.

So the result is M*N elements of type T stored in contiguous positions in memory.

The element [i][j] of the array is stored in the position i*N+j.

pablo1977
8

Let's consider

T array[size]; 
array[0]; // 1

The expression marked 1 is formally defined as:

The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))

per §6.5.2.1, clause 2 taken from the standard C draft N1570. When applied to multi-dimensional arrays, «array whose elements are arrays», we have:

If E is an n-dimensional array (n ≥ 2) with dimensions i × j × ... × k, then E (used as other than an lvalue) is converted to a pointer to an (n − 1)-dimensional array with dimensions j × . . . × k.

Therefore, given the declaration T array[i][j], the expression array (when not used as an lvalue) is first converted to a pointer to a one-dimensional array of j elements, that is, T (*ptr)[j] = &array[0].

If the unary * operator is applied to this pointer explicitly, or implicitly as a result of subscripting, the result is the referenced (n − 1)-dimensional array, which itself is converted into a pointer if used as other than an lvalue.

and this rule applies recursively. We may conclude that, in order to do so, the n-dimensional array must be allocated contiguously.

It follows from this that arrays are stored in row-major order (last subscript varies fastest).

in terms of logical layout.

Since char [12] has to be stored contiguously, and so does char [3][4], and since they have the same alignment, their layouts should be compatible, even though they are technically different types.

edmz
  • 1
    The central point of my doubt is: Are implementations free to pad *the end* of the array to round it up to alignment? If the answer is yes then I can imagine that `sizeof(char[3]) == 4` on a word aligned CPU, which makes it easier to address elements of a `char[4][3]` array. This means that the sub-arrays contain an implied padding while if you use `char[12]` the implementation is required to pack the data without padding. – Calmarius Dec 13 '15 at 14:51
  • If we're talking about alignment, `_Alignof(char [N])` shall be `_Alignof(char)`, which is 1. Since `sizeof(char[3]) == 3`, you'll need to rethink your questions as the starting point is incorrect. – edmz Dec 13 '15 at 15:39
  • @black: Given `struct x { char arr[3][4]; }; struct y { char arr[4][3]; };` would the Standard guarantee that `_Alignof(struct x)` and `_Alignof(struct y)` would be equal? – supercat Nov 30 '16 at 22:53
3

What you're referring to as types are not types. The type T you mention in the title would be (in this case) a pointer to a char.

You're correct that when it comes to structs, alignment is a factor that can lead to padding being added, which may mean that your struct takes up more bytes than meets the eye.

Having said that, when you allocate an array, the array will be contiguous in memory. Remember that when you index into an array, array[3] is equivalent to *(array + 3).

For example, the following program should print out 12:

#include <stdio.h>

int main() {
    char array[4][3];
    printf("%zu", sizeof(array));
    return 0;
}
fvgs
  • 1
    Arrays are *always* contiguous in memory, that's the whole idea of it. – Jens Gustedt Dec 13 '15 at 13:08
  • 3
    The `T` in the title is a type, not a variable. The rest of the question uses `char` for `T`. – interjay Dec 13 '15 at 13:16
  • @jens Good point, my use of "statically" was unnecessary and more likely to mislead the reader into considering a nonexistent dichotomy. – fvgs Dec 13 '15 at 13:17
  • Is there any guarantee that `sizeof(T[n]) == n*sizeof(T)` and can never be greater than that? – Calmarius Dec 13 '15 at 14:15
  • @Calmarius that depends very much on how you're defining T and n. – fvgs Dec 13 '15 at 14:41
  • Concrete example: Can `sizeof(char[3]) == 4`? – Calmarius Dec 13 '15 at 14:43
  • 2
    No it can't. The C standard defines `sizeof(char)` to be equal to one and you have an array of three chars. Therefore, the left hand side of that expression is equal to three. Compiling and running that expression will give you the same result. – fvgs Dec 13 '15 at 14:50
  • @JensGustedt: Given `int foo[3][3];`, it's clearly required that `foo[0][0]` to `foo[0][2]` be consecutive, and that `foo[1][0]` to `foo[1][2]` be consecutive, and likewise `foo[2][0]` to `foo[2][2]`. Most code wouldn't care whether `foo[0][2]` and `foo[1][0]` were consecutive, and in some cases significant optimizations could be achieved if they didn't have to be, but unfortunately C has no way to distinguish between packed and non-packed arrays in any case where code could legitimately tell the difference. Do note, however, that unless code either converts the address of `foo`... – supercat Dec 13 '15 at 20:48
  • ...to int* or char* (as opposed to the address of `foo[0]`), or uses sizeof or other means to determine the array's stride, a compiler would be allowed to use whatever stride it likes, since `foo[0]+3` would be a just-past pointer for `foo[0][2]`, and would not be a legitimate pointer to `foo[1][0]`. – supercat Dec 13 '15 at 20:51
-4

Strictly speaking a 2-D array is an array of pointers to 1-D arrays. In general you cannot assume more than that.

I would take the view that if you want a contiguous block of any type then declare a contiguous 1D block, rather than hoping for any particular layout from the compiler or runtime.

Now a compiler probably will allocate a contiguous block for a 2-D array when it knows in advance the dimensions ( i.e. they're constant at compile time ), but it's not the strict interpretation.

Remember int main( int argc, char **argv ) ;

That char **argv is an array of pointers to char pointers.

In more general programming you can e.g. malloc() each row in a 2D array separately, and swapping rows is as simple as swapping the values of those pointers. For example :

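#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main( void )
{
    char **array = malloc( 2 * sizeof( char * ) ) ;
    if ( array == NULL ) return 1 ;

    /* each row is a separate allocation, so the rows are not contiguous with each other */
    array[0] = malloc( 24 ) ;
    array[1] = malloc( 11 ) ;
    if ( array[0] == NULL || array[1] == NULL ) return 1 ;

    strcpy( array[0], "first" ) ;
    strcpy( array[1], "second" ) ;

    printf( "%s\n%s\n", array[0], array[1] ) ;

    /* swap the rows */
    char *t = array[0] ;
    array[0] = array[1] ;
    array[1] = t ;

    printf( "%s\n%s\n", array[0], array[1] ) ;

    free( array[0] ) ;
    free( array[1] ) ;
    free( array ) ;
    return 0 ;
}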
  • Given the nature of the question, it is important to note here that the char array given by `array[0]` is not contiguous with the char array given by `array[1]`. Moreover, it is not advisable to dynamically allocate arrays if they can be statically allocated. – fvgs Dec 13 '15 at 13:03
  • 8
    "_Strictly speaking a 2-D array is an array of pointers to 1-D arrays._" No, it is not! Arrays _**are not**_ pointers. – edmz Dec 13 '15 at 13:04
  • 3
    Don't promote pseudo 2D arrays, that is pointers to pointers. This is very wrong and should really be banished from all books and introductory courses. – Jens Gustedt Dec 13 '15 at 13:10
  • @black As my example clearly shows ( I'd suggest you run it ), arrays act just like pointers. In fact a pointer can be used and addressed as an array, and vice versa. Very odd claim to say otherwise. – StephenG - Help Ukraine Dec 13 '15 at 13:26
  • 1
    @JensGustedt you might want to cite some source why they are wrong, it may be non-obvious to the poster, and future readers. – Jonathan Lisic Dec 13 '15 at 13:28
  • @jens-gustedt C coders have been using arrays of pointers to pointers for as long as I recall, and it is not only a legitimate technique but an invaluable one. Why someone would suggest banishing it from the books is beyond me. As I indicated, the arguments to main() include pointers to pointers. How could that be wrong ? – StephenG - Help Ukraine Dec 13 '15 at 13:30
  • 3
    @StephenG, I don't say that pointers to pointers are bad, using them as a substitute for 2D arrays is. Since C99 2D dynamical arrays can simply be allocated by `double (*A)[n] = malloc(sizeof(double[n][n]));` Everything else is artificial, difficult to understand and error prone. – Jens Gustedt Dec 13 '15 at 13:40
  • 1
    Stephen, `T *foo[n]` is an array of pointers, whereas `T foo[n][m]` really is not. Otherwise, the array would need more space than `m*n*sizeof(T)`, because the pointers would have to be saved somewhere, too. – Ctx Dec 13 '15 at 14:05