
I am a mid-level programmer, working to learn Standard C. I’m currently struggling through a class exercise which involves using pointers to store different kinds of data types into an array of type char.

Suppose I have a large char array:

static char arr[1000];

As my professor explained it, I can consider this a chunk of local memory, where each element in the array has a granularity of one byte. That seems useful. Now suppose I want to take the first four bytes/elements and store an int:

int a = 100;
int* ptr = (int*)arr;
*ptr = a;

As I understand it, the second line creates an int* pointer and points it at the beginning of array arr. The third line writes the value of a into that location. Because ptr is a pointer to int and because arr has plenty of space, this writes four bytes (four elements' worth of data), since sizeof(int) == 4. Watching this carefully through my debugger seems to confirm this.

So far, so good. Now let's say I wanted to expand this concept. Let’s say I wanted to store the following into my array, in this order:

int a = 100;
int b = 200;
char* str = "My dog has fleas";
int c = 300;

Which would logically look like this:

00 01 02 03 04 05 06 07 08 09 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
--------------------------------------------------------------------------------------
[  100    ] [   200   ]  M  y     d  o  g     h  a  s     f  l  e  a  s \0 [   300   ]

I need to be able to store data into the array in this manner, and then later, knowing the array structure in advance, be able to read the array. Below is my code & output, sorry in advance for the long length. It compiles but does not work. Scrutinizing it with my debugger has been very confusing; I can't tell where (and how often) I'm going off-track. If anyone has any insight or advice, I will be very grateful.

int main(){

   static char arr[1000];

   int a = 100;
   int b = 200;
   char* str = "My dog has fleas";
   int c = 300;

   // Create pointers to load data:
   int* ptrA = arr;                     // points to start of array
   int* ptrB = ptrA + sizeof(int);      // points 4 bytes into array
   char* ptrStr = ptrB + sizeof(int);   // points 8 bytes into array
   int* ptrC = ptrStr + sizeof("My dog has fleas"); // points to after the string
                                       // (I don't know how to use sizeof() to measure the actual length of the string

   // Load data into my array
   *ptrA = a;       // Assign int 100 into the array?
   *ptrB = b;       // Assign int 200 into the array?
   *ptrStr = memcpy(ptrStr, str, sizeof("My dog has fleas"));       // Write "My dog has fleas" into the array?
   *ptrC = c;       // Assign int 300 into the array?

   // Knowing the array's structure, walk it and print results:
   char* walkIt = arr;
   int counter = 0;
   while (counter < 30) {
       if (counter == 0) {
           // we are pointing at what should be the first int
           int* tmpPtr1 = (int*)walkIt;
           printf("%d ", *tmpPtr1);
       }
       else if (counter == 4) {
           // we are pointing at what should be the second int
           int* tmpPtr2 = (int*)walkIt;
           printf("%d ", *tmpPtr2);
       }
       else if (counter == 8) {
           // we are pointing at what should be the string
           printf("%s ", walkIt);
       }
       else if (counter == 25) {
           // we are pointing at what should be the third int
           int* tmpPtr3 = (int*)walkIt;
           printf("%d ", *tmpPtr3);
       }
       walkIt++;        // Continue walking the array
       counter++;       // Don't walk too far
   }
   return 0;
}

Output is this:

100  0  0    
Pete
  • `memcpy(ptrStr, str, sizeof("My dog has fleas"));`. `sizeof` on a string does not give you the result you want. That will just return you the size of a pointer. Use `strlen` instead. Or use `strcpy` to do the copy. – kaylum Feb 08 '17 at 19:26
  • I'm not entirely sure of what the point of this exercise is, but your professor seems to be doing you a disservice. It is not valid in C to access all or part of an array of `char` as if it were an `int`, or any other non-`char` type, except under rather special circumstances that do not apply here. – John Bollinger Feb 08 '17 at 19:26
  • Your professor is wrong though, since doing what you are doing violates the [strict aliasing rule](http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule) and it may fail under optimizations. It's better to use a `union` for this purpose. – Jack Feb 08 '17 at 19:26
  • Doesn't this violate strict aliasing unless you use only char* pointers? – synchronizer Feb 08 '17 at 19:27
  • @synchronizer, yes, yes it does. – John Bollinger Feb 08 '17 at 19:27
  • @JohnBollinger Yet is it not possible to take a char* to the int and copy over each byte to the array? – synchronizer Feb 08 '17 at 19:28
  • You *can* `memcpy()` into and out of the space occupied by the array. That's fine. But you cannot cast a pointer to or into the array to a pointer to non-character datatype, and access the pointed-to space via that pointer. (Where by "cannot" I mean that doing so produces undefined behavior.) – John Bollinger Feb 08 '17 at 19:29
  • @synchronizer, yes, if you have an `int` or any other object then you can access its bytes via a [`unsigned`] `char *`. But not the other way around. – John Bollinger Feb 08 '17 at 19:31
  • @Pete Did you actually check what size your ints are? You use size of int as the offset for the pointers, but hard coded values in your loop. – odin Feb 08 '17 at 19:32
  • Then I would use a `char*` if it is absolutely required to use pointer casting for the assignment, though I suspect that endianness might apply here... or maybe I am wrong. – synchronizer Feb 08 '17 at 19:32
  • @JohnBollinger Thanks John, the point of the exercise is to simulate malloc() and free(). The array represents memory on the heap, and this business of storing ints or strings (or whatevers) in the array is to simulate allocating memory for different kinds of data. It's an academic exercise, for better or worse. – Pete Feb 09 '17 at 02:29
  • @odin You're right, I shouldn't hardcode the sizes, but use sizeof() every time. Good catch. – Pete Feb 09 '17 at 02:31

1 Answer

First of all, your professor is wrong. While it's true that under the hood things may work that way, dereferencing a pointer obtained by casting a pointer to a different type violates the strict aliasing rule: the compiler is allowed to assume that two pointers of different types do not refer to the same memory, which enables optimizations on accesses through such pointers.

Going back to your code, the problem is how you are calculating the offset from the base address, e.g.:

int* ptrB = ptrA + sizeof(int); 

Now, ptrA is of type int*, and adding an integer offset to a pointer implicitly multiplies the offset by the size of the pointed-to element. This means that you aren't adding sizeof(int) bytes but sizeof(int)*sizeof(int) bytes.

To add a specific number of bytes, you must cast the pointer to char* so that adding sizeof(int) advances just sizeof(int)*sizeof(char) == sizeof(int)*1 bytes, and then cast the result back to int* for the assignment:

int* ptrB = (int*)((char*)ptrA + sizeof(int));      // points sizeof(int) bytes into array

Mind that this code is unsafe and may fail under optimization; using a union would be a better solution.

Jack
  • Casting does not violate the strict aliasing rule. Using the resulting pointer to access the space does. – M.M Feb 08 '17 at 19:42
  • @Jack - Ohhhhhhhhhhhhh, that does make a world of sense. Thanks, I'll implement. I suspect this is what my professor wanted me to learn all along. – Pete Feb 09 '17 at 21:15