PIC32 dereferencing pointers possible compiler bug

Question

I am working on some code for the PIC32MX795F512L using the XC32 compiler. I need to read data out of a buffer passed to a function as a void*. I need to read the data as an array of unsigned 32 bit integers. the problem is when I cast the void* to a uint32* and try to read a value at index 0 of that array, the general error handler gets called by the processor, while if I cast the void* to a uint8* and do some bit manipulation to get the same data, it works properly.

the code looks like this:

void foo(void* data, uint32 length)
{
    uint32 i,j;
    uint32 block;
    for(i = 0, j = 0; i < length; i+= sizeof(uint32),j++)
    {
        printfUART(DEBUG_UART,"Debug 0\r\n");
#if 0//working code
        block = ((uint8*)data)[i + 3];
        block <<= 8;
        block |= ((uint8*)data)[i + 2];
        block <<= 8;
        block |= ((uint8*)data)[i + 1];
        block <<= 8;
        block |= ((uint8*)data)[i + 0];
#else//not working code
        block = ((uint32*)data)[j];
#endif
        printfUART(DEBUG_UART,"Debug 1\r\n");
    }
}

if you change the #if 0 to #if 1 the code works as expected and I see "Debug 0" and "Debug 1" printed many times in the loop.

but if you leave it as is, "Debug 0" prints only once, and then the code jumps out of the loop into the cpu general error handler set by the compiler. Is this a bug in the XC32 ompiler, or is there something im missing?

no problem. I also think it's an alignment problem. Try printing the raw address of data. I bet it's not an even address. — Brandon Yates, Jul 02 '14 at 14:48
as RuslanGerasimov notes (I think), if `length` is not a multiple of 4 then you read beyond the input buffer. You should write code to make sure this doesn't happen, e.g. round `length` down to the nearest multiple of 4 before starting the loop. — M.M, Jul 04 '14 at 00:23
@Matt McNabb that is true in general, but for this particular function the size being passed in is a constant that is a multiple of 4, and it is just debug code anyways so I'm not worried about it. — , Jul 07 '14 at 16:02

undur_gongor · Accepted Answer · 2014-07-02T14:51:50.503

4

I think this is an alignment problem.

If data is not at an even address (e.g. because it is in fact an array of bytes), accessing it 32bit-wise could be not possible on your platform.

The C standard (ISO/IEC 9899:1999, 6.3.2.3) says:

A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined.

edited Jul 02 '14 at 14:51

answered Jul 02 '14 at 14:46

undur_gongor

15,657
5
63
75

Doesnt that break the c standard though? I mean isnt what i wrote valid c? – Jul 02 '14 at 14:49
No, it does not. You cannot generally convert pointers freely. The only guarantee is that you can access all objects by pointer-to-char ("byte-wise"). – undur_gongor Jul 02 '14 at 14:50
Nah. Trying to dereference a null pointer is also valid C, though your OS likely won't like that either. – Brandon Yates Jul 02 '14 at 14:50
1

@S E check your compiler manual their are probably preprocessor macros you can use to force alignment of certain variables. – Brandon Yates Jul 02 '14 at 14:51
ok, that seems like a bad design to me but i guess i accept that that is not necessarily permitted by c. I cant accept for another few minutes though – Jul 02 '14 at 14:53
1

@BrandonYates There's not usually any operating system on a PIC32. I think the C runtime will fail on a void pointer dereference. – Austin Mullins Jul 02 '14 at 14:54
@AustinMullins fair enough, it would cause a CPU fault then. Same thing. THh point is it would compile as valid C. – Brandon Yates Jul 02 '14 at 14:56
correct there is no OS on my pic32, but dereferencing NULL still fails, because NULL is address 0 and 0 is a physical address not in one of the virtual address spaces where code runs – Jul 02 '14 at 14:57
1

@BrandonYates dereferencing NULL is not valid C either. Just because something doesn't give a compilation error doesn't mean that it is valid. – M.M Jul 04 '14 at 00:17
@undur_gongor Thanks a lot! – Tsikon Oct 30 '17 at 14:17

score 1 · Answer 2 · edited Jul 04 '14 at 08:55

1

Instead of block = ((uint32*)data)[j]; which may have alignment problems, you can achieve the intended effect with:

memcpy( &block, data + j, sizeof block );

Note that this is conceptually different to the version building it up from individual bytes. It depends on how your CPU represents integers (commonly called "endianness").

edited Jul 04 '14 at 08:55

undur_gongor

15,657
5
63
75

answered Jul 04 '14 at 00:21

M.M

138,810
21
208
365

Ruslan Gerasimov · Answer 3 · 2014-07-04T00:43:37.803

Here is your code below with debug output added. In main cycle it is running a bit out of array (on purpose) over the memory that it occupies and it works fine so far (it might be still platform incompatible though). I am pretty sure your foo() is OK and compiler is too, but you probably wrongly call it. Check all addresses and values with debug output and check ranges.

void foo(void* data, unsigned int length)
{
    unsigned int  i,j;
    unsigned int  block;
    for(i = 0, j = 0; i < length; i+= sizeof(unsigned int ),j++)
    {
        printf("Debug 0\r\n");
        printf("i: 0x%d\r\n", i);
        printf("j: 0x%d\r\n", j );
#if 0//working code
        block = ((unsigned char*)data)[i + 3];
        block <<= 8;
        block |= ((unsigned char*)data)[i + 2];
        block <<= 8;
        block |= ((unsigned char*)data)[i + 1];
        block <<= 8;
        block |= ((unsigned char*)data)[i + 0];
#else//not working code
        block = ((unsigned int *)data)[j];
#endif
        printf("data: 0x%x\r\n", data );
        printf( " ((unsigned int )data +j) = 0x%x\r\n", ((unsigned int )data +j) );
        printf( " ((unsigned int *)data)[j] = 0x%x\r\n", ((unsigned int *)data)[j]);
        printf("block: 0x%x\r\n", block );
        printf("Debug 1\r\n");
        printf("\r\n");
        printf("\r\n");
    }
}

int main(void)
{
 unsigned int array[10] = {0xFE,0xED,0xBA,0xBE,0xDE,0xAD,0xBE,0xEF,0xCA,0xFE};
 printf ("array: 0x%x\r\n",array);
 printf ("array[1]:%x\r\n",array[1]);
 foo((void*) array, 50);

}

its output:

array: 0x1d0fdde0 array[1]:ed Debug 0 i: 0x0 j: 0x0 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde0 ((unsigned int *)data)[j] = 0xfe block: 0xfe Debug 1

Debug 0 i: 0x4 j: 0x1 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde1 ((unsigned int *)data)[j] = 0xed block: 0xed Debug 1

Debug 0 i: 0x8 j: 0x2 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde2 ((unsigned int *)data)[j] = 0xba block: 0xba Debug 1

Debug 0 i: 0x12 j: 0x3 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde3 ((unsigned int *)data)[j] = 0xbe block: 0xbe Debug 1

Debug 0 i: 0x16 j: 0x4 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde4 ((unsigned int *)data)[j] = 0xde block: 0xde Debug 1

Debug 0 i: 0x20 j: 0x5 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde5 ((unsigned int *)data)[j] = 0xad block: 0xad Debug 1

Debug 0 i: 0x24 j: 0x6 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde6 ((unsigned int *)data)[j] = 0xbe block: 0xbe Debug 1

Debug 0 i: 0x28 j: 0x7 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde7 ((unsigned int *)data)[j] = 0xef block: 0xef Debug 1

Debug 0 i: 0x32 j: 0x8 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde8 ((unsigned int *)data)[j] = 0xca block: 0xca Debug 1

Debug 0 i: 0x36 j: 0x9 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fdde9 ((unsigned int *)data)[j] = 0xfe block: 0xfe Debug 1

Debug 0 i: 0x40 j: 0x10 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fddea ((unsigned int *)data)[j] = 0x0 block: 0x0 Debug 1

Debug 0 i: 0x44 j: 0x11 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fddeb ((unsigned int *)data)[j] = 0x0 block: 0x0 Debug 1

Debug 0 i: 0x48 j: 0x12 data: 0x1d0fdde0 ((unsigned int )data +j) = 0x1d0fddec ((unsigned int *)data)[j] = 0x0 block: 0x0 Debug 1

If you update with more details and/or new result anyone would get more understanding an make an answer better and you would get more solid answer.

When someone downvoted an answer it is good idea to leave a comment to give an idea what was appeared incorrect. — Ruslan Gerasimov, Jul 03 '14 at 21:31
possible reasons for downvotes: you introduce a buffer overflow where previously there wasn't one (???), you have a pointless casting of an `unsigned int` to `unsigned int`, and you don't address the actual cause of the problem (alignment issue); and you say you are "pretty sure" it is OK just because your version seems to work, but that conclusion is not justified. — M.M, Jul 04 '14 at 00:26
are you running this code on a pic32 microcontroller? also the data you are passing in was aligned as a 32 bit integer array, so this example doesnt show whether or not invalid alignment causes the issue. — , Jul 07 '14 at 15:48

score 0 · Answer 4 · answered Jun 16 '17 at 14:50

undur_gongor is correct, it is an alignment issue. Consider the following struct:

struct demo {
   uint8_t char_var;
   uint16_t short_var;
   uint32_t int_var;
};

If the packed structure were placed at an aligned address 0 then you have a 16 bit variable starting at an odd address 1 and a 32 bit variable at address 3. The PIC32 can only read a 16 bit variable from even addresses (0, 2, 4...) and a 32 bit variable from similarly aligned addresses (0, 4, 8).

When accessing a packed struct or similar construct, the compiler takes extra steps. The solution is generally to write single bytes which are not affected by alignment. As M.M mentioned, the easiest solution is to use memcpy.

PIC32 dereferencing pointers possible compiler bug

4 Answers4