I'm encountering a bug that has me stumped. I've narrowed it down to an issue with the `#pragma pack` directive in GCC (specifically GCC 4.4.7 on RHEL Linux) that can be reproduced with the small sample case shown below. It looks like GCC is computing the wrong offset in this case, which manifests as a crash inside the loop. Removing the `#pragma pack` also removes the fault, but in the real application that would cost many additional gigabytes of memory and is not desirable.

In the example below, you need to compile with optimizations enabled (`-O3`) to trigger the failure. The structure also contains an example member (`cMagic`); removing it changes the alignment of the members that follow and keeps the bug from triggering.
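To make the effect of `cMagic` on the layout concrete, the offsets can be compared directly. This is a minimal sketch, assuming a 4-byte `int` (as on x86_64 Linux); `PlaceNoMagic` and the helper functions are hypothetical names introduced only for this illustration and are not part of the real application.

```cpp
#include <stddef.h>   // offsetof, size_t

#define MAGICCONSTANT 17

#pragma pack(push,1)
struct Place {             // same members as in Crash.cpp
   int iFoo;
   char cMagic;
   int aiArray[MAGICCONSTANT];
};

struct PlaceNoMagic {      // hypothetical variant with cMagic removed
   int iFoo;
   int aiArray[MAGICCONSTANT];
};
#pragma pack(pop)

// Byte offset of aiArray within each packed layout.
inline size_t offsetWithMagic()    { return offsetof(Place, aiArray); }
inline size_t offsetWithoutMagic() { return offsetof(PlaceNoMagic, aiArray); }

// True when an offset is a multiple of sizeof(int).
inline bool intAligned(size_t offset) { return offset % sizeof(int) == 0; }
```

Under the stated assumption, the array starts at byte 5 when `cMagic` is present, so none of its elements is naturally aligned; without `cMagic` it starts at byte 4.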

I've taken a look at the generated assembly and believe this may be a compiler bug. Am I missing something else? Can anyone confirm this bug or provide any insights?

Crash.cpp:

/*  Platform Version Info:
 *     gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-16)
 *     uname: 2.6.32-504.16.2.el6.x86_64 #1 SMP Tue Mar 10 17:01:00 EDT 2015 x86_64 x86_64 x86_64 GNU/Linux
 *
 *  Compiling:
 *     Must use -O3 for compiling and linking
 *     CXX= g++ -g -O3 -fPIC -rdynamic -Wall -Wno-deprecated -DDEBUG
 *     CPP= g++ -g -O3 -fPIC -rdynamic -Wall -Wno-deprecated -DDEBUG
 *
 *  Notes:
 *     This appears to be an optimization and alignment issue.
 *     Getting rid of a byte in Place (cMagic) causes the program to complete successfully.
 *
 */


#include <stdlib.h>
#include <iostream>

using namespace std;

#pragma pack(push,1)  // Structures must be packed tightly
#define MAGICCONSTANT 17

struct Place {
   int iFoo;
   char cMagic;         // GCC doesn't like cMagic.  Remove it and everything is OK
   int aiArray[MAGICCONSTANT];
};


#pragma pack(pop)

int main(int argc, const char *argv[])
{
   Place *pPlace = new Place;   // Place must be on the heap... so new, calloc, malloc, etc

   for (int c = 0; (c < MAGICCONSTANT); c++) {
      pPlace->aiArray[c] = 0;
   }

   delete pPlace;

   cout << "Complete!" << endl;
   return 0;
}

Makefile:

CXX= g++ -g -O3 -fPIC -rdynamic -Wall -Wno-deprecated -DDEBUG
CPP= g++ -g -O3 -fPIC -rdynamic -Wall -Wno-deprecated -DDEBUG

OBJS=   Crash.o
SRCS=   Crash.cpp
TARG=   crash

debug:: ${TARG}

all:: ${TARG}

${TARG}: ${OBJS}
        ${CPP} -o ${TARG} ${OBJS} ${LDFLAGS} ${LIBS}

clean::
        rm -f ${TARG} ${OBJS} ${TARG}.core core

Disassembly Graph (Generated ASM Code):

[Image: disassembly graph, not reproduced here]

phuclv
Scott 'scm6079'
  • That's an old compiler... have you tried with a newer version to see if the bug persists? – JorenHeit Jul 27 '15 at 22:42
  • Current (July 2015) version of GCC is [5.2](https://gcc.gnu.org/gcc-5/); [GCC 4.4](https://gcc.gnu.org/gcc-4.4/) originated in 2009. Try compiling your code with GCC 5.2 (which you can compile from its [released](https://gcc.gnu.org/releases.html) source code). – Basile Starynkevitch Jul 27 '15 at 22:44
  • For the record, the problem is the aligned SSE `movdqa`; the offset is probably fine. I suppose the optimizer thinks that the array is aligned but the packing misaligns it. – Jester Jul 27 '15 at 22:45
  • Indeed, the autovectorizer appears to be vectorizing the loop, but doing it poorly. Turning it off would probably fix the crash. – Sebastian Redl Jul 27 '15 at 22:49
  • GCC 5.2 optimizes the loop out since it's not used. With a `volatile` added, it uses an unrolled sequence of `MOV` instructions, which are again not susceptible. Increasing `MAGICCONSTANT` to 129 produces a loop, but still with `MOV` instructions, so the problem doesn't occur. – Jester Jul 27 '15 at 22:50
  • @Jester Thanks, that's very valuable feedback. The code is for an enterprise client locked to RHEL versions, so I'm not sure I'll be able to upgrade GCC majors. I'll start researching further. – Scott 'scm6079' Jul 27 '15 at 23:07
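Jester's point about the aligned `movdqa` can be illustrated with plain address arithmetic. The sketch below rests on two assumptions that the thread does not confirm: that the heap block itself starts on a 16-byte boundary, and that the vectorizer tries to reach a 16-byte boundary by peeling whole `int` iterations. `firstAlignedIndex` is a hypothetical helper, and addresses are passed as integers so that no misaligned pointer is ever dereferenced.

```cpp
#include <stddef.h>   // size_t
#include <stdint.h>   // uintptr_t

// Index of the first element whose address is a multiple of `alignment`,
// or -1 if none of the `count` elements qualifies.
inline int firstAlignedIndex(uintptr_t baseAddr, size_t elemSize,
                             int count, size_t alignment)
{
   for (int i = 0; i < count; ++i) {
      if ((baseAddr + i * elemSize) % alignment == 0) {
         return i;
      }
   }
   return -1;
}
```

For an array that starts 5 bytes into a 16-byte-aligned block, the element addresses are 5, 9, 13, 1, ... modulo 16, so the search returns -1: no element ever lands on a 16-byte boundary. With a 4-byte offset the fourth element (index 3) does, which would be consistent with the crash disappearing once `cMagic` is removed.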

1 Answer

Look at using `__attribute__((packed))` instead of `#pragma pack(1)`. IIRC, this version of GCC treats the two a bit differently.
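A minimal sketch of the two spellings side by side, assuming GCC (or a compatible compiler) and a 4-byte `int`. `PlaceAttr`, `PlacePragma` and `sameLayout` are hypothetical names used only here. It checks that both forms produce the same size and member offsets; it says nothing about why the generated code differs on GCC 4.4.7.

```cpp
#include <stddef.h>   // offsetof

#define MAGICCONSTANT 17

// Packed via the GCC type attribute.
struct PlaceAttr {
   int iFoo;
   char cMagic;
   int aiArray[MAGICCONSTANT];
} __attribute__((packed));

// Packed via the pragma, as in the question.
#pragma pack(push,1)
struct PlacePragma {
   int iFoo;
   char cMagic;
   int aiArray[MAGICCONSTANT];
};
#pragma pack(pop)

// True when both spellings yield the same size and member offsets.
inline bool sameLayout()
{
   return sizeof(PlaceAttr) == sizeof(PlacePragma)
       && offsetof(PlaceAttr, cMagic)  == offsetof(PlacePragma, cMagic)
       && offsetof(PlaceAttr, aiArray) == offsetof(PlacePragma, aiArray);
}
```

If `sameLayout()` returns true on the target compiler, switching to the attribute should not change the memory footprint of the structure.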

phuclv
  • That seems to have resolved the issue in this instance, although I'm not sure whether it has the same effect as the `#pragma`. From the docs: "This attribute, attached to an enum, struct, or union type definition, specified that the minimum required memory be used to represent the type." – Scott 'scm6079' Jul 28 '15 at 00:43
  • Good point. I'm curious enough to try and run it through a disassembler later and see what changed. – Krista Wolffe Jul 28 '15 at 00:48