I'm seeing weird behavior when data is passed to a certain function in some generated code. The problem occurs whenever optimizations are enabled (-O1 and higher), but not at -O0.
The C code is generated by OpenModelica 1.13.0-dev and compiled on 32-bit CentOS 6.9 using gcc 4.4.7. I know my setup is a bit old, but I have no way to change it.
I was able to step into the code with gdb and get a backtrace of the faulty function when built with -O0:
__OMC_DIV_SIM (threadData=0x83bb7e0, a=0.90000000000000002, b=1, msg=0xac4f6024 "PMECH1 - D * SLIP / 1.0 + SLIP", equationIndexes=0xbfffd4a0,
noThrowDivZero=1 '\001', time_=0, initial_=1 '\001')
And here's a backtrace of the same function when built with -O2:
__OMC_DIV_SIM (threadData=0x83bb7a8, a=-9.2559642734470712e+61, b=5.298772688916812e-315, msg=0x1 <Address 0x1 out of bounds>,
equationIndexes=0x3ff00000, noThrowDivZero=-60 '\304', time_=3.7134892271125328e-314, initial_=0 '\000')
Everything I found when searching for "Address 0x1 out of bounds" points to stack or memory corruption, so I then ran my executable through valgrind:
==5351== Invalid read of size 1
==5351== [...]
==5351== by 0x7EA13F4: __OMC_DIV_SIM (division.h:66)
==5351== [...]
==5351== Address 0x1 is not stack'd, malloc'd or (recently) free'd
==5351==
==5351==
==5351== Process terminating with default action of signal 11 (SIGSEGV)
==5351== Access not within mapped region at address 0x1
==5351== [...]
==5351== by 0x7EA13F4: __OMC_DIV_SIM (division.h:66)
==5351== [...]
==5351== If you believe this happened as a result of a stack
==5351== overflow in your program's main thread (unlikely but
==5351== possible), you can try to increase the size of the
==5351== main thread stack using the --main-stacksize= flag.
==5351== The main thread stack size used in this run was 8388608.
The generated code is not meant to be read by humans, so it is not very helpful as is. I modified it a bit to make it more readable and to pinpoint the exact problem:
double value = 0.0;
if ((long)data->localData[0]->integerVars[0] /* TRIPI */ == ((long) 0))
{
    double PMECH1 = data->localData[0]->realVars[24];
    double D = data->simulationInfo->realParameter[21];
    double SLIP = data->localData[0]->realVars[4];
    double TELEC = data->localData[0]->realVars[35];
    double H = data->simulationInfo->realParameter[24];
    double div_1 = 0.0;
    {
        double a = PMECH1 - ((D) * (SLIP));
        double b = 1.0 + SLIP;
        div_1 = __OMC_DIV_SIM(threadData, a, b, "PMECH1 - D * SLIP / 1.0 + SLIP", equationIndexes, data->simulationInfo->noThrowDivZero, data->localData[0]->timeValue, initial());
    }
    // ...
}
And here is the file that contains __OMC_DIV_SIM:
#define DIVISION_SIM(a,b,msg,equation) (__OMC_DIV_SIM(threadData, a, b, msg, equationIndexes, data->simulationInfo->noThrowDivZero, data->localData[0]->timeValue, initial()))

int valid_number(double a)
{
    return !isnan(a) && !isinf(a);
}

static inline modelica_real __OMC_DIV_SIM(threadData_t *threadData, const modelica_real a, const modelica_real b, const char *msg, const int *equationIndexes, modelica_boolean noThrowDivZero, const modelica_real time_, const modelica_boolean initial_)
{
    modelica_real res;
    if(b != 0.0)
        res = a/b;
    else if(initial_ && a == 0.0)
        res = 0.0;
    else
        res = a / division_error_equation_time(threadData, a, b, msg, equationIndexes, time_, noThrowDivZero);
    if(!valid_number(res))
        throwStreamPrintWithEquationIndexes(threadData, equationIndexes, "division leads to inf or nan at time %g, (a=%g) / (b=%g), where divisor b is: %s", time_, a, b, msg);
    return res;
}
I cannot test this code with a different gcc version, but I was able to test another component in the same environment, and it passed without any trouble.
I am still clueless about what is corrupting the data passed to __OMC_DIV_SIM. Could it be a bug in the compiler? Or in the generated code?
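One way to start separating those two possibilities might be a standalone harness like the sketch below, built once with -O0 and once with -O2. It is only a sketch under assumptions: the typedefs, the threadData_t struct, the two counters, and both error helpers are stand-ins made up for illustration (the real ones come from the OpenModelica runtime), and div_sim_copy is a renamed copy of __OMC_DIV_SIM with the same parameter list and body.

```c
#include <math.h>
#include <stdarg.h>
#include <stdio.h>

/* Stand-in types: assumptions for this sketch, not the real runtime headers. */
typedef double modelica_real;
typedef signed char modelica_boolean;
typedef struct { int unused; } threadData_t;

/* Counters so the stubs below leave a trace when they are reached. */
int error_calls = 0;
int throw_calls = 0;

/* Hypothetical stub: the real helper reports the division by zero.
   Returning 1.0 is an arbitrary choice for this sketch. */
static modelica_real division_error_equation_time(threadData_t *threadData,
    modelica_real a, modelica_real b, const char *msg,
    const int *equationIndexes, modelica_real time_,
    modelica_boolean noThrowDivZero)
{
    (void)threadData; (void)a; (void)b; (void)msg;
    (void)equationIndexes; (void)time_; (void)noThrowDivZero;
    error_calls++;
    return 1.0;
}

/* Hypothetical stub: prints the message instead of throwing. */
static void throwStreamPrintWithEquationIndexes(threadData_t *threadData,
    const int *equationIndexes, const char *fmt, ...)
{
    va_list ap;
    (void)threadData; (void)equationIndexes;
    throw_calls++;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
    fputc('\n', stderr);
}

static int valid_number(double a)
{
    return !isnan(a) && !isinf(a);
}

/* Renamed copy of __OMC_DIV_SIM: same parameter list, same body. */
static inline modelica_real div_sim_copy(threadData_t *threadData,
    const modelica_real a, const modelica_real b, const char *msg,
    const int *equationIndexes, modelica_boolean noThrowDivZero,
    const modelica_real time_, const modelica_boolean initial_)
{
    modelica_real res;
    if (b != 0.0)
        res = a / b;
    else if (initial_ && a == 0.0)
        res = 0.0;
    else
        res = a / division_error_equation_time(threadData, a, b, msg,
                                               equationIndexes, time_, noThrowDivZero);
    if (!valid_number(res))
        throwStreamPrintWithEquationIndexes(threadData, equationIndexes,
            "division leads to inf or nan at time %g, (a=%g) / (b=%g), where divisor b is: %s",
            time_, a, b, msg);
    return res;
}

/* Returns the number of failed checks. */
int run_div_checks(void)
{
    threadData_t td = { 0 };
    const int equationIndexes[] = { 1, 0 };
    const char *msg = "PMECH1 - D * SLIP / 1.0 + SLIP";
    /* volatile keeps the compiler from folding the calls away at -O2 */
    volatile modelica_real a = 0.9, b = 1.0, zero = 0.0, two = 2.0;
    int failures = 0;

    /* Same argument values as the -O0 frame shown by gdb. */
    if (div_sim_copy(&td, a, b, msg, equationIndexes, 1, 0.0, 1) != a) failures++;
    /* 0/0 during initialization takes the res = 0.0 branch. */
    if (div_sim_copy(&td, zero, zero, msg, equationIndexes, 1, 0.0, 1) != 0.0) failures++;
    /* b == 0 otherwise goes through the error helper (stubbed to return 1.0). */
    if (div_sim_copy(&td, two, zero, msg, equationIndexes, 1, 0.0, 0) != two) failures++;
    return failures;
}
```

Calling run_div_checks() from a small main and comparing the result between the -O0 and -O2 builds would at least show whether the function misbehaves in isolation. If it returns 0 at both levels, that would hint (though not prove) that the problem lies in how the generated code reaches the call rather than in the function body itself.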