
I'm trying to port legacy 32-bit parser code generated by flex/bison. I must use Visual Studio 2019 and compile for an x64 target. A crash (read access violation) occurs while parsing the parameters in this code:

case 42:

{   registerTypedef( (yyvsp[(2) - (4)]), (yyvsp[(3) - (4)]) ); }
break;

Here is the called function's definition:

void registerTypedef(char* typeref, char* typeName)
{
    //SG_TRACE_INFO("registerTypedef %s %s", typeName, typeref);

    std::string typeNameStr = typeName;
    std::string typeRefStr = typeref;
    TheSGFactory::GetInstance().SG_Factory::RegisterTypeDef(typeNameStr, typeRefStr);
}

The corresponding grammar rule is the following:

declaration_typedef
: TYPEDEF TYPEDEF_NAME IDENTIFIER ';'   {   registerTypedef( $2, $3 ); }
| TYPEDEF basic_type IDENTIFIER ';' {   registerTypedef( $2, $3 ); }
;

It looks like yyvsp is accessed with a negative index: (2) - (4) = -2. This should be fine, since the same code works perfectly with a 32-bit compiler, and the C99 standard also seems to allow it.

I have tried the latest flex/bison versions available under Windows and Unix; the generated code is quite similar and the issue is the same.

Is there a magic Visual Studio option to make it accept a negative index? Is there a magic flex/bison parameter that would fix this issue?

Thanks a lot!

2 Answers


You're almost certainly looking in the wrong place.

yyvsp always points to the top of the parser stack, so negative indexes are perfectly normal, and entirely legal. The problem will be that the thing that's supposed to be a char* isn't a valid pointer, probably because the default semantic value type was never changed from int. On 32-bit architectures you can often get away with stashing pointers into ints, since they are likely the same size. But a 64-bit build will break, since half of the pointer gets truncated.
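Purely as an illustration (this is not from the original code), here is the round trip a pointer takes when it is stuffed into an int: the two printed pointers match whenever the address happens to fit in 32 bits, which it usually does on a 32-bit build and usually does not on x64.

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    /* on a typical 32-bit target both are 4 bytes; on x64 a pointer is 8 bytes */
    printf("sizeof(int) = %zu, sizeof(char*) = %zu\n", sizeof(int), sizeof(char *));

    char *p = strdup("TYPEDEF_NAME");
    int stashed = (int)(intptr_t)p;              /* upper 32 bits of the pointer are lost here */
    char *recovered = (char *)(intptr_t)stashed; /* no longer guaranteed to equal p on x64 */

    printf("original %p, after int round-trip %p\n", (void *)p, (void *)recovered);
    free(p);
    return 0;
}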

This error should be apparent if you compile with warnings enabled.

Note that nothing guarantees that YYSTYPE is the same in the lexical scanner and in the parser, since they are independent programs generated from different source files by different code generators. So it might be wrong in either or both. (Compiler warnings will help distinguish the cases.)

Your best bet is to ensure that YYSTYPE is correctly defined in the bison-generated header file to avoid type mismatch issues. The easiest way to do that is with the %define api.value.type bison declaration, but that's a relatively recent feature. The older style was to put #define YYSTYPE whatever in a bison %code requires block. And the even older style was to duplicate the YYSTYPE definition in both the .y and .l files. (Or to "fix" the problem by suppressing or ignoring compiler warnings, leaving the problem for some future maintenance programmer. :-) )
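For example, a minimal sketch of the two styles (pick one), assuming the semantic value is always a plain char * string; the header name parser.tab.h is just an assumption about what your build generates:

/* newer style, in the .y file */
%define api.value.type {char *}

/* or the older style, also in the .y file */
%code requires {
    #define YYSTYPE char *
}

/* in the .l file, include the bison-generated header so the scanner
   sees exactly the same YYSTYPE as the parser */
%{
#include "parser.tab.h"
%}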

rici
  • This is very interesting because I actually had such a type difference in the code! Bad beat: I tried to change/force YYSTYPE to int in both lex and bison, but still observe the same issue. I also tried char * in both. – Matthieu Moret Oct 05 '20 at 07:01
  • You're using it as a `char*`, so declaring it as `int` is UB. Making it `char*` in both is correct -- if you did it right. But only you can see what you did, so there's no point even mentioning it if you're not going to show it. – rici Oct 05 '20 at 07:09

I think there were two issues here:

  1. @rici was right about the YYSTYPE types being different: they MUST be the same. In my case, char*.

  2. The lexer callback code was using strdup(). By default, Visual Studio 2019 resolves this (implicitly declared) function as one returning int.

    yylval = strdup(yytext);

This was corrupting the stack content. I had to force #include <string.h> to get the POSIX version returning char *.

Note: I already needed to force-include <stdlib.h> so that other "C" functions point to their correct versions (alloca, ...).
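For reference, a minimal sketch of what the forced includes can look like in the scanner's prologue (the header name parser.tab.h and the identifier rule are only assumptions for illustration):

%{
/* force the real declarations so strdup() is not implicitly declared as
   returning int, which corrupts pointer values on x64 */
#include <stdlib.h>
#include <string.h>

#include "parser.tab.h"   /* bison-generated header, so YYSTYPE matches the parser */
%}

%%

[A-Za-z_][A-Za-z0-9_]*    { yylval = strdup(yytext); return IDENTIFIER; }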

Mystery solved! Thanks a lot to all contributors.