The basic problem here is that you #include the bison-generated header file (prologue.tab.h) and the flex-generated header file (prologue.h) in your grammar file, and more specifically in a %code requires section.
It's neither necessary nor advisable to include the bison-generated header file in the bison-generated C file. Bison doesn't require you to generate a header file, and consequently it needs to put everything which it might need in the generated source file. If you do also include the generated header file, you could end up with duplicated #defines and other related errors, because not all of Bison's generated #defines are guarded with #ifndefs. That's less of a concern than it was with earlier versions of Bison, but there are still some circumstances in which it causes errors.
It's particularly useless to put #include "parser.tab.h" in a %code requires or %code provides block, because both of those blocks are copied into the generated header file. That would make the header file #include itself. Since Bison 3, the header file has an include guard, so that doesn't cause infinite include recursion. But still.
But that's not creating the errors you report. Those come from the fact that you're effectively including the flex-generated header file in the flex source, and that's much more problematic. Again, that comes from putting the #include statement in a %code requires block, from which it is copied into the bison-generated header file, which in turn is #included in the flex source.
Like Bison, Flex does not require you to generate a header file at all, so it has to put all necessary declarations and macro definitions into the generated source file. In fact, the generated Flex header file is simply an excerpt of the generated code file; the Flex skeleton is decorated with %ok-for-header and %not-for-header markings, and the Flex code generator uses those annotations to decide whether to write generated code only into the .c file or into both the .c and the .h file.
So, again, unnecessary and unadvisable, and likely to lead at least to compiler warnings. But there is an additional problem, which is the one you're running into: Flex tries to clean up all of its macro definitions at the end of the generated code.
It does that because a lot of programmers are uncomfortable with multifile C projects, so they #include the generated Flex source directly into their application rather than trying to figure out how to integrate it into their build procedure. So at the end of the generated Flex source, there is a stream of #undef directives for every internal macro defined. That includes YY_DO_BEFORE_ACTION and YY_NEW_FILE, among many others. Some of these macros are also used in the header file. So the #undef directives are part of a skeleton section which is copied into both .c and .h files, at the end of each file.
So when you #include "prologue.h" in prologue.l, the consequence is that Flex's internal macros are undefined much too early, before the scanner code which uses them. That's the error you're seeing.
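The ordering problem can be imitated in a single C file. This is only a sketch: DEMO_NEW_FILE and demo_macro_is_defined are made-up names standing in for a Flex internal macro such as YY_NEW_FILE and for the scanner body, and the three stages are written by hand rather than generated.

```c
/* Stage 1: near the top of the generated scanner, the internal macros are
 * defined. DEMO_NEW_FILE is a hypothetical stand-in for YY_NEW_FILE. */
#define DEMO_NEW_FILE 1

/* Stage 2: the user prologue is copied in next. If it (indirectly) includes
 * the generated scanner header, the clean-up at the end of that header runs
 * at this point. */
#undef DEMO_NEW_FILE

/* Stage 3: the scanner body follows and expects the macro to still exist.
 * Returns 1 if the macro survived, 0 if it was undefined too early. */
int demo_macro_is_defined(void)
{
#ifdef DEMO_NEW_FILE
    return 1;
#else
    return 0;
#endif
}
```

Calling demo_macro_is_defined() returns 0 here. In a real scanner the code in stage 3 uses the macro rather than testing for it, so the result is a compile error instead of a return value.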
Of course, you don't explicitly #include "prologue.h" in prologue.l, and you might well not have intended to do so at all. But it's there because the line is in a %code requires block and you do #include "prologue.tab.h" in prologue.l (as you must, in order for the lexical scanner to see the definitions for tokens). And that's why the error suddenly showed up when you changed the %code block into a %code requires block.
But you shouldn't include the generated flex header in the generated bison code, even if you don't put it into a %code requires or %code provides block.
Since the scanner is a client of the parser -- in other words, yyparse calls yylex -- it does seem logical to include the scanner's header file in the parser's source file. But it won't work in general, because there's a dependency inversion between the parser and the scanner. Even though the parser calls the scanner, the scanner depends on symbols and type declarations defined by the parser. These include not only the enum constants which define token identifiers, but also the data types used to implement the parser's semantic and location values. The scanner cannot be compiled without these things: it needs to know the type of yylval (and yylloc if it is producing location information), and it needs to know the enumeration values to return for each token type.
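To make the direction of that dependency concrete, here is a hand-written sketch of a scanner. Everything in it is hypothetical: demo_token, demo_value and demo_lval stand in for the token enum, YYSTYPE and yylval which would normally come from the bison-generated header, and demo_lex stands in for yylex.

```c
#include <ctype.h>

/* --- Normally supplied by the bison-generated header (hand-written here) --- */
enum demo_token { DEMO_EOF = 0, DEMO_NUMBER = 258, DEMO_PLUS = 259 };
union demo_value { int number; };   /* stand-in for YYSTYPE */
union demo_value demo_lval;         /* stand-in for yylval */

/* --- Scanner side: cannot be compiled without the declarations above --- */
static const char *demo_input = "";

void demo_set_input(const char *text) { demo_input = text; }

int demo_lex(void)
{
    while (*demo_input == ' ')
        demo_input++;                       /* skip blanks */
    if (*demo_input == '\0')
        return DEMO_EOF;
    if (isdigit((unsigned char)*demo_input)) {
        int n = 0;
        while (isdigit((unsigned char)*demo_input))
            n = n * 10 + (*demo_input++ - '0');
        demo_lval.number = n;               /* needs the semantic value type */
        return DEMO_NUMBER;                 /* needs the token enumeration */
    }
    if (*demo_input == '+') {
        demo_input++;
        return DEMO_PLUS;
    }
    return (unsigned char)*demo_input++;    /* any other character as itself */
}
```

After demo_set_input("12 + 7"), successive calls to demo_lex() return DEMO_NUMBER (with demo_lval.number set to 12), DEMO_PLUS, DEMO_NUMBER (7) and then DEMO_EOF. Nothing in the scanner needs anything from a scanner header, but every return statement needs something from the parser's side.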
So if the parser also needed to know about the scanner, that would create a circular dependency. The usual way to solve circular dependencies is to abstract the circularly-required definitions into a common, self-contained header file, which itself contains enough forward declarations to resolve any circular internal dependencies. But since Flex and Bison are independent software products which do not depend on each other (you could use Flex with a different parser generator, or Bison with a different scanner generator), thinking that they will cooperatively create a shared header file is probably an unreasonable expectation. In any case, they don't, so any coordination has to be done by the programmer.
In the simplest (and traditional) scenario, in which all shared state is in global variables, the only thing the parser needs to know about is the prototype for yylex, and that prototype is dead simple: yylex takes no arguments and produces an int. In the version of C prevalent at the time that lex and yacc were originally designed, that meant that yylex didn't have to be declared at all, since undeclared functions were assumed to take whatever arguments they were given, if any, and return an int. Of course, that hasn't been the case for some 30 years, and these days it is necessary to add a declaration for yylex (int yylex();) to your .y files. In many cases, that's the only useful thing you would find in the scanner's header file, making the header file unnecessary.
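Seen from the parser's side, that looks roughly like the following sketch. The names are invented for the illustration: toy_yylex plays the role of yylex, and toy_count_tokens is a trivial stand-in for a parser, not anything Bison generates. The declaration is written with (void), which is a full prototype in every C standard; before C23, empty parentheses leave the parameters unspecified.

```c
/* The only thing the "parser" needs from the scanner: one declaration. */
int toy_yylex(void);

/* Hypothetical stand-in for a parser: pulls tokens until end of input (0)
 * and reports how many it saw. */
int toy_count_tokens(void)
{
    int count = 0;
    while (toy_yylex() != 0)
        count++;
    return count;
}

/* The scanner itself would normally live in a separate translation unit.
 * This stub hands out three made-up token codes and then 0. */
static int toy_remaining = 3;

int toy_yylex(void)
{
    return toy_remaining > 0 ? 256 + toy_remaining-- : 0;
}
```

With this stub, toy_count_tokens() returns 3. The caller compiles knowing nothing about the scanner except the one-line declaration.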
A lot of other things have changed in those 30 years, one of which is that it is now much more common to expect libraries to use some mechanism other than global variables to maintain internal state. Both Bison and Flex can produce reentrant modules which are (mostly) free of globals. That's all to the good, but it brings into focus the problem of the circular dependency between the parser and the scanner. If you're planning to pursue that route, you might want to take a look at this annotated example, which goes into more detail. Alternatively, you could try using the C++ interfaces, or (my personal preference) avoid the circular dependency by inverting the call relationship, using a push parser.
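To show what inverting the call relationship means, here is a hand-written sketch in which the scanning loop drives the parser, instead of the parser calling the scanner. All of the names (push_state, push_token, PUSH_MORE, PUSH_DONE, push_sum_digits) are hypothetical. Bison's own push interface is enabled with %define api.push-pull push and is entered through yypush_parse; check the manual for its exact signature, since this sketch does not reproduce it.

```c
/* Hypothetical stand-in for a push parser's state object. */
struct push_state { int sum; int tokens_seen; };

enum { PUSH_DONE = 0, PUSH_MORE = 1 };  /* hypothetical status codes */

/* The parser no longer calls the scanner. It is handed one token per call
 * and reports whether it wants more. Token 0 means end of input. */
int push_token(struct push_state *ps, int token, int value)
{
    if (token == 0)
        return PUSH_DONE;
    ps->tokens_seen++;
    ps->sum += value;
    return PUSH_MORE;
}

/* Scanner side: owns the loop and feeds each digit to the parser. The only
 * dependency runs from scanner to parser, so there is no cycle. */
int push_sum_digits(const char *text)
{
    struct push_state ps = {0, 0};
    for (; *text != '\0'; text++) {
        if (*text >= '0' && *text <= '9')
            push_token(&ps, 258, *text - '0');  /* 258: made-up token code */
    }
    push_token(&ps, 0, 0);                      /* signal end of input */
    return ps.sum;
}
```

For example, push_sum_digits("1 2 3") returns 6. The parser needs no declaration of the scanner at all, which is what removes the circular dependency.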