As an exercise, let's make a minimal reproducible example out of this, like the SO help always suggests. It's really not so difficult. Since the problem is encountered processing the snippet with bison, it is not necessary that the MRE actually compile or run, in this instance.
Here's the file (conan.y):
%token ID
%token RetType Formals Statements
%%
FuncDecl: RetType ID '(' Formals { funcdecl($1, $2, $4); } ')'
          '{' Statements '}' { funcdef($1, $2, $4, $8); }
        | RetType ID '(' Formals ')' ';' { funcdecl($1, $2, $4); }
I've removed everything extraneous to the problem, but I still have a file I can process with bison:
The non-terminals which are not relevant have been converted to terminals (line 2). (If they had been relevant, converting them to terminals would have made the problem go away. Since it doesn't, we know they are irrelevant.)
Liberal use is made of single-character tokens to make the grammar more readable and to avoid having to declare them as tokens. (I would have turned multi-character tokens like T_FOR into quoted strings ("for") for the same reason.)
So that leaves me with a readable six-line snippet, which can now be processed with bison (I need to add the bison invocation as well as the resulting errors, so as to make this reproducible and complete). The error messages now have the expected line numbers:
$ bison -o conan.c conan.y
conan.y: warning: 1 shift/reduce conflict [-Wconflicts-sr]
conan.y:4.34-58: warning: rule useless in parser due to conflicts [-Wother]
FuncDecl: RetType ID '(' Formals { funcdecl($1, $2, $4); } ')'
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^
Now, to solve the problem. There is indeed a shift/reduce conflict, because after the parser gets to:
RetType ID ( Formals )
                     ^
                     |------- lookahead
it has to decide whether to perform the Mid-Rule Action (funcdecl($1, $2, $4);). But it doesn't yet know which of the two alternative productions will apply. The first one requires the MRA to be executed; the second one does not. The parser won't know which one applies until it sees the token following the close parenthesis, and by then it will be too late: an LALR(1) parser has only one token of lookahead, and here that token is the close parenthesis itself.
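It may help to see roughly what bison does with an MRA: it invents an empty non-terminal (named something like `$@1` in the report produced by `bison -v`) and runs the action when that non-terminal is reduced. Here is a hand-written approximation of the grammar above; the name `mra` is my own invention and the action bodies are left out:

```
%token ID
%token RetType Formals Statements
%%
FuncDecl: RetType ID '(' Formals mra ')' '{' Statements '}'
        | RetType ID '(' Formals ')' ';'
        ;
/* Hypothetical marker standing in for the non-terminal bison generates.
   Reducing it is what "runs the MRA". */
mra     : /* empty */  { /* funcdecl(...) would run here */ }
        ;
```

With `)` as the lookahead, the parser must either reduce `mra` (first alternative) or shift `)` (second alternative). That choice is the shift/reduce conflict.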
As presented in the snippet, the MRA in the first alternative is exactly the same as the final action in the second alternative. If that's really the case, then the parser doesn't actually have to decide. It could simply run the final action for the second alternative a little earlier. But that simple solution is not as simple as it looks, because bison does not make any attempt to see whether two MRAs are the same code. It just assumes that they are all different, so it would still have to decide between them.
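To make that concrete: if the MRA is copied into both alternatives, bison generates two distinct empty non-terminals, both of which could be reduced when the lookahead is `)`, and I would expect that to be reported as a reduce/reduce conflict. One way to tell bison that the two actions really are the same is to write the marker non-terminal by hand and use it in both alternatives. This is only a sketch: the name `declare` is made up, and the action reaches back into the parser stack with `$0`, `$-2` and `$-3`, which assumes the marker is only ever used immediately after `RetType ID '(' Formals`.

```
%token ID
%token RetType Formals Statements
%%
FuncDecl: RetType ID '(' Formals declare ')' '{' Statements '}'
                { funcdef($1, $2, $4, $8); }
        | RetType ID '(' Formals declare ')' ';'
        ;
/* Hypothetical shared marker. When it is reduced, the stack holds
   RetType ID '(' Formals, so $0 is Formals, $-1 is '(',
   $-2 is ID and $-3 is RetType. */
declare : /* empty */  { funcdecl($-3, $-2, $0); }
        ;
```

Since both alternatives now contain the same non-terminal at the same point, the parser can reduce it without knowing which alternative it is in. The price is the fragile stack indexing in the action.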
On the other hand, there is a much simpler solution, because it really cannot make any difference whether the MRA is invoked before or after the close parenthesis. Nothing can happen while reading a token (aside from advancing the line counter, and if that's an issue you should be using location objects). Moving the MRA will result in:
%token RetType Formals Statements
%token ID
%%
FuncDecl: RetType ID '(' Formals ')' { funcdecl($1, $2, $4); }
          '{' Statements '}' { funcdef($1, $2, $4, $8); }
        | RetType ID '(' Formals ')' ';' { funcdecl($1, $2, $4); }
Now the conflict is gone:
$ bison -o conan.c conan.y
$
That's because the MRA decision is made at the point where the parse has reached
RetType ID ( Formals ) {
                       ^
                       |----- lookahead
and now the lookahead is sufficient to decide. (In the other alternative, the lookahead is ;.)