I'm working extensively with Bison grammars for the first time. I have my grammar set up, and a little test suite to correlate results.

Occasionally, the test suite passes:

Reducing stack by rule 101 (line 613):
   $1 = nterm mathenv ()
-> $$ = nterm closedTerm ()
Stack now 0 5 3
Entering state 120
Reading a token: Next token is token ENDMATH ()
Reducing stack by rule 28 (line 517):
   $1 = nterm closedTerm ()
-> $$ = nterm compoundTerm ()
Stack now 0 5 3
Entering state 119
Reducing stack by rule 12 (line 333):
   $1 = nterm compoundTerm ()
-> $$ = nterm compoundTermList ()
Stack now 0 5 3
Entering state 198
Next token is token ENDMATH ()
Shifting token ENDMATH ()
Entering state 325

... continues to completion ...

Occasionally, it does not:

Reducing stack by rule 101 (line 613):
   $1 = nterm mathenv ()
-> $$ = nterm closedTerm ()
Stack now 0 5 3
Entering state 120
Reading a token: Next token is token MN ()
Reducing stack by rule 28 (line 517):
   $1 = nterm closedTerm ()
-> $$ = nterm compoundTerm ()
Stack now 0 5 3
Entering state 119
Reducing stack by rule 12 (line 333):
   $1 = nterm compoundTerm ()
-> $$ = nterm compoundTermList ()
Stack now 0 5 3
Entering state 198
Next token is token MN ()
Shifting token MN ()
Entering state 11

... errors eventually ...

Now at end of input.
Line: 9 Error: syntax error at token 

ENDMATH is the correct token to shift, but sometimes MN is read instead. The result changes from run to run of the same test. Is such a "random" ambiguity normal? What could be causing it? Should I define some %precedence rules?

At the top of y.output, I do see several conflicts for states, like

State 0 conflicts: 3 shift/reduce
State 120 conflicts: 2 shift/reduce
State 127 conflicts: 2 shift/reduce
State 129 conflicts: 2 shift/reduce
State 154 conflicts: 1 shift/reduce
State 207 conflicts: 3 shift/reduce
State 265 conflicts: 109 shift/reduce
State 266 conflicts: 109 shift/reduce
State 267 conflicts: 109 shift/reduce
State 268 conflicts: 109 shift/reduce
State 269 conflicts: 109 shift/reduce
State 342 conflicts: 2 shift/reduce
State 390 conflicts: 109 shift/reduce
State 391 conflicts: 109 shift/reduce
State 396 conflicts: 1 shift/reduce
State 397 conflicts: 1 shift/reduce

Is it advisable to eliminate all of these conflicts? Note that state 120 is listed as having a conflict, and is the state right before this random error occurs.

  • The lexer determines which tokens are recognized -- the parser just uses those tokens. If you're getting inconsistent tokens from your lexer, that is a problem with the lexer and the parser is irrelevant. – Chris Dodd Sep 21 '14 at 22:03

1 Answer

Conflicts in your grammar mean that the grammar is not LALR(1). That may be because the grammar is ambiguous, or because it requires more than one token of lookahead. Whenever there is a conflict, bison resolves it by choosing one of the possible actions: it follows your precedence directives where they apply, and otherwise defaults to shift for a shift/reduce conflict. The result is a parser which recognizes (parses) some subset of the language described by the grammar.

If the conflicts are purely due to ambiguity, this may result in just eliminating ambiguous parses and not actually reducing the language at all. For such cases, using precedence rules to resolve the ambiguity is the right way to deal with the problem, since it gives you a grammar that parses the language you want.
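As a rough sketch of how that resolution works, the C function below models the decision bison makes for a single shift/reduce conflict once precedence has been declared. The type and function names are hypothetical and exist only for illustration; bison's generated tables do not look like this.

```c
#include <assert.h>

/* Hypothetical types for illustration only. */
typedef enum { ASSOC_PREC_ONLY, ASSOC_LEFT, ASSOC_RIGHT, ASSOC_NONASSOC } Assoc;
typedef enum { ACT_SHIFT, ACT_REDUCE, ACT_ERROR } Action;

/*
 * Model of shift/reduce resolution by precedence.
 * rule_prec:  precedence level of the rule that could be reduced (0 = none declared)
 * token_prec: precedence level of the lookahead token (0 = none declared)
 * token_assoc: associativity declared for the lookahead token
 */
Action resolve_shift_reduce(int rule_prec, int token_prec, Assoc token_assoc)
{
    assert(rule_prec >= 0 && token_prec >= 0);

    /* No precedence on one side: the conflict is reported and shift wins. */
    if (rule_prec == 0 || token_prec == 0)
        return ACT_SHIFT;

    /* Different levels: the higher precedence wins. */
    if (rule_prec > token_prec)
        return ACT_REDUCE;
    if (rule_prec < token_prec)
        return ACT_SHIFT;

    /* Same level: associativity decides. */
    switch (token_assoc) {
    case ASSOC_LEFT:     return ACT_REDUCE;  /* %left */
    case ASSOC_RIGHT:    return ACT_SHIFT;   /* %right */
    case ASSOC_NONASSOC: return ACT_ERROR;   /* %nonassoc: syntax error */
    default:             return ACT_SHIFT;   /* %precedence: still a conflict, shift by default */
    }
}
```

For example, with a left-associative '-', the input 1 - 2 - 3 reaches the equal-precedence case and reduces, which groups it as (1 - 2) - 3.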

If the conflicts are due to needing more lookahead, precedence rules are generally no help. You need to resolve the problem by rearranging the grammar to not require the lookahead or by using other techniques (hacks) such as having the lexer insert extra synthetic tokens based on further lookahead in the input or other information.
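One way the lexer-side technique might look is sketched below: a toy hand-written lexer that peeks past an identifier and returns a different token when an opening parenthesis follows, so the parser never needs the second token of lookahead. The token names, the Lexer struct and next_token are all assumptions made up for this example, not anything from the grammar in the question.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes; a real parser would take these from bison's header. */
enum { TOK_EOF = 0, TOK_IDENT = 258, TOK_CALL_NAME, TOK_OTHER };

typedef struct {
    const char *src;  /* NUL-terminated input */
    size_t pos;       /* index of the next unread character */
} Lexer;

/* Return the next token, peeking ahead to classify identifiers. */
int next_token(Lexer *lx)
{
    assert(lx != NULL && lx->src != NULL);

    /* Skip whitespace before the token. */
    while (isspace((unsigned char)lx->src[lx->pos]))
        lx->pos++;

    unsigned char c = (unsigned char)lx->src[lx->pos];
    if (c == '\0')
        return TOK_EOF;

    if (isalpha(c)) {
        /* Consume the identifier itself. */
        while (isalnum((unsigned char)lx->src[lx->pos]))
            lx->pos++;

        /* Look ahead without consuming: is the next non-space char '('? */
        size_t peek = lx->pos;
        while (isspace((unsigned char)lx->src[peek]))
            peek++;
        return lx->src[peek] == '(' ? TOK_CALL_NAME : TOK_IDENT;
    }

    /* Anything else is a single-character token in this toy lexer. */
    lx->pos++;
    return TOK_OTHER;
}
```

The grammar can then use separate rules for TOK_CALL_NAME and TOK_IDENT, at the cost of moving a small piece of syntactic knowledge into the lexer.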

In your case the immediate problem seems to be in the lexer -- in one case it returns the token ENDMATH and in another it returns MN. There may also be ambiguity or lookahead issues in the grammar connected to the conflicts you see in y.output, but such problems appear at first glance to be completely independent of the problem with the lexer.
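To check the lexer on its own, one option is to run it twice over the same input with no parser involved and compare the token sequences. The sketch below assumes a hypothetical lexer signature (lex_fn) rather than the real yylex interface, and includes two toy lexers: one stateless, and one that simulates a bug by leaking hidden state between runs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TOKENS 256

/* Hypothetical lexer signature: returns a token code, 0 at end of input. */
typedef int (*lex_fn)(const char *input, size_t *pos);

/* Lex the whole input and record the token codes; returns how many were stored. */
static size_t collect_tokens(lex_fn lex, const char *input, int *out, size_t cap)
{
    size_t pos = 0, n = 0;
    while (n < cap) {
        int tok = lex(input, &pos);
        out[n++] = tok;
        if (tok == 0)
            break;
    }
    return n;
}

/* Returns -1 if two runs produce the same tokens, else the index of the first difference. */
long first_divergence(lex_fn lex, const char *input)
{
    assert(lex != NULL && input != NULL);

    int a[MAX_TOKENS], b[MAX_TOKENS];
    size_t na = collect_tokens(lex, input, a, MAX_TOKENS);
    size_t nb = collect_tokens(lex, input, b, MAX_TOKENS);
    size_t n = na < nb ? na : nb;

    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return (long)i;
    return na == nb ? -1 : (long)n;
}

/* Toy lexer with no hidden state: one token (code 1) per character. */
int stable_lex(const char *input, size_t *pos)
{
    if (input[*pos] == '\0')
        return 0;
    (*pos)++;
    return 1;
}

/* Toy lexer with a simulated bug: a static counter leaks across runs. */
int leaky_lex(const char *input, size_t *pos)
{
    static int calls = 0;
    if (input[*pos] == '\0')
        return 0;
    (*pos)++;
    return (++calls % 3 == 0) ? 2 : 1;
}
```

If the two runs disagree, the parser and its conflicts can be ruled out as the cause. A bug that depends on memory layout may not show up within a single process, so comparing token logs from separate runs of the program is a reasonable fallback.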

Chris Dodd
  • After many irritating hours, I tracked down the problem to a flappy `memmove` call earlier in the code. Your answer didn't resolve this directly (without looking at the whole file, I'm not sure anyone could), but it *did* help me understand Bison parsing a bit better, and for that I am grateful. –  Sep 23 '14 at 02:17