
Here's a simplification of my working EBNF grammar:

%token NEWLINE BLOCK_MARK A
%start file

file: block+ NEWLINE*;
block: BLOCK_MARK line;
line: A+;

The lexer emits a NEWLINE token for both \n and EOF (so that a trailing NEWLINE isn't required before EOF). The grammar works with a stream like this:

BLOCK_MARK A A BLOCK_MARK A NEWLINE[actually EOF]

Now I want to allow several lines in a block, with the first one mandatory and any further lines separated by NEWLINE. E.g.:

BLOCK_MARK A A NEWLINE A A BLOCK_MARK A A A EOF

I tried doing this:

file: block+ NEWLINE*;
block: BLOCK_MARK line moreline*;
line: A+;
moreline: NEWLINE line;

But Jison complains about a shift/reduce (S/R) conflict when the lookahead is NEWLINE. I guess the state machine can't decide whether the NEWLINE starts another line in the current block or belongs to the final NEWLINE* in file (which is needed because the file can end with NEWLINE/EOF).

How can I fix this?

kaoD
  • One mechanism might be to stop treating newlines as EOF; they're not the same thing at all. I'm more than a little puzzled about how that is working. – Jonathan Leffler Dec 21 '13 at 21:22
  • @JonathanLeffler it's the other way around, I'm treating EOF as NEWLINE. I think I'm not the only one doing that to make an ending NEWLINE not required (since EOF is parsed as NEWLINE too). Changing `file` to `file: block+ NEWLINE* EOF` does not fix the issue and makes it worse, requiring both `NEWLINE` (for `line`) and `EOF` (for `file`) as the last parsed tokens. – kaoD Dec 21 '13 at 21:31

2 Answers


What you want is to make the newlines PART of the preceding line, deferring the reduction of anything other than a line until after you see the newline. So you end up with:

file: block+ ;
block: BLOCK_MARK line_nl+ line_nonl? | BLOCK_MARK line_nonl ;
line_nl: line NEWLINE ;
line_nonl: line ;
line: A+ ;

Now the only problem with the above is that it doesn't allow for any blank lines (a blank line will be a syntax error). But that's the same as your original grammar.
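
A quick way to sanity-check which token streams this grammar accepts, without running Jison: the grammar uses only repetition and alternation, so its language is regular and can be transcribed into a regular expression. The sketch below is in Python and assumes a one-letter encoding of the tokens (B, A, N) and a helper named accepts, both made up for illustration. It only checks the accepted language; it says nothing about whether Jison builds conflict-free tables.

```python
import re

# Hypothetical one-letter encoding of the tokens (an assumption for this sketch):
# B = BLOCK_MARK, A = A, N = NEWLINE (which the lexer also emits for EOF).
ENCODE = {"BLOCK_MARK": "B", "A": "A", "NEWLINE": "N"}

# Direct transcription of the grammar above:
#   line  = A+
#   block = BLOCK_MARK line_nl+ line_nonl? | BLOCK_MARK line_nonl
#   file  = block+
LINE = r"A+"
BLOCK = rf"B(?:(?:{LINE}N)+(?:{LINE})?|{LINE})"
FILE = re.compile(rf"(?:{BLOCK})+")


def accepts(tokens: str) -> bool:
    """Return True if the space-separated token stream is in the grammar's language."""
    encoded = "".join(ENCODE[t] for t in tokens.split())
    return FILE.fullmatch(encoded) is not None
```

With EOF lexed as NEWLINE, accepts("BLOCK_MARK A A NEWLINE A A BLOCK_MARK A A A NEWLINE") should hold, while a stream with a blank line such as "BLOCK_MARK A NEWLINE NEWLINE A" is rejected.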

Chris Dodd
  • I think that makes sense, thanks! I got the blank line covered in the scanner (\n+ spits NEWLINE). – kaoD Dec 22 '13 at 04:06
  • Woops, didn't solve the whole issue since this now fails for the original use case (I think I ran into this already). `BLOCK_MARK A BLOCK_MARK A EOF` fails, but `NEWLINE` should only be required if there are at least two lines present! Compare with `BLOCK_MARK A NEWLINE A BLOCK_MARK A EOF`, which works fine (because there's at least a `NEWLINE` or EOF in `line`). – kaoD Dec 22 '13 at 17:25
  • Adding an additional rule: `block: BLOCK_MARK line_nonl ;` should fix that. – Chris Dodd Dec 22 '13 at 17:47
  • My grammar is slightly more complex and I had to add an extra production covering both cases, but that solved the issue. I came up with another solution based on yours that seems to be fine too (the other answer), though I'll accept (and probably use) yours. – kaoD Dec 22 '13 at 18:01

This is based on Chris Dodd's idea, but reversed so that the NEWLINE leads each extra line, as in my first try. The basic idea is just to remove the NEWLINE* from file, since the block rules already cover the trailing NEWLINE.

file: block+;
block: BLOCK_MARK line_nonl line_nl* NEWLINE?;
line_nl: NEWLINE line_nonl;
line_nonl: A+;

I think this one solves all cases.
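
To back up that hunch, here is a rough brute-force comparison against the grammar in Chris Dodd's answer. Both grammars only use repetition and alternation, so each can be transcribed into a regular expression. The sketch is in Python and assumes a one-letter token encoding (B = BLOCK_MARK, A = A, N = NEWLINE); the names THIS_ANSWER, OTHER_ANSWER and disagreements are made up for illustration. It compares the accepted languages only and does not prove that Jison generates conflict-free tables.

```python
import re
from itertools import product

# Hypothetical one-letter encoding: B = BLOCK_MARK, A = A, N = NEWLINE.
# Grammar from this answer:
#   block = BLOCK_MARK line_nonl line_nl* NEWLINE?   with line_nl = NEWLINE line_nonl
THIS_ANSWER = re.compile(r"(?:BA+(?:NA+)*N?)+")
# Grammar from Chris Dodd's answer:
#   block = BLOCK_MARK line_nl+ line_nonl? | BLOCK_MARK line_nonl   with line_nl = line NEWLINE
OTHER_ANSWER = re.compile(r"(?:B(?:(?:A+N)+(?:A+)?|A+))+")


def disagreements(max_len: int) -> list:
    """List every encoded token stream up to max_len on which the two grammars differ."""
    found = []
    for n in range(max_len + 1):
        for chars in product("BAN", repeat=n):
            stream = "".join(chars)
            in_this = THIS_ANSWER.fullmatch(stream) is not None
            in_other = OTHER_ANSWER.fullmatch(stream) is not None
            if in_this != in_other:
                found.append(stream)
    return found
```

Calling disagreements(8) enumerates every stream of up to eight tokens; an empty result means the two grammars agree on all of them, which is what I'd expect since both reduce to "BLOCK_MARK, one or more lines separated by NEWLINE, optional trailing NEWLINE".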

kaoD