As an exercise, I am trying to parse an EBNF/ABNF grammar with Megaparsec. I got trivial things like terminals and optionals working, but I'm struggling with alternatives. With this grammar:

S ::= 'hello' ['world'] IDENTIFIER LITERAL | 'test';

And this code:

production :: Parser Production
production = sepBy1 alternativeTerm (char '|') >>= return . Production

alternativeTerm :: Parser AlternativeTerm
alternativeTerm = sepBy1 term space >>= return . AlternativeTerm

term :: Parser Term
term = terminal
    <|> optional
    <|> identifier
    <|> literal

I get this error:

unexpected '|'
expecting "IDENTIFIER", "LITERAL", ''', '[', or white space

I guess the alternativeTerm parser does not return to the production parser when it encounters input it cannot parse, and throws an error instead.

What can I do about this? Should I change my ADT for EBNF, or should I somehow flatten the parsing? And if so, how would I do that?

Jan van Brügge

1 Answer

It's probably best to expand my previous comment into a full answer.

Your grammar is basically a list of lists of terms: the terms are separated (and ended) by whitespace, and the lists in turn are separated by |. Your solution with sepBy1 does not work because there is trailing whitespace after LITERAL. Once sepBy1 has consumed that whitespace as a separator, it assumes another term follows and applies term to the |, which fails. Because the separator has already consumed input, that failure is not recovered from, and the whole alternativeTerm parser fails.

If your alternativeTerm may end with one or more whitespace characters, use sepEndBy1, which also accepts a trailing separator, and rewrite alternativeTerm as follows:

alternativeTerm = (term `sepEndBy1` space) >>= return . AlternativeTerm
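
To see the fix in context, here is a minimal, self-contained sketch. The question does not show its ADT or its terminal, optional, identifier and literal parsers, so the types and the term parser below are hypothetical stand-ins: IDENTIFIER and LITERAL are assumed to be plain keywords, terminals are assumed to be single-quoted strings, and only the right-hand side of the rule is parsed. The sketch also assumes a recent Megaparsec with the Parsec Void String parser type. Note that the separator in production skips whitespace after the |, since the example grammar has a space there.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char

type Parser = Parsec Void String

-- Hypothetical ADT; the question does not show its own definitions.
data Term
  = Terminal String
  | Optional [Term]
  | Identifier
  | Literal
  deriving (Eq, Show)

newtype AlternativeTerm = AlternativeTerm [Term] deriving (Eq, Show)
newtype Production = Production [AlternativeTerm] deriving (Eq, Show)

-- Assumed shape of a single term: a quoted terminal, a bracketed
-- optional group, or one of the two keywords.
term :: Parser Term
term = choice
  [ Terminal <$> quoted
  , Optional <$> between (char '[' *> space) (char ']') (term `sepEndBy1` space)
  , Identifier <$ string "IDENTIFIER"
  , Literal <$ string "LITERAL"
  ]
  where
    quoted = char '\'' *> many (noneOf "'") <* char '\''

-- sepEndBy1 tolerates the whitespace that trails the last term, so the
-- parser stops cleanly in front of '|' instead of failing there.
alternativeTerm :: Parser AlternativeTerm
alternativeTerm = AlternativeTerm <$> (term `sepEndBy1` space)

-- Alternatives are separated by '|'; whitespace after the bar is skipped
-- so that the next alternative starts directly at a term.
production :: Parser Production
production = Production <$> (alternativeTerm `sepBy1` (char '|' *> space))
```

With these definitions, running parse (production <* char ';' <* eof) "" on the right-hand side of the example rule, 'hello' ['world'] IDENTIFIER LITERAL | 'test';, should succeed with two alternatives, the first holding four terms and the second holding the single terminal test.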
Phil Kiener