
I'm trying to write an Augmented Backus-Naur Form (ABNF) parser with FParsec. However, I run into a stack overflow exception whenever I attempt to parse alternatives. The example below triggers the issue:

#r @"..\packages\FParsec\lib\net40-client\FParsecCS.dll"
#r @"..\packages\FParsec\lib\net40-client\FParsec.dll"
open FParsec

type Parser<'t> = Parser<'t, unit>

type Element =
    | Alternates of Element list
    | ParsedString of string

let (pRuleElement, pRuleElementRef) : (Parser<Element> * Parser<Element> ref) = createParserForwardedToRef()

let pString =
    pchar '"' >>. manyCharsTill (noneOf ['"']) (pchar '"')
    |>> ParsedString

let pAlternates : Parser<_> =
    sepBy1 pRuleElement (many (pchar ' ') >>. (pchar '/') >>. many (pchar ' ') )
    |>> Alternates

do pRuleElementRef :=
    choice
        [
            pString
            pAlternates
        ]

"\"0\" / \"1\" / \"2\" / \"3\" / \"4\" / \"5\" / \"6\" / \"7\""
|> run (pRuleElement .>> (skipNewline <|> eof))

Reordering the choice so that pAlternates is tried first looks like the obvious fix:

do pRuleElementRef :=
    choice
        [
            pAlternates
            pString
        ]

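However, that causes a stack overflow: pAlternates starts by calling pRuleElement, which tries pAlternates again, and so on, so it keeps attempting to parse a new sequence of alternatives without ever consuming input. In addition, that ordering would break ABNF operator precedence (highest to lowest):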

  1. Strings, names formation
  2. Comment
  3. Value range
  4. Repetition
  5. Grouping, optional
  6. Concatenation
  7. Alternative

My question essentially boils down to this: how can I write a parser for an element that may be either a single element or a sequence of alternative elements? Please let me know if you require any clarification or additional examples.

Your help is much appreciated, thank you!

EDIT:

I should probably mention that there are various other kinds of groupings as well: a sequence group, (element[s]), and an optional group, [optional element[s]]. An element can itself be a nested group, an optional group, a string, or another element type. Below is an example with sequence group parsing (optional group parsing is left out for simplicity):

#r @"..\packages\FParsec\lib\net40-client\FParsecCS.dll"
#r @"..\packages\FParsec\lib\net40-client\FParsec.dll"
open FParsec

type Parser<'t> = Parser<'t, unit>

type Element =
    | Alternates of Element list
    | SequenceGroup of Element list
    | ParsedString of string

let (pRuleElement, pRuleElementRef) : (Parser<Element> * Parser<Element> ref) = createParserForwardedToRef()

let pString =
    pchar '"' >>. manyCharsTill (noneOf ['"']) (pchar '"')
    |>> ParsedString

let pAlternates : Parser<_> =
    pipe2
        (pRuleElement .>> (many (pchar ' ') >>. (pchar '/') >>. many (pchar ' ')))
        (sepBy1 pRuleElement (many (pchar ' ') >>. (pchar '/') >>. many (pchar ' ') ))
        (fun first rest -> first :: rest)
    |>> Alternates

let pSequenceGroup : Parser<_> =
    between (pchar '(') (pchar ')') (sepBy1 pRuleElement (pchar ' '))
    |>> SequenceGroup

do pRuleElementRef :=
    choice
        [
            pAlternates
            pSequenceGroup
            pString
        ]

"\"0\" / ((\"1\" \"2\") / \"2\") / \"3\" / (\"4\" / \"5\") / \"6\" / \"7\""
|> run (pRuleElement .>> (skipNewline <|> eof))

If I attempt to parse alternates / sequence groups first, it terminates with a stack overflow exception because it then tries to parse alternates repeatedly.

Chris Altig

2 Answers


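The issue is that when you run the pRuleElement parser on the input, pString correctly parses the first string and choice commits to that result, leaving the rest of the input unconsumed. The failure then happens later, in (skipNewline <|> eof), which is outside of the choice, so nothing backtracks to try pAlternates.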

You can run the pAlternates parser on the main input, which actually works:

"\"0\" / \"1\" / \"2\" / \"3\" / \"4\" / \"5\" / \"6\" / \"7\""
|> run (pAlternates .>> (skipNewline <|> eof))

I suspect that you can probably just do this - the pAlternates parser works correctly, even on just a single string - it will just return Alternates containing a singleton list.
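To make the idea concrete, here is a minimal sketch of a layered layout, not a definitive implementation. It assumes an F# 5+ script that can load FParsec via `#r "nuget: FParsec"` (the question references local DLLs instead). The names `pAlternation`, `pGroup`, `pTerm` and `pSlash` are made up for illustration, and collapsing a singleton list back into the bare element is my own addition rather than something the question asks for.

```fsharp
#r "nuget: FParsec" // assumption: F# 5+ script; adjust to your local DLL references if needed
open FParsec

type Parser<'t> = Parser<'t, unit>

type Element =
    | Alternates of Element list
    | SequenceGroup of Element list
    | ParsedString of string

// Top-level (lowest-precedence) parser; forward-declared because groups nest it.
let pAlternation, pAlternationRef = createParserForwardedToRef<Element, unit>()

let pString : Parser<Element> =
    pchar '"' >>. manyCharsTill (noneOf ['"']) (pchar '"')
    |>> ParsedString

let pGroup : Parser<Element> =
    between (pchar '(') (pchar ')') (sepBy1 pAlternation (pchar ' '))
    |>> SequenceGroup

// Hypothetical helper: anything that can be recognised by its first character.
let pTerm : Parser<Element> = pGroup <|> pString

// Hypothetical helper: the " / " separator. `attempt` restores the position
// if spaces were skipped but no '/' follows (e.g. inside a sequence group).
let pSlash : Parser<unit> =
    attempt (skipMany (pchar ' ') >>. pchar '/') >>. skipMany (pchar ' ')

// Alternation iterates over terms only, so it never calls itself
// before consuming input. A single term is returned unwrapped.
pAlternationRef.Value <-
    sepBy1 pTerm pSlash
    |>> function
        | [single] -> single
        | several -> Alternates several

let result = run (pAlternation .>> eof) "\"0\" / (\"1\" \"2\") / \"3\""
printfn "%A" result
```

Because `pTerm` never refers to `pAlternation` without first consuming a `(`, there is no left recursion, and the order inside `pTerm` does not matter since the two cases start with different characters.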

Tomas Petricek
  • Hello Tomas, thank you for your response! I've edited my question accordingly. – Chris Altig Jun 11 '19 at 00:17
  • @ChrisAltig I think distinguishing between `pString` and `pSequenceGroup` will work just fine, because these two differ in their first character - i.e. the parser will fail if it is not one of these. However, the `pAlternates` is still the one that needs to run at the top level, so that it can consume one thing after another - so I think you will need two parsers - one for anything that can be determined by the first character and another for iterating over that via alternates. – Tomas Petricek Jun 11 '19 at 00:49
  • You read my mind. I had actually just finished programming out a similar solution and was prepping my answer when you commented. I have also come to the conclusion that the ordering doesn't (can't?) match the precedence order listed on the Wikipedia link. Though, I don't think that should matter. – Chris Altig Jun 11 '19 at 00:51

It looks like the solution was simply to not attempt to parse alternatives as the first step of parsing alternatives, which avoids the infinite recursion that results in a stack overflow. A working version of the code posted in my question is as follows:

#r @"..\packages\FParsec\lib\net40-client\FParsecCS.dll"
#r @"..\packages\FParsec\lib\net40-client\FParsec.dll"
open FParsec

type Parser<'t> = Parser<'t, unit>

type Element =
    | Alternates of Element list
    | SequenceGroup of Element list
    | ParsedString of string

let (pRuleElement, pRuleElementRef) : (Parser<Element> * Parser<Element> ref) = createParserForwardedToRef()
let (pNotAlternatives, pNotAlternativesRef) : (Parser<Element> * Parser<Element> ref) = createParserForwardedToRef()

let pString =
    pchar '"' >>. manyCharsTill (noneOf ['"']) (pchar '"')
    |>> ParsedString

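// Requires at least two alternatives. The first element is parsed with
// pNotAlternatives (never pAlternates itself), and .>>? / >>? backtrack
// to the start if no '/' separator follows.
let pAlternates : Parser<_> =
    pipe2
        (pNotAlternatives .>>? (many (pchar ' ') >>? (pchar '/') >>. many (pchar ' ')))
        (sepBy1 pNotAlternatives (many (pchar ' ') >>? (pchar '/') >>. many (pchar ' ') ))
        (fun first rest -> first :: rest)
    |>> Alternates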

let pSequenceGroup : Parser<_> =
    between (pchar '(') (pchar ')') (sepBy1 pRuleElement (pchar ' '))
    |>> SequenceGroup

do pRuleElementRef :=
    choice
        [
            pAlternates
            pSequenceGroup
            pString
        ]

do pNotAlternativesRef :=
    choice
        [
            pSequenceGroup
            pString
        ]

"\"0\" / (\"1\" \"2\") / \"3\" / (\"4\" / \"5\") / \"6\" / \"7\""
|> run (pRuleElement .>> (skipNewline <|> eof))

Besides adding pNotAlternatives, I also changed the separator parsing to use the backtracking operators .>>? and >>?. When the parser fails to find the alternative separator /, it now backtracks, which allows it to proceed after "realizing" that it wasn't a list of alternatives after all.
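The effect of the backtracking operator can be seen in isolation with a small sketch. It assumes an F# 5+ script loading FParsec from NuGet, and the names `sepStrict`, `sepBacktracking`, `item` and `succeeded` are invented for this illustration. The two separators differ only in using `>>.` versus `>>?` after the leading spaces.

```fsharp
#r "nuget: FParsec" // assumption: F# 5+ script; adjust to your local DLL references if needed
open FParsec

// Skips spaces, then demands '/'. If '/' is missing after spaces were
// consumed, the failure happens with changed parser state.
let sepStrict : Parser<unit, unit> =
    skipMany (pchar ' ') >>. pchar '/' >>. skipMany (pchar ' ')

// Same separator, but >>? rewinds to before the spaces when '/' is missing,
// so the failure leaves the parser state unchanged.
let sepBacktracking : Parser<unit, unit> =
    skipMany (pchar ' ') >>? pchar '/' >>. skipMany (pchar ' ')

// Stand-in for an element: a single letter.
let item : Parser<char, unit> = letter

let succeeded result =
    match result with
    | Success _ -> true
    | Failure _ -> false

// "a / b c": after 'b' there is a space but no '/'.
let strictOk = succeeded (run (sepBy1 item sepStrict) "a / b c")
let backtrackingOk = succeeded (run (sepBy1 item sepBacktracking) "a / b c")

printfn "strict: %b, backtracking: %b" strictOk backtrackingOk
```

With the strict separator, `sepBy1` treats the half-consumed separator as an error and fails as a whole. With the backtracking separator, `sepBy1` simply stops after `b` and leaves the trailing ` c` for whatever parser comes next, which is what a surrounding sequence group needs.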

Chris Altig