
I'm writing an FParsec parser for strings in this form:

do[ n times]([ action] | \n([action]\n)*endDo)

In other words, this is a "do" statement with an optional repetition count ("n times"), followed by either a single "action" statement on the same line or a list of "action" statements (each on its own line) terminated by "endDo". I omitted indentation and trailing-space handling for simplicity.

These are examples of valid inputs:

do action

do 3 times action

do
endDo

do 3 times
endDo

do
action
action
endDo

do 3 times
action
action
endDo

This does not look very complicated, but:

Why does this not work?

let statement = pstring "action"
let beginDo = pstring "do"
                >>. opt (spaces1 >>. pint32 .>> spaces1 .>> pstring "times")
let inlineDo = tuple2 beginDo (spaces >>. statement |>> fun w -> [w])
let expandedDo = (tuple2 (beginDo .>> newline)
                    (many (statement .>> newline)))
                 .>> pstring "endDo"
let doExpression = (expandedDo <|> inlineDo)

What is a correct parser for this expression?

Francesco De Vittori

1 Answer


You need to use the attempt function. I just modified your beginDo and doExpression functions.

This is the code:

let statement o = o |> pstring "action"

let beginDo o =
    attempt (pstring "do"
        >>. opt (spaces1 >>. pint32 .>> spaces1 .>> pstring "times")) <|>
        (pstring "do" >>% None) <| o

let inlineDo o = tuple2 beginDo (spaces >>. statement |>> fun w -> [w]) <| o
let expandedDo o = (tuple2 (beginDo .>> newline) (many (statement .>> newline)))
                 .>> pstring "endDo" <| o

let doExpression o = ((attempt expandedDo) <|> inlineDo) .>> eof <| o

I added an eof at the end. This way it will be easier to test.

I also added dummy `o` parameters to avoid the value restriction.
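To see why `attempt` makes the difference, here is a minimal sketch that is separate from the parser above. The names `doThenSpace`, `doThenBang`, `noBacktrack`, `withBacktrack` and `succeeds` are made up for this illustration, and the `#r "nuget: FParsec"` line assumes the code is run as an F# script (for example with `dotnet fsi`).

```fsharp
#r "nuget: FParsec"
open FParsec

// Both parsers start by consuming "do"; they differ only in what must follow.
let doThenSpace : Parser<string, unit> = pstring "do" >>. pstring " "
let doThenBang  : Parser<string, unit> = pstring "do" >>. pstring "!"

// Without attempt: on "do!" the first branch consumes "do" and then fails,
// so <|> does not go on to try the second branch.
let noBacktrack = doThenSpace <|> doThenBang

// With attempt: the failed first branch is rewound to where it started,
// so <|> does try the second branch.
let withBacktrack = attempt doThenSpace <|> doThenBang

// Hypothetical helper: true if the parser succeeds on the input.
let succeeds p input =
    match run p input with
    | Success (_, _, _) -> true
    | Failure (_, _, _) -> false

printfn "%b" (succeeds noBacktrack "do!")    // prints "false"
printfn "%b" (succeeds withBacktrack "do!")  // prints "true"
```

This is the same situation as `expandedDo <|> inlineDo`: both alternatives begin with `beginDo`, so the first one has already consumed input by the time it fails.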

Gus
  • Works! So basically I was missing the two `attempt`s? I still don't get why they are needed, though. – Francesco De Vittori Dec 16 '11 at 15:40
  • Yes, because if the first parser (in a `p1 <|> p2` expression) fails after it has consumed input, you need a way to backtrack. That's basically what the `attempt` function does. – Gus Dec 16 '11 at 16:01
  • OK, got it. But then when can you use `<|>` without `attempt`? Every time you do `a <|> b`, `a` must fail in order to pick `b`, doesn't it? – Francesco De Vittori Dec 16 '11 at 16:07
  • For example, if you do `pfloat <|> something`, you don't need it, because if `pfloat` fails it consumes no input. The problem arises when you compose parsers: the last one may fail after the earlier ones have already consumed some input. – Gus Dec 16 '11 at 16:09
  • Thanks for the clear explanation! I've re-read the documentation and now it makes complete sense. – Francesco De Vittori Dec 17 '11 at 06:02
  • Just a word of caution with this parser. `spaces` consumes newlines as well, so the parser would accept "do[newline]action" with no 'endDo', which could become a gotcha later. – YotaXP Jan 09 '12 at 01:33
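One possible way to address the newline caveat from the last comment is to use a whitespace parser that never matches a newline. The following is only a sketch under assumptions: `blanks1` and `accepts` are hypothetical helpers that do not appear in the answer, the parsers are given explicit type annotations instead of the dummy `o` parameters, the `attempt` is moved around the optional quantifier, and `#r "nuget: FParsec"` assumes an F# script.

```fsharp
#r "nuget: FParsec"
open FParsec

// Hypothetical helper: one or more spaces or tabs, but never a newline.
let blanks1 : Parser<unit, unit> = skipMany1 (anyOf " \t")

let statement : Parser<string, unit> = pstring "action"

// "do" followed by an optional " n times"; attempt rewinds the quantifier
// if it fails partway through (e.g. after the blank in "do action").
let beginDo : Parser<int32 option, unit> =
    pstring "do" >>. opt (attempt (blanks1 >>. pint32 .>> blanks1 .>> pstring "times"))

// Inline form: the single action must be on the same line as "do".
let inlineDo = tuple2 beginDo (blanks1 >>. statement |>> fun w -> [w])

// Expanded form: newline, zero or more action lines, then "endDo".
let expandedDo =
    tuple2 (beginDo .>> newline) (many (statement .>> newline))
    .>> pstring "endDo"

let doExpression = (attempt expandedDo <|> inlineDo) .>> eof

// Hypothetical helper: true if the whole input is a valid "do" expression.
let accepts input =
    match run doExpression input with
    | Success (_, _, _) -> true
    | Failure (_, _, _) -> false

printfn "%b" (accepts "do 3 times action")  // prints "true"
printfn "%b" (accepts "do\naction\nendDo")  // prints "true"
printfn "%b" (accepts "do\naction")         // prints "false"
```

With this variant the inline form can no longer cross a line break, so "do[newline]action" without "endDo" is rejected.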