
The actual scenario below is made up. The purpose of the question is to understand more about what FParsec is doing here.

I am parsing a list of the strings (w) and (z) that are separated by one or more space characters ' '.

The parser for my list xs uses sepBy with a separator parser isSeparator.

isSeparator is based on manySatisfy and seems to consume spaces correctly. I believe the test output below shows this: when it parses an input with two leading space characters, it ends at column 3.

However, it fails when I use it in xs, as shown below.

Why does this fail and what would be a good approach for dealing with a separator that could be one or more spaces?

open FParsec

let test p str =
    match run p str with
    | Success(result, _, p)   -> printfn "Success: %A position = %A" result p
    | Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg

let str s = pstringCI s

let w = str "(w)"
let z = str "(z)"

let woz = w <|> z

let isSeparator = manySatisfy (fun c -> c = ' ')
let xs = sepBy woz isSeparator

test isSeparator "  (w)" // Success: "  " position = (Ln: 1, Col: 3)
test xs "(z) (w)"        // Failure: Error in Ln: 1 Col: 8
                         // (z) (w)
                         //        ^
                         // Note: The error occurred at the end of the input stream.                         
                         // Expecting: '(w)' (case-insensitive) or '(z)' (case-insensitive)
Sean Kearon
  • I'm not quite sure what the question is. The parser `xs` correctly parses `w` or `z` separated by a single space, because that's what you have specified as a separator. The parser `x2` should work as you expect: parsing `w` or `z` separated by multiple spaces. – Fyodor Soikin Jun 01 '18 at 16:44
  • Oh dear - my bad...the code is badly formed. Will edit and update. Thanks Fyodor Soikin. – Sean Kearon Jun 01 '18 at 17:10
  • Have now edited and updated the code. – Sean Kearon Jun 01 '18 at 17:24

1 Answer


This happens because manySatisfy matches zero or more characters that satisfy the given predicate, the key word being "zero". This means that, at the very end of the input, isSeparator actually succeeds, even though it doesn't consume any characters. And since isSeparator succeeds, sepBy expects to find another instance of woz after the separator. But there are no more instances, so sepBy returns an error.
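The zero-width success can be observed directly by running the separator on its own against an empty input, which is the situation sepBy is in after the last item. This is only a minimal sketch: it reuses isSeparator from the question, and the explicit type annotation is an addition so that the snippet compiles on its own.

```fsharp
open FParsec

// Same separator as in the question: zero or more spaces.
// The annotation fixes the user-state type so the snippet stands alone.
let isSeparator : Parser<string, unit> = manySatisfy (fun c -> c = ' ')

// Empty input mimics what sepBy sees after the final item.
match run isSeparator "" with
| Success(result, _, pos) -> printfn "Success: %A position = %A" result pos
| Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg
```

The parser should report success with an empty string here, without having consumed anything.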

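To verify this, try parsing an input without spaces between w and z: test xs "(z)(w)". The empty separator is accepted between the two items, so both of them are parsed. The run as a whole still fails, but for the same reason as before: the error is reported at the end of the input (Col: 7), where isSeparator succeeds once more without consuming anything and sepBy then expects another woz.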

To make isSeparator always consume at least one character and fail when no spaces are found, use many1Satisfy instead of manySatisfy:

let isSeparator = many1Satisfy (fun c -> c = ' ')
let xs = sepBy woz isSeparator
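
As a rough check of the fix, the corrected parser can be exercised with one and with several spaces between the items. This sketch repeats the definitions from the question so that it is self-contained; the type annotations and the simplified test helper are additions for illustration.

```fsharp
open FParsec

let str s = pstringCI s

let w : Parser<string, unit> = str "(w)"
let z : Parser<string, unit> = str "(z)"
let woz = w <|> z

// One or more spaces; fails without consuming input when no space is found.
let isSeparator : Parser<string, unit> = many1Satisfy (fun c -> c = ' ')
let xs = sepBy woz isSeparator

// Simplified helper: prints the parsed list or the error message.
let test p input =
    match run p input with
    | Success(result, _, _)   -> printfn "Success: %A" result
    | Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg

test xs "(z) (w)"    // single space between the items
test xs "(z)   (w)"  // several spaces between the items
```

Both calls should succeed with the two items: at the end of the input isSeparator now fails without consuming anything, which sepBy treats as the end of the list.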
Fyodor Soikin