Consider this parser that converts digit strings to ints:

let toInt (s:string) = 
    match Int32.TryParse(s) with
    | (true, n) -> preturn n
    | _         -> fail "Number must be below 2147483648"

let naturalNum = many1Chars digit >>= toInt <?> "natural number"

When I run it on non-numeric strings like "abc" it shows the correct error message:

Error in Ln: 1 Col: 1
abc
^
Expecting: natural number

But when I give it a numeric string exceeding the int range it gives the following counter-productive message:

Error in Ln: 1 Col: 17
9999999999999999
                ^
Note: The error occurred at the end of the input stream.
Expecting: decimal digit
Other error messages:
  Number must be below 2147483648

The primary message "Expecting: decimal digit" makes no sense, because we have too many digits already.

Is there a way to get rid of it and only show "Number must be below 2147483648"?


Full example:

open System
open FParsec

[<EntryPoint>]
let main argv =
    let toInt (s:string) = 
        match Int32.TryParse(s) with
        | (true, n) -> preturn n
        | _         -> fail "Number must be below 2147483648"

    let naturalNum = many1Chars digit >>= toInt <?> "natural number"

    match run naturalNum "9999999999999999" with
    | Failure (msg, _, _) -> printfn "%s" msg
    | Success (a, _, _)   -> printfn "%A" a

    0
Good Night Nerd Pride
  • I don’t understand why in both cases "abc" and "9999999999" the results are different. Both of them should go into the `fail ...`. – Nghia Bui May 25 '19 at 04:22
  • `many1Chars digit` will do the `fail` for us when we feed the parser `"abc"`. Only in `toInt` we have to do it ourselves. – Good Night Nerd Pride May 25 '19 at 07:31
  • The effect you see is caused by FParsec's internal handling of sequencing parsers. Even if the first parser succeeds, it may generate an error message (here since it cannot parse more digits). If the second parser fails without consuming input (which is the case here, since you only convert the first parser's result) all existing error messages in the state are merged together. This is described in FParsec's documentation. A solution could be a custom sequencing operator that drops potential error messages of the first parser in the success case and then forwards the result further. – mschmidt May 29 '19 at 15:10

2 Answers

I think the root of the problem here is that this is a non-syntactic concern, which doesn't fit well with the model of a lookahead parser. If you could express "too many digits" in a syntactic way, it would make sense to the parser too, but as it is, it will instead go back and try to consume more input. I think the cleanest solution would therefore be to do the int conversion in a separate pass after the parsing.
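A minimal sketch of that separate-pass idea, assuming the parser only has to recognize the digits and the range check can happen once parsing is done. The names `digits` and `parseNaturalNum` are made up for illustration, and `Result.Ok`/`Result.Error` are written fully qualified because FParsec defines its own `Ok` and `Error`:

```fsharp
open System
open FParsec

// Syntactic step only: accept one or more digits and keep them as a string.
let digits : Parser<string, unit> = many1Chars digit <?> "natural number"

// Semantic step, run after the parser has finished (hypothetical helper).
let parseNaturalNum (input: string) : Result<int, string> =
    match run digits input with
    | Failure (msg, _, _) -> Result.Error msg
    | Success (s, _, _) ->
        match Int32.TryParse(s) with
        | (true, n) -> Result.Ok n
        | _         -> Result.Error "Number must be below 2147483648"
```

Because the range check never runs inside the parser, its message cannot be merged with the "Expecting: decimal digit" message. The trade-off is that the range error no longer carries FParsec's position information.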

That said, FParsec seems flexible enough that you should still be able to hack it together. This does what you ask I think:

let naturalNum: Parser<int, _> =
    fun stream ->
        let reply = many1Chars digit stream
        match reply.Status with
            | Ok ->
                match Int32.TryParse(reply.Result) with
                | (true, n) -> Reply(n)
                | _         -> Reply(Error, messageError "Number must be below 2147483648")                
            | _ ->
                Reply(Error, reply.Error)

Or if you want the "natural number" error message instead of "decimal digit", replace the last line with:

Reply(Error, messageError "Expecting: natural number")
glennsl

The effect you see is that the first parser of your sequence succeeds, but also generates an error message (because it could consume even more digits). Your second parser consumes no further input, so if it fails, FParsec merges the error messages of the two sequenced parsers (see the FParsec manual on merging of error messages).

A solution would be to create a small wrapper for a parser that removes the error messages from a result in the Ok case. Then, when it is sequenced with a second parser, only the message of the second parser remains.

Untested code off the top of my head:

let purify p =
    fun stream ->
        let res = p stream
        match res.Status with
            | Ok -> Reply(res.Result)
            | _ -> res


let naturalNum = purify (many1Chars digit) >>= toInt <?> "natural number"
mschmidt
  • Interesting solution! In this specific case I'd probably go with glennsl's answer, but `purify` should come in handy in more complex scenarios. – Good Night Nerd Pride May 29 '19 at 17:49
  • Perhaps you can combine both solutions. What is actually needed here is a `map` like function for parsers, where the applied function may fail. In case of failure a specific error can be created which overwrites the original errors of the parser. – mschmidt May 29 '19 at 18:07
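
The `map`-like function suggested in the last comment could be sketched roughly as follows. This is an illustration only, not part of FParsec: `tryMap` and `toIntResult` are hypothetical names, and the sketch assumes the mapping function reports failure as `Result.Error` with a message.

```fsharp
open System
open FParsec

// Hypothetical combinator: run p, then apply f to its result.
// If f fails, the reply carries only f's message, so whatever
// messages p produced while succeeding are dropped.
let tryMap (f: 'a -> Result<'b, string>) (p: Parser<'a, 'u>) : Parser<'b, 'u> =
    fun stream ->
        let reply = p stream
        if reply.Status = Ok then
            match f reply.Result with
            | Result.Ok v      -> Reply(v)
            | Result.Error msg -> Reply(Error, messageError msg)
        else
            // p itself failed: pass its status and messages through unchanged.
            Reply(reply.Status, reply.Error)

// Conversion that can fail, expressed as a Result instead of a parser.
let toIntResult (s: string) =
    match Int32.TryParse(s) with
    | (true, n) -> Result.Ok n
    | _         -> Result.Error "Number must be below 2147483648"

let naturalNum : Parser<int, unit> = many1Chars digit |> tryMap toIntResult
```

This combines both answers: like `purify`, it discards the first parser's messages on success, and like glennsl's hand-written parser, it builds the failure reply directly instead of going through `>>=`.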