I'm currently writing a parser for my simple programming language in Haskell with the megaparsec library.
I found this megaparsec tutorial, and I wrote the following parser code:
import Data.Void
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

lexeme :: Parser a -> Parser a
lexeme = L.lexeme space

rws :: [String] -- list of reserved words
rws = ["if", "then"]

identifier :: Parser String
identifier = (lexeme . try) (p >>= check)
  where
    p = (:) <$> letterChar <*> many alphaNumChar
    check x =
      if x `elem` rws
        then fail $ "keyword " ++ show x ++ " cannot be an identifier"
        else return x
It is a simple identifier parser with error handling for reserved words. It successfully parses valid identifiers such as foo and bar123.
But when an invalid input (i.e. a reserved word) is given to the parser, it outputs an error:
>> parseTest identifier "if"
1:3:
keyword "if" cannot be an identifier
The error message is fine, but the error location (1:3:) is different from what I expected. I expected the error location to be 1:1:.
In the following part of the definition of identifier,
identifier = (lexeme . try) (p >>= check)
I expected try to behave as if no input had been consumed when (p >>= check) fails, and so to go back to source location 1:1:.
Is my expectation wrong? How can I get this code to work as I intended?
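One possible direction, given here only as a minimal sketch: it assumes megaparsec 7 or later, where getOffset and setOffset are exported from Text.Megaparsec (older versions do not have them). The idea is that fail reports its error at the parser's current offset, so the offset is recorded before the word is parsed and restored just before fail is called. try is kept so that a failed identifier still does not count as having consumed input.

```haskell
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

lexeme :: Parser a -> Parser a
lexeme = L.lexeme space

rws :: [String] -- list of reserved words
rws = ["if", "then"]

-- Assumption: megaparsec >= 7 (getOffset / setOffset are not in older versions).
identifier :: Parser String
identifier = (lexeme . try) $ do
  o <- getOffset                      -- offset at the start of the word
  x <- (:) <$> letterChar <*> many alphaNumChar
  if x `elem` rws
    then do
      setOffset o                     -- move the offset back so the error points at the word's start
      fail $ "keyword " ++ show x ++ " cannot be an identifier"
    else return x

main :: IO ()
main = do
  parseTest identifier "foo"
  parseTest identifier "if"
```

With this variant, the error for the input "if" should be attached to the start of the word (1:1) rather than to the position after it. Whether this is preferable to reporting the position where the check actually failed is a design choice.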