I'm a beginner with Megaparsec and Haskell in general, and trying to write a parser for the following grammar:
A word will always be one of:
- A number composed of one or more ASCII digits (ie "0" or "1234") OR
- A simple word composed of one or more ASCII letters (ie "a" or "they") OR
- A contraction of two simple words joined by a single apostrophe (ie "it's" or "they're")
So far, I've got the following (this can probably be simplified):
data Word = Number String | SimpleWord String | Contraction String deriving (Show)
word :: Parser MyParser.Word
word = M.choice
[ Number <$> number
, Contraction <$> contraction
, SimpleWord <$> simpleWord
]
number :: Parser String
number = M.some C.numberChar
simpleWord :: Parser String
simpleWord = M.some C.letterChar
contraction :: Parser String
contraction = do
left <- simpleWord
void $ C.char '\''
right <- simpleWord
return (left ++ "'" ++ right)
But I'm having problem with defining a parser for skipping white spaces and anything that is non-alphanumeric. For example, given the input 'abc'
, the parser should discard the apostrophes and just take the "simple word".
The following doesn't compile:
filler :: Parser Char
filler = M.some (C.spaceChar A.<|> not C.alphaNumChar)
spaceConsumer :: Parser ()
spaceConsumer = L.space filler A.empty A.empty
lexeme :: Parser a -> Parser a
lexeme = L.lexeme spaceConsumer