I am building a parser using Megaparsec, and I don't know the best approach for parsing a structure like this:

names a b c
surnames d e f g

where names and surnames are keywords, each followed by a list of strings, and each of the two lines is optional. This means that

names a b c

and

surnames d e f g

are valid.

I can parse each line on its own with something like

maybeNames <- optional $ do
    constant "names"
    many identifier

where identifier parses a valid non-reserved string.

Now, I'm not sure how to express that each line is optional while still retrieving its value when it is present.

marcosh
  • What's wrong with the code you wrote (`maybeNames <- ...`)? It looks like it does exactly what you want to me. – Daniel Wagner Sep 04 '18 at 19:05
  • @DanielWagner the pieces by themselves work fine. I can parse correctly something like `names a b c` and `surnames d e f g`, but the parser fails for `names a b c surnames d e f g`. I don't know how to glue correctly the two parsers together – marcosh Sep 04 '18 at 20:50
  • Just use `(>>=)`, or consecutive lines in your `do` block... – Daniel Wagner Sep 04 '18 at 22:14
  • You're right! I had written my test wrong (a missing whitespace...) and I thought its failure was the parser's fault... thanks for the help – marcosh Sep 05 '18 at 05:46
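
For readers landing here, the resolution from the comments (two `optional` parsers on consecutive lines of a `do` block) might look roughly like the sketch below. It is only a sketch under assumptions: `constant` and `identifier` are hypothetical stand-ins for the asker's helpers, which were not shown, and `sc`, `lexeme`, `reserved`, `section`, `sections` and `document` are names invented here. It also assumes the sections appear in the fixed order names, then surnames.

```haskell
import Control.Applicative (empty)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- Whitespace consumer: skips spaces and newlines; no comment syntax assumed.
sc :: Parser ()
sc = L.space space1 empty empty

-- Run a parser, then skip any trailing whitespace.
lexeme :: Parser a -> Parser a
lexeme = L.lexeme sc

-- Hypothetical stand-in for the question's `constant`: matches a whole
-- keyword, so "names" does not match the start of "namesake".
constant :: String -> Parser ()
constant w = lexeme . try $ string w *> notFollowedBy alphaNumChar

-- Assumed set of reserved words.
reserved :: [String]
reserved = ["names", "surnames"]

-- Hypothetical stand-in for the question's `identifier`: a letter followed
-- by letters or digits, rejected (without consuming input) if reserved.
identifier :: Parser String
identifier = lexeme . try $ do
  w <- (:) <$> letterChar <*> many alphaNumChar
  if w `elem` reserved
    then fail ("keyword " ++ show w ++ " cannot be an identifier")
    else pure w

-- One section: a keyword followed by zero or more identifiers.
section :: String -> Parser [String]
section kw = constant kw *> many identifier

-- Each section is optional; sequencing them in `do` keeps both results.
sections :: Parser (Maybe [String], Maybe [String])
sections = do
  maybeNames    <- optional (section "names")
  maybeSurnames <- optional (section "surnames")
  pure (maybeNames, maybeSurnames)

-- Whole input: leading whitespace, the sections, then end of input.
document :: Parser (Maybe [String], Maybe [String])
document = between sc eof sections
```

With these definitions, `parseMaybe document "names a b c\nsurnames d e f g"` yields `Just (Just ["a","b","c"], Just ["d","e","f","g"])`, and an input with only one of the two sections gives `Nothing` in the other slot. Because each section is tried exactly once, a repeated `names` section is rejected rather than silently merged.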

2 Answers

Start by writing the context-free grammar for your format:

program  ::= lines
lines    ::= line | line lines
line     ::= names | surnames
names    ::= NAMES ids
surnames ::= SURNAMES ids
ids      ::= id | id ids
id       ::= STRING

Here, upper-case names denote terminals and lower-case names denote nonterminals. You could then use Alex + Happy to parse your text file.
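
As a rough illustration of how the grammar above maps onto code, here is a sketch that transcribes it with Megaparsec combinators instead of Alex + Happy (a swapped-in technique, chosen only because the question uses Megaparsec). All names (`Line`, `keyword`, `ident`, `ids`, `lineP`, `program`) are invented for the example, and the definition of STRING as a run of letters and digits is an assumption.

```haskell
import Control.Applicative (empty)
import Data.Void (Void)
import Text.Megaparsec
import Text.Megaparsec.Char
import qualified Text.Megaparsec.Char.Lexer as L

type Parser = Parsec Void String

-- line ::= names | surnames
data Line = Names [String] | Surnames [String]
  deriving (Eq, Show)

-- Whitespace consumer: skips spaces and newlines.
sc :: Parser ()
sc = L.space space1 empty empty

-- The NAMES and SURNAMES terminals, matched as whole words.
keyword :: String -> Parser ()
keyword w = L.lexeme sc . try $ string w *> notFollowedBy alphaNumChar

-- id ::= STRING (assumed here: letters and digits, and not a keyword)
ident :: Parser String
ident = L.lexeme sc . try $ do
  w <- some alphaNumChar
  if w `elem` ["names", "surnames"] then fail "keyword" else pure w

-- ids ::= id | id ids   (one or more, hence `some`)
ids :: Parser [String]
ids = some ident

-- names ::= NAMES ids ; surnames ::= SURNAMES ids
lineP :: Parser Line
lineP = Names <$> (keyword "names" *> ids)
    <|> Surnames <$> (keyword "surnames" *> ids)

-- program ::= lines ; lines ::= line | line lines
program :: Parser [Line]
program = between sc eof (some lineP)
```

Note what the grammar accepts as written: at least one line is required, and lines may repeat in any order, so `parseMaybe program "names a names b"` gives `Just [Names ["a"], Names ["b"]]` while the empty input gives `Nothing`.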

OrenIshShalom

You can do something similar to what appears in this guide and use <|> to select between alternatives. Here is the essence:

whileParser :: Parser Stmt
whileParser = between sc eof stmt

stmt :: Parser Stmt
stmt = f <$> sepBy1 stmt' semi
  where
    -- if there's only one stmt return it without using ‘Seq’
    f [s] = s
    f l   = Seq l

stmt' :: Parser Stmt
stmt' = ifStmt
  <|> whileStmt
  <|> skipStmt
  <|> assignStmt
  <|> parens stmt

ifStmt :: Parser Stmt
ifStmt = do
  rword "if"
  cond  <- bExpr
  rword "then"
  stmt1 <- stmt
  rword "else"
  stmt2 <- stmt
  return (If cond stmt1 stmt2)

whileStmt :: Parser Stmt
whileStmt = do
  rword "while"
  cond <- bExpr
  rword "do"
  stmt1 <- stmt
  return (While cond stmt1)
OrenIshShalom
  • thanks! The problem that I see in this is that it allows multiple `ifStmt`, while I can't allow multiple `names` sections. Moreover, here statements are separated by a `semi`, while I don't have any separator between sections – marcosh Sep 04 '18 at 10:41
  • I think this should help: https://stackoverflow.com/questions/33881260/parsec-parsing-a-list-of-lists-both-with-the-same-delimiter – OrenIshShalom Sep 04 '18 at 10:52
  • I think that's a somewhat different problem. It allows repeated commands, while I cannot. In other words, it has a list of lists, while I have a record of lists – marcosh Sep 04 '18 at 11:12