28

To learn how to write and parse a context-free grammar I want to choose a tool. For Haskell, there are two big options: Happy, which generates a parser from a grammar description and *Parsec, which allows you to directly code a parser in Haskell.

What are the (dis)advantages of either approach?

fuz
  • 88,405
  • 25
  • 200
  • 352
  • Depending on what you're doing `attoparsec` may be a good choice. http://www.serpentine.com/blog/2010/03/03/whats-in-a-parser-attoparsec-rewired-2/ – asm Sep 01 '11 at 11:34
  • Out of all the parser combinator libraries on Hackage, parsec is possibly my least favorite. uu-parsinglib, attoparsec, and polyparse are my top choices (in that order). – John L Sep 01 '11 at 13:31
  • 1
    My impression is that Happy has significantly better performance than almost any parser combinator library, especially for large grammars with little or no context. On the other hand, the flexibility of combinator libraries (with regards to effects, treating parsers as first class values, etc) is I think an everywhere better choice until performance becomes the dominating factor *and* one is dealing with a complex grammar. – sclv Sep 01 '11 at 15:20
  • @sclv Would you consider a grammar for a (simple) programming language like PL/0 complex? – fuz Sep 01 '11 at 16:09
  • @FUZxxl -- by my standards above, yes. In this context, I'm thinking of "complex" grammars as pretty much anything beyond formats that one would use for raw data serialization (where attoparsec or a hand-tuned parser could well be better). Since Happy packs things down to neat little state machines which it can optimize, then the more states the parser can be in, the better the expected gain, I 'd imagine. Also, bear in mind that happy is happiest (most performant) when used with a separate lexer pass such as by alex. – sclv Sep 01 '11 at 17:27
  • 3
    My rule of thumb is if I have a LR grammar I go with Happy, otherwise I use Parsec. It's usually not worth the effort changing LR to Parsec (more code, slower parser), but if I have to work out the grammar myself I like the tricks that Parsec supplies (e.g. context sensitive parsing). – stephen tetley Sep 01 '11 at 18:02
  • @John L: what's your reasoning behind that order? (I use polyparse, have had a look at attoparsec but have never touched uu-parsinglib) – ivanm Sep 02 '11 at 06:09
  • @ivanm: uu-parsinglib is very full-featured (automatic error correction is nice), and simple to use. The documentation is decent also. I probably use attoparsec more because I do more work with binary stuff, for which it's very efficient. The on-demand model also matches well with enumerator-style IO (which I naturally use heavily). I listed polyparse because I admire the simplicity and elegance, although I rarely use it. – John L Sep 02 '11 at 09:11
  • Another vote for uu-parsinglib. Our team switched from Parsec to uu-parsinglib a bit more than a year ago, and we haven't looked back. No explicit 'try' combinator needed, ability to do lazy parsing, and error correction mean that the fundamentals are more powerful than those of Parsec (even if the end-user niceties - documentation,pre-provided parsers for lexing etc aren't yet quite as good - but getting better). Happy is generally somewhat more awkward to use - but you do get the standard yacc-style guarantee of linear time parsing - otherwise you get told at compile time. – BenMos Sep 03 '11 at 07:32

5 Answers5

25

External vs internal DSL

The parser specification format for Happy is an external DSL, whereas with Parsec you have the full power of Haskell available when defining your parsers. This means that you can for example write functions to generate parsers, use Template Haskell and so on.

Precedence rules

With Happy, you can use precedences to simplify your grammar, whereas with Parsec you have to nest the grammar rules correctly yourself. Changing the precedence of an operator is therefore much more tedious in Parsec.

Static checking

Happy will warn you about ambiguities in your grammar at compile time. (Though it's not great at telling you where they are.) With Parsec, you get no warning until your parser fails at run time.

hammar
  • 138,522
  • 17
  • 304
  • 385
4

This is the traditional decision: do I use lex/yacc (happy) or do I write my own (mostly recursive descent) parser, only that the parsec library is like a DSL for doing it right.

If one has experience with the yacc/lex approach, using happy will be a smaller learning curve.

Ingo
  • 36,037
  • 5
  • 53
  • 100
  • I have only experience with Parsec, which is a toolkit to write LL(1) parsers in Haskell using a specialized monad, so it's not quite "writing a parser from scratch". – fuz Sep 01 '11 at 11:12
  • If you already have experience with Parsec, then I advise to stay with it. Note that I said Parsec "is like a DSL for doing it (i.e. writing it's own parser) right" – Ingo Sep 01 '11 at 11:15
  • 1
    @FUZxxl: Slight nitpick: Parsec is LL(N) not just LL(1). You can have arbitrary lookahead using the `try` combinator. Actually, parsec is even more powerful than LL(N) because it can cause context sensitive grammars due to it being a monad. – nominolo Sep 12 '11 at 13:43
2

In my opinion Parsec hides most of the nasty grammar details and lets you write your parsers more intuitively. If you want to learn this stuff in the first place go with some parser-generator like Happy (or even try to implement one yourself).

declension
  • 4,110
  • 22
  • 25
Random Dev
  • 51,810
  • 9
  • 92
  • 119
1

Naive parser combinators do not allow left-recursion in grammar rules and I haven't found a library that does.

Happy does allow full BNF in language spec, and some useful staff like priority rules. So, for complicated cases Happy and parser generators in general are much better. However, in case of simple, stupid languages with LL(k) parseable grammars, I would use a parser combinator library as more maintainer-friendly.

declension
  • 4,110
  • 22
  • 25
permeakra
  • 612
  • 6
  • 12
  • There are parser combinator libraries that implement the GLL algorithm which can handle left-recursive grammars. – Poscat May 26 '20 at 16:58
1

I'm used to the parser combinator library uu-parsinglib from utrecht university. One can have error correcting and permutations for free, and also the things that parsec has. I also like it because my implemented grammar looks like an EBNF grammar, without so much monadic stuff, and is easy to read.

Matt Fenwick
  • 48,199
  • 22
  • 128
  • 192
carlos
  • 11
  • 1