I'm writing a simple text-template language for a web application (think Google's ctemplate). When finished, it will support only a small number of actions, simple stuff like "evaluate and execute", "evaluate and print", "evaluate and escape HTML", and "comment". I was thinking of hand-writing the whole parser from scratch, but I started looking at lexer and parser generators like lex, flex and ANTLR. These seem like way more than I need for my simple syntax. So the question is: at what point is it practical to use a parser generator?
-
Ouch. Nice ruse by Adobe to get more than they (probably) paid for. ;) Would "gnu-flex" be a better tag? – spender Aug 04 '10 at 01:02
-
The Pragmatic Programmer's book [Language Implementation Patterns](http://pragprog.com/titles/tpdsl/language-implementation-patterns) has an excellent first chapter discussing the different 'strengths' of parsers for different languages. – sarnold Aug 04 '10 at 01:11
-
ANTLR is the way to go, especially because of the ANTLRWorks GUI, which really helps to debug your grammar & syntax. – Mawg says reinstate Monica Aug 04 '10 at 01:31
-
Rekex - grammar as algebraic datatypes - github.com/zhong-j-yu/rekex – ZhongYu Sep 27 '21 at 01:57
2 Answers
Sooner rather than later. If you have a simple syntax now, using a parser generator is easy. It makes it easier still when you want to add variables and loops and conditionals.
But wait! - There is little reason to invent your own language unless it is very domain specific like eqn or TeX or molecular modeling languages. You are far better off embedding a language that was specifically designed for the purpose. Tcl is the old guard in that realm, with Python being a strong contender. Perl was also designed to be an embedded scripting language but I think it a poor candidate as it will likely yield very "write-only" code in the hands of your users.
Language design is hard and smoking all the fiddly bits out is harder still. With both Python and Tcl you can decide how much of the core language to expose to your users and open up closed bits as you find a need for them.
The first little language that I wrote (which astonishingly is still in production use) would have been so much better had Tcl been there to use instead.

On the one hand, if you don't have experience with one of these tools and you have the time, then perhaps this is a good opportunity to learn one. I would imagine that if you were experienced with these tools, you'd simply use them, much as many people reach for a regex for parsing tasks.
On the other hand, simple parsers aren't that hard to write, and they aren't that hard to maintain either. I like writing them, and I typically reach for a hand-written parser rather than a tool when a task needs one (though I'm not very familiar with the tools). In many cases I prefer a simple parser over a regex, depending on the task.
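To give a feel for how small such a hand-written parser can be, here is a rough sketch in Python for the four actions in the question. The tag syntax (`<% %>`, `<%= %>`, `<%& %>`, `<%# %>`) and the name `parse_template` are made up for illustration and are not from the question or from ctemplate. It only splits the template into pieces; evaluating them is left out.

```python
# Hypothetical tag syntax, assumed for this sketch only:
#   <% code %>    evaluate and execute
#   <%= expr %>   evaluate and print
#   <%& expr %>   evaluate and escape HTML
#   <%# text %>   comment

OPEN, CLOSE = "<%", "%>"
KINDS = {"=": "print", "&": "escape", "#": "comment"}


def parse_template(source):
    """Split source into (kind, text) pairs with one left-to-right scan."""
    nodes = []
    pos = 0
    while pos < len(source):
        start = source.find(OPEN, pos)
        if start == -1:
            # No more tags: the rest is literal text.
            nodes.append(("text", source[pos:]))
            break
        if start > pos:
            nodes.append(("text", source[pos:start]))
        end = source.find(CLOSE, start + len(OPEN))
        if end == -1:
            raise ValueError(f"unclosed tag at offset {start}")
        body = source[start + len(OPEN):end]
        # The first character after the opener selects the action, if any.
        if body[:1] in KINDS:
            kind, body = KINDS[body[0]], body[1:]
        else:
            kind = "execute"
        nodes.append((kind, body.strip()))
        pos = end + len(CLOSE)
    return nodes
```

Calling `parse_template("Hi <%= name %>!")` yields a text node, a print node, and another text node. Once the syntax needs nesting (loops, conditionals), a flat scan like this stops being enough, which is roughly the point where a generator or a proper recursive-descent parser starts to pay off.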
