I'm writing a PHP templating engine called Slade, inspired by Ruby's Slim and Laravel's Blade.

Many people have recommended that I rewrite it as a lexer and parser instead of relying completely on regexes. So I googled lexers and parsers and tried to learn how they work, and while I think I get the general idea, I still find it hard to start writing one.

So I was hoping someone could help me on my way by showing how to do one example. How exactly would I lex (is that even a verb?) and parse this:

#wrapper.container.well first-attr="a literal attribute" second-attr=variable.name And here some text that will be the content of the div...

Into these nodes:

[
    'tagName' => 'div', // since there was no tagname, a div is assumed
    'attributes' => [
        'id' => 'wrapper',
        'class' => 'container well',
        'first-attr' => 'a literal attribute',
        'second-attr' => 'the value of the variable',
    ],
    'textContent' => 'And here some text that will be the content of the div...'
]

Of course I don't expect anyone to write out a function that lexes and parses this completely, but I'd like to see general pseudocode for how to go about it. Could anyone help me with this?

Evert
  • First off, your grammar is not well formed: what if my text content starts with `something="blah"`? – ircmaxell Apr 25 '15 at 15:56
  • @ircmaxell, then that's invalid as text content and it will be assumed as an attribute. I can put restrictions on what is valid and not valid, can't I? This doesn't mean you can't have the text `something="blah"`, you can, but in a different way, namely by using it on a new line that's indented and starting the new line with a pipe. – Evert Apr 25 '15 at 15:59
  • Suggestion: start by formally defining your grammar. Use [EBNF](http://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_Form) to detail it out. Additionally, whitespace (starting on a new line) is going to be tricky to deduce from a new rule unless you're very careful... – ircmaxell Apr 25 '15 at 16:02
  • Wow, that's a whole branch of computer science I've never dealt with so far. Do you know of any simple tutorials for beginners? – Evert Apr 25 '15 at 16:13
  • Hi. I recall your original post on reddit. I just did a Google search for PHP parser generators and came across PHP Peg. Give it a try: https://github.com/hafriedlander/php-peg. It should at the very least get you in the right mindset. Good luck! – Kyle Apr 25 '15 at 16:58
  • Also, check out this article I just found about FSMs and parsers: https://pegasus.cc.ucf.edu/~fgonzale/egn3210/Program8_1.pdf – Kyle Apr 25 '15 at 17:02
  • May I suggest looking at Jade (https://github.com/jadejs/jade)? They use a parser/lexer, if I remember correctly. It's JavaScript, but it may help you understand the concepts. And it's almost the same grammar as the one you suggested (with parentheses around attributes). – Maxime Fafard Apr 26 '15 at 02:03
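
To make the lexer/parser split from the question concrete, here is a minimal PHP sketch for the single example line above. It is only one possible shape, not how Slade or Slim actually work. The function names `sladeLex` and `sladeParse`, the token type names, and the flat `$scope` array keyed by the dotted variable name are all assumptions made for illustration. The lexer only recognises tokens; the parser decides what they mean (default tag name, joining classes, looking up variables).

```php
<?php
// Hypothetical sketch: sladeLex() and sladeParse() are illustrative names,
// not part of Slade. It handles one line and no nesting or indentation.

/** Lexer: turn one template line into a flat list of tokens. */
function sladeLex(string $line): array
{
    // Structural token rules, tried in order at the current offset (\G).
    $rules = [
        'ID'     => '/\G#([\w-]+)/',                  // #wrapper
        'CLASS'  => '/\G\.([\w-]+)/',                 // .container
        'STRING' => '/\G([\w-]+)="([^"]*)"/',         // name="literal"
        'VAR'    => '/\G([\w-]+)=([A-Za-z_][\w.]*)/', // name=variable.name
    ];
    $tokens = [];
    $pos = 0;
    $len = strlen($line);

    // Optional leading tag name, e.g. "span#id" (assumed syntax).
    if (preg_match('/\G[A-Za-z][\w-]*(?=[#.\s]|$)/', $line, $m)) {
        $tokens[] = ['TAG', $m[0]];
        $pos = strlen($m[0]);
    }

    while ($pos < $len) {
        // Whitespace separates tokens and is otherwise ignored.
        if (preg_match('/\G\s+/', $line, $m, 0, $pos)) {
            $pos += strlen($m[0]);
            continue;
        }
        foreach ($rules as $type => $regex) {
            if (preg_match($regex, $line, $m, 0, $pos)) {
                $tokens[] = array_merge([$type], array_slice($m, 1));
                $pos += strlen($m[0]);
                continue 2; // back to the while loop
            }
        }
        // Nothing structural matched: the rest of the line is text content.
        $tokens[] = ['TEXT', substr($line, $pos)];
        break;
    }
    return $tokens;
}

/** Parser: fold the token list into a single node array. */
function sladeParse(array $tokens, array $scope): array
{
    // A div is assumed when no TAG token is present.
    $node = ['tagName' => 'div', 'attributes' => [], 'textContent' => ''];
    foreach ($tokens as $t) {
        switch ($t[0]) {
            case 'TAG':
                $node['tagName'] = $t[1];
                break;
            case 'ID':
                $node['attributes']['id'] = $t[1];
                break;
            case 'CLASS': // several classes are joined with spaces
                $old = $node['attributes']['class'] ?? '';
                $node['attributes']['class'] = ltrim($old . ' ' . $t[1]);
                break;
            case 'STRING':
                $node['attributes'][$t[1]] = $t[2];
                break;
            case 'VAR': // assumption: variables live in a flat array
                $node['attributes'][$t[1]] = $scope[$t[2]] ?? null;
                break;
            case 'TEXT':
                $node['textContent'] = $t[1];
                break;
        }
    }
    return $node;
}
```

Calling `sladeParse(sladeLex($line), ['variable.name' => 'the value of the variable'])` on the example line should produce the node array shown in the question. Note that this sketch resolves the ambiguity raised in the comments the same way the question does: anything of the form `name=value` before the text starts is treated as an attribute.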
