2

I'm writing a messaging system for the users of our site, which implements segmentation so that individual messages can target dynamic segments of users. Because a given message's segment definition may contain multiple individual segment matches, the content of the message body needs to be segmented as well. I attempted this by writing what turned out to be a custom lexer/parser (without even knowing about lexers or parsers), until a chance conversation with a much more experienced programmer, who suggested I take a look at lexers and parser generators. I've done a bit of research, and found that the PHP-native Lime parser generator seems to be my best option, seeing as the code I'm writing is PHP.

I've looked at the grammar file for the calculator example, and at the metagrammar (in fact, I've spent a few hours analyzing most of the source code), but I'm really having trouble wrapping my head around how to construct even a simple grammar file. Does anyone know of any example grammar files specifically for Lime? It seems to use its own grammar definition, rather than that of Lemon or any of the other parser generators.

Should you be willing and able to provide concrete examples, I'm specifically trying to write conditionals in the format of something like the following:

This is a text block all users will see.

{{IF user.modules.sms}}
This is a text block only visible to users with the sms module enabled
{{/IF}}

{{IF user.modules.anothermodule AND user.previouslogin < (now() - 3600)}}
This is a text block only visible to users with the anothermodule module enabled, whose previous login was more than an hour ago
{{/IF}}

Or just in general, if anyone has any suggestions on other possible methods of implementing such a feature, I'm open to advice! Just bear in mind it's not possible to use PHP, as the people writing these messages will be project managers and marketers.

Alexander L. Hayes
Andrew Plank

2 Answers

1

I haven't done any parser generator work since the mid 90s when I used lex & yacc to build C programs, but I'll offer this - since I see you haven't gotten a satisfactory answer or updated your question since 2012:

In general, it looks like Lime is an OK substitute for yacc when you want a parser generator to emit PHP code, but the tokenize() method shown in the calculator example is an extremely weak replacement for lex. So if your goal is to embed bits of programming logic inside "messages", you can expect writing the tokenizer logic "from scratch" to be a challenge (less so if the message format is highly constrained).
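To give a sense of what writing the tokenizer "from scratch" involves, here is a minimal sketch for the {{IF ...}} syntax in the question. It is not Lime's API: the function names and the token types (TEXT, IF_OPEN, IF_CLOSE, NUMBER, KEYWORD, IDENT, OP) are made up for illustration, and the resulting tokens would still have to be mapped onto whatever terminal names the grammar file declares.

```php
<?php
// Hypothetical two-level tokenizer; all names here are illustrative assumptions.

// Level 1: split a message into plain text and {{IF ...}} / {{/IF}} tags.
function tokenizeMessage(string $source): array
{
    $tokens = [];
    // Split on {{...}} while keeping the tags themselves as separate chunks.
    $chunks = preg_split(
        '/(\{\{.*?\}\})/s',
        $source,
        -1,
        PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY
    );
    foreach ($chunks as $chunk) {
        if (preg_match('/^\{\{\s*IF\s+(.+?)\s*\}\}$/s', $chunk, $m)) {
            $tokens[] = ['IF_OPEN', $m[1]];   // raw condition text
        } elseif (preg_match('/^\{\{\s*\/IF\s*\}\}$/', $chunk)) {
            $tokens[] = ['IF_CLOSE', null];
        } else {
            $tokens[] = ['TEXT', $chunk];     // unknown tags fall through as text
        }
    }
    return $tokens;
}

// Level 2: split the condition inside an IF tag into expression tokens.
function tokenizeCondition(string $expr): array
{
    $pattern = '/\s*(?:(?<NUMBER>\d+)|(?<KEYWORD>\b(?:AND|OR|NOT)\b)'
             . '|(?<IDENT>[A-Za-z_][A-Za-z0-9_.]*)'
             . '|(?<OP><=|>=|==|!=|[<>()+\-*\/]))/A';
    $expr = trim($expr);
    $length = strlen($expr);
    $offset = 0;
    $tokens = [];
    while ($offset < $length) {
        // The A modifier anchors the match at $offset, so nothing is skipped.
        if (!preg_match($pattern, $expr, $m, 0, $offset)) {
            throw new InvalidArgumentException("Unexpected input at offset $offset");
        }
        foreach (['NUMBER', 'KEYWORD', 'IDENT', 'OP'] as $type) {
            if (isset($m[$type]) && $m[$type] !== '') {
                $tokens[] = [$type, $m[$type]];
                break;
            }
        }
        $offset += strlen($m[0]);
    }
    return $tokens;
}
```

Even this small version needs two passes (message level, then expression level), which is roughly the work lex would otherwise do for you.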

But your proposed example message raises the larger question:

How exactly will the PHP code which is to be emitted by your parser generator be used?

Specifically:

  • Will these chunks of parser generated code be "standalone" web pages - addressable directly via URL and rendered directly by the webserver (in which case the next question is how you're going to tell the webserver to execute the PHP code, e.g. by making them into CGI scripts)? Or will they run inside some sort of application framework (or "message renderer")?

  • How will (PHP) program state be persisted? Your example refers to "user.previouslogin", which suggests persistence not just across page views but also "sessions" of some sort.

  • Will the logic which you're proposing to embed inside tags in your messages really be some variant of PHP or JavaScript, or something genuinely new?

Embedding logic inside static pages is an old idea (Server Side Includes were popular in the 90s, after all), and modern templating engines (as suggested in the answer by Ugo Méda) are quite powerful. Whether it makes sense to roll your own message parsing + rendering system depends on the constraints imposed by the application context which you're referring to when you write "user.modules.*" in your example.
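As a rough illustration of the "message renderer" option: instead of emitting PHP to be executed, the parser could build a small condition tree that the renderer evaluates against per-user data. The sketch below assumes a hypothetical array-based node format (num, path, call, and, lt, sub) and a whitelist of callable functions; none of this comes from Lime, and the user data and fixed clock value are invented for the example.

```php
<?php
// Hypothetical condition evaluator; node shapes and names are assumptions.
function evaluateNode(array $node, array $context, array $functions)
{
    $type = $node[0];
    switch ($type) {
        case 'num':   // literal number
            return $node[1];
        case 'path':  // dotted lookup such as user.modules.sms
            $value = $context;
            foreach (explode('.', $node[1]) as $key) {
                if (!is_array($value) || !array_key_exists($key, $value)) {
                    return null; // missing data is treated as falsy
                }
                $value = $value[$key];
            }
            return $value;
        case 'call':  // only whitelisted functions may be called
            if (!isset($functions[$node[1]])) {
                throw new InvalidArgumentException("Unknown function: {$node[1]}");
            }
            return $functions[$node[1]]();
        case 'and':
            return evaluateNode($node[1], $context, $functions)
                && evaluateNode($node[2], $context, $functions);
        case 'lt':
            return evaluateNode($node[1], $context, $functions)
                 < evaluateNode($node[2], $context, $functions);
        case 'sub':
            return evaluateNode($node[1], $context, $functions)
                 - evaluateNode($node[2], $context, $functions);
        default:
            throw new InvalidArgumentException("Unknown node type: $type");
    }
}

// Tree for: user.modules.anothermodule AND user.previouslogin < (now() - 3600)
$ast = ['and',
    ['path', 'user.modules.anothermodule'],
    ['lt', ['path', 'user.previouslogin'], ['sub', ['call', 'now'], ['num', 3600]]],
];
// Made-up user state; in a real system this would come from wherever
// the application persists it.
$context = ['user' => ['modules' => ['anothermodule' => true], 'previouslogin' => 1000]];
// Fixed clock so the example is deterministic.
$functions = ['now' => function () { return 10000; }];

$visible = evaluateNode($ast, $context, $functions); // true: 1000 < 10000 - 3600
```

With this shape, message authors never get access to PHP itself; they can only reach the data and functions the renderer chooses to expose.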

Peter
0

Don't reinvent the wheel. Maybe you should use something like Smarty to implement this. Beware, this should be used by trusted users since it executes code, which may be dangerous.

If you don't plan on implementing hundreds of features, proper regexes should do the trick.
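For the simplest case in the question (a single flag per block, no nesting), a regex approach could look roughly like this. It is a sketch only: lookupPath() and renderSimpleConditionals() are hypothetical helper names, and the pattern deliberately does not handle AND, comparisons, or nested {{IF}} blocks.

```php
<?php
// Hypothetical helpers for flag-only conditionals; names are assumptions.

// Resolve a dotted path such as "user.modules.sms" in a nested array.
function lookupPath(array $context, string $path)
{
    $value = $context;
    foreach (explode('.', $path) as $key) {
        if (!is_array($value) || !array_key_exists($key, $value)) {
            return null;
        }
        $value = $value[$key];
    }
    return $value;
}

// Keep the body of each {{IF path}}...{{/IF}} block only when the path is truthy.
function renderSimpleConditionals(string $template, array $context): string
{
    return preg_replace_callback(
        '/\{\{IF\s+([A-Za-z0-9_.]+)\s*\}\}(.*?)\{\{\/IF\}\}/s',
        function (array $m) use ($context): string {
            return lookupPath($context, $m[1]) ? $m[2] : '';
        },
        $template
    );
}
```

Blocks whose condition is anything more than a single dotted name are left in the output untouched by this pattern, which is about where a real tokenizer and grammar start to pay off.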

Ugo Méda
  • That's just the issue... At some point in the future, this segmentation ability needs to be implemented in a feature where the end-users themselves will be able to create inline message segmentation. So using any templating engine that provides real access to PHP functionality is a non-option. – Andrew Plank Jul 05 '12 at 10:30
  • ...also, it's more than likely that there will need to be the option of extending this parsing feature easily, and adding statements to a grammar file and a slight re-implementation of client code is preferable to manually having to write a whole new implementation of some home-brew parsing method. I've read in many places now that using regex to perform parsing is fraught with danger and complication, which is why I'm wanting to use something like Lime. And I wouldn't say that is reinventing the wheel :) – Andrew Plank Jul 05 '12 at 10:31
  • Then you should take a look at how other libraries such as phpBB (BBCode), MediaWiki, Markdown, and Smarty are implemented. Might give you a hint! – Ugo Méda Jul 05 '12 at 10:33