In Antlr2 there were numerous grammar options that could be set (reference). Antlr3 has only about a third as many options (reference).

So I have two questions concerning this:

  1. Does anyone know why so many options were taken out and are any of them coming back?
  2. Does Antlr3 have the ability to do what Antlr2 could, even without all those options?

To be more specific on my second question, I want to be able to do a few things. First, I want to change the visibility of the generated lexer and parser classes (i.e. Antlr2 option "classHeaderPrefix").

Secondly, I want to be able to ignore any whitespace tokens found within certain keywords, like having "&keyword&" and "& k ey w o rd &" both match (i.e. Antlr2 option "ignore", I think?).

Finally, I want to make certain keywords case insensitive (i.e. Antlr2 option "caseSensitive").

Bart Kiers

1 Answer

BluePlateSpecial wrote:

To be more specific on my second question, I want to be able to do a few things. First, I want to change the visibility of the generated lexer and parser classes (i.e. Antlr2 option "classHeaderPrefix").

In v3 there is no way to do this.

BluePlateSpecial wrote:

Secondly, I want to be able to ignore any whitespace tokens found within certain keywords, like having "&keyword&" and "& k ey w o rd &" both match (i.e. Antlr2 option "ignore", I think?).

That option might have been removed because the LL(*) algorithm in the v3 lexer is far more powerful than what was used in v2. There is no longer a need for such an option, since this would do the trick:

FOO
  :  '&' (' ' | 'a'..'z')+ '&'
  ;
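
Note that `FOO` accepts any run of lowercase letters and spaces between the ampersands, not one particular keyword. If only specific keywords should match, one option is to spell out a rule per keyword with optional spaces between the characters. Writing those by hand gets tedious, so here is a minimal sketch of a generator for the rule text. The class `SpacedKeywordRule` and its `build` method are hypothetical names made up for this illustration and are not part of ANTLR; the sketch also assumes the keyword contains no characters that need escaping in an ANTLR literal.

```java
// Hypothetical helper (not part of ANTLR): emits the text of a v3 lexer rule
// that matches '&' + keyword + '&' with any number of spaces in between.
final class SpacedKeywordRule {

    // Assumption: `keyword` has no quote or backslash characters,
    // since no escaping is done here.
    static String build(String ruleName, String keyword) {
        StringBuilder sb = new StringBuilder();
        sb.append(ruleName).append("\n  :  '&'");
        for (int i = 0; i < keyword.length(); i++) {
            // ' '* allows zero or more spaces before each keyword character
            sb.append(" ' '* '").append(keyword.charAt(i)).append('\'');
        }
        // allow trailing spaces before the closing ampersand
        sb.append(" ' '* '&'\n  ;");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints a rule named IN for the keyword "in"
        System.out.println(build("IN", "in"));
    }
}
```

The generated text would be pasted into the grammar (or written out by a build step) in place of a hand-written rule.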

BluePlateSpecial wrote:

Finally, I want to make certain keywords case insensitive (i.e. Antlr2 option "caseSensitive").

That is also not possible in v3, other than by doing it the "hard" way:

BAR
  :  ('b' | 'B') ('a' | 'A') ('r' | 'R')
  ;
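
For a language with many keywords, typing these alternations by hand is error-prone. As a rough sketch, the rule text can be generated instead. The class `CaseInsensitiveRule` and its `build` method below are hypothetical names invented for this example, not an ANTLR API, and the sketch assumes plain ASCII keywords with no characters that need escaping.

```java
// Hypothetical helper (not part of ANTLR): emits the text of a v3 lexer rule
// that matches a keyword regardless of letter case.
final class CaseInsensitiveRule {

    // Assumption: `keyword` is plain ASCII without quotes or backslashes,
    // since no escaping is done here.
    static String build(String ruleName, String keyword) {
        StringBuilder sb = new StringBuilder();
        sb.append(ruleName).append("\n  :  ");
        for (int i = 0; i < keyword.length(); i++) {
            char c = keyword.charAt(i);
            if (i > 0) {
                sb.append(' ');
            }
            if (Character.isLetter(c)) {
                // letters become a two-way alternative: ('b' | 'B')
                sb.append("('").append(Character.toLowerCase(c))
                  .append("' | '").append(Character.toUpperCase(c)).append("')");
            } else {
                // digits and punctuation are matched literally
                sb.append('\'').append(c).append('\'');
            }
        }
        sb.append("\n  ;");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints the same BAR rule that is written out by hand above
        System.out.println(build("BAR", "bar"));
    }
}
```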
Bart Kiers
  • Thanks for the reply. I feel like they have taken one step forward and one step backward with v3. I personally feel as though a new version should be able to do everything the old one could do and more and instead they actually limit us. Especially with the omission of the caseSensitive option, how could one build a parser for a language such as VB where all keywords are case insensitive? Unless these options come back, I will have to strongly consider going back to v2. Any idea on why these were removed or if they are coming back? And thanks for the edit, my question looks a lot better now :) – BluePlateSpecial Jan 09 '11 at 17:01
  • @BluePlateSpecial, no, I'm not sure why `classHeaderPrefix` and `caseSensitive` were discarded in v3. I've missed them myself on several occasions as well. I wouldn't go back to v2 for these things, though. – Bart Kiers Jan 09 '11 at 17:24
  • @BluePlateSpecial: To handle case-insensitive keywords, you simply write lexical recognizers for each one explicitly allowing upper and lower case. Bart's answer is pretty clear. It's a bit more work; in the long run it doesn't matter much. – Ira Baxter Jan 10 '11 at 11:30
  • I understood his answer, but it doesn't change the fact that there were several very useful options discarded in v3 (I still can't do classHeaderPrefix and the answer for ignoring whitespace won't work well if I am only allowing certain keywords and not all letters. Like for "in" it would be '$' (' '|'i') (' '|'n') '$' which gets much worse for longer keywords). Also, I'm sorry but why does case insensitivity require more work when it could easily be built into Antlr by using Java's toUpper or something? In my own project, I would never cut that many features for a new version. – BluePlateSpecial Jan 10 '11 at 16:57
  • Either way I do appreciate Bart's response and have marked it as the answer. I'm probably sticking to v3, I just wish I could know why such decisions were made to make more work for the user or, in some cases, completely cut the ability to do something altogether. – BluePlateSpecial Jan 10 '11 at 17:00
  • @BluePlateSpecial, you're welcome of course. Since you're interested in the "why", I recommend you post to the [ANTLR-interest mailing list](http://www.antlr.org/mailman/listinfo/antlr-interest) and ask your question there: Terence Parr frequently posts there (more than here, at least). – Bart Kiers Jan 10 '11 at 18:31
  • This is a joke! Case insensitivity is very easy to handle, this is a simple subtraction on characters! This is nothing compared to the amount of work required to test 's' | 'S'! First time and definitely last time I use antlr. – Julio Guerra May 30 '11 at 19:29
  • @Julio, your remark is a joke. What does your comment contribute? (hint: the answer is: nothing). There's a good reason why this was removed from v2 to v3. Yet you think you know best and spew your unfounded remarks. – Bart Kiers May 30 '11 at 19:39
  • @Julio, I can imagine this being just a small inconvenience, but to never use some tool or framework based on it, is just plain silly, IMO. Anyway, no one is going to shed a tear that you won't be using ANTLR again, I can assure you that. – Bart Kiers May 30 '11 at 20:22
  • Antlr with C target is a joke. It is really limited and absolutely not made for languages with memory management. Antlr is not mature enough for C or C++ compared to Bison. But yes, I am sure Antlr is perfect for garbage-collected languages. – Julio Guerra May 30 '11 at 21:27