2

I have some non-reserved keywords I'm matching with rules like:

kFOO = {self.input.LT(1).text.lower() == 'foo'}? ID;

Where the ID token is a standard alpha-numeric string. These kinds of rules work great, except when I try to do something like this:

some_rule
@after { do_something_with($t.text) }
  : t=kWORD1
  | t=kWORD2
  | t=kWORD3
  ;

In the generated parser, the kWORD1 and kWORD2 rule functions don't return anything, but the kWORD3 function does. As well, in the some_rule function, only the block trying to match kWORD3 assign the return value to t. The other two invocations don't reference t in any way.

(Also, I expected the following to work, but it did not, I suspect for the same underlying reason.

some_rule
@after { do_something_with($t.text) }
  : t=( kWORD1
  | kWORD2
  | kWORD3)
  ; 

Nothing gets assigned to t under any conditions.)

However, the following DOES work as expected:

some_rule
@after { do_something_with($t1.text or $t2.text or $t3.text) }
  : t1=kWORD1
  | t2=kWORD2
  | t3=kWORD3
  ;

Each of the matching functions is generated to return a value, and each of the blocks matching the keyword rules in some_rule assigns the return value to their label. The problem with this solution is it gets a little excessive when there are several more alternatives.

Half of me cries "BUG!" but this is the first antlr project I've done, so more likely there's something I don't understand.

What's the best way to do what it is I'm trying to do here?

Glenn Moss
  • 6,812
  • 6
  • 29
  • 23

1 Answers1

1

Besides the fact that .toLower() never matches FOO, I believe this is due to the fact that kWORD1 is another type of kWORD2 etc. but something that might work is this:

kWORD [String pre]
: {self.input.LT(1).text.lower() == $pre}? ID;

some_rule
@after { do_something_with($t.text) }
  : t=kWORD['word1']
  | t=kWORD['word2']
  | t=kWORD['word3']
  ;

untested though.

stryba
  • 1,979
  • 13
  • 19
  • This is a good approach to keywords in general. I have MANY lines just like kFOO in my grammar. I could replace it all with this. Thanks! – Glenn Moss Jan 31 '12 at 21:30
  • I'd also prefer ` kWORD [String pre] : ID {$ID. text == $pre}? ;` over ` kWORD [String pre] : {self.input.LT(1).text.lower() == $pre}? ID;` for readability but am not sure performancewise though. – stryba Jan 31 '12 at 22:57
  • Well, I want it to be case insensitive. But are you specifically talking about putting the ID token before or after the conditional? – Glenn Moss Mar 23 '12 at 17:30
  • You can still do $ID.text.equalsCaseIn... ($pre) – stryba Mar 24 '12 at 21:11
  • Turns out you can't do a parameterized rule like this because the body of the condition gets hoisted out to the predictive part of the parser, and the parameter is not in scope. Too bad it can't hoist it up as a function, or replace the $pre token with the literal passed as the rule argument. Oh well... – Glenn Moss Mar 28 '12 at 19:49
  • for me this works: `kWORD [String pre] : {true}? ID {$ID.text == $pre}?;` – stryba Mar 30 '12 at 09:52