I'm using the ANTLR v4 Java grammar, available here, to parse Java code. One of the productions looks like this:

expression
    :   primary
    |   expression '.' Identifier
    |   expression '.' 'this'
    |   expression '.' 'new' nonWildcardTypeArguments? innerCreator
    |   expression '.' 'super' superSuffix
    |   expression '.' explicitGenericInvocation
    |   expression '[' expression ']'
    |   expression arguments
    |   // Lots of other patterns...
    ;

expression '.' Identifier matches a simple member access, and expression arguments matches a method call. You can view the full source of this production here.

For syntax highlighting, I want to add extra, redundant patterns that detect what I call named method invocations. Both bar() and foo.bar() count as named method invocations, with bar being the name of the method. In such expressions I want bar to be colored green, even though identifiers are normally colored white. In foo.bar or foo.bar[0](), however, nothing should be colored green: in the former, bar is not being called as a method, and in the latter, the thing being called is bar[0], which is not a valid identifier.
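To make the intended classification concrete, here is a minimal Java sketch of the rule itself. It uses plain token lookahead rather than the grammar change discussed in this question, and the class and method names (NamedInvocationSketch, namedInvocations) are hypothetical. It assumes simple expression snippets only and does not handle keywords, literals, or comments.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of the ANTLR grammar or runtime.
class NamedInvocationSketch {
    // A token is either an identifier or any single non-whitespace character.
    private static final Pattern TOKEN =
        Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*|\\S");

    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    static boolean isIdentifier(String token) {
        return Character.isJavaIdentifierStart(token.charAt(0));
    }

    // An identifier counts as a named method invocation only when the
    // very next token is an opening parenthesis.
    static List<String> namedInvocations(String source) {
        List<String> tokens = tokenize(source);
        List<String> names = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            if (isIdentifier(tokens.get(i)) && tokens.get(i + 1).equals("(")) {
                names.add(tokens.get(i));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        String[] samples = {"bar()", "foo.bar()", "foo.bar", "foo.bar[0]()"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + namedInvocations(sample));
        }
    }
}
```

With the four examples above, only bar() and foo.bar() yield bar; foo.bar and foo.bar[0]() yield nothing, which matches the coloring I am after.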

I added these two extra patterns before expression arguments (note: arguments is synonymous with '(' expressionList? ')' in the original source code):

expression
    :   // ...
    |   expression '[' expression ']'
    |   Identifier arguments namedMethodInvocationStub // Detect bar()
    |   expression '.' Identifier arguments namedMethodInvocationStub // Detect (some().complicated().expression()).bar()
    |   expression arguments
    |   // ...
    ;

namedMethodInvocationStub
    :
    ;

(Here, namedMethodInvocationStub is an extra dummy production I've added. The idea is that I can override VisitExpression and check whether the last child is a namedMethodInvocationStub. If so, we've matched a named method invocation, so I go through all direct children of type Identifier and color them green. This is only to explain what the stub is for; it's not directly related to my question below.)
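The stub check described above could look roughly like the following Java sketch. It uses a small stand-in tree model, not the ANTLR runtime or the generated parser classes, so every name here (StubCheckSketch, Node, greenIdentifiers, expectedParse, reportedParse) is hypothetical. The two hand-built trees model foo.bar() as I expected it to parse and as it is actually reported to parse.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in parse-tree model for illustration only. These classes are
// hypothetical and are NOT the ANTLR runtime or generated-parser API.
class StubCheckSketch {
    static final class Node {
        final String kind;   // rule or token name, e.g. "expression" or "Identifier"
        final String text;   // token text; empty for rule nodes
        final List<Node> children = new ArrayList<>();

        Node(String kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    static Node leaf(String kind, String text) {
        return new Node(kind, text);
    }

    static Node rule(String kind, Node... children) {
        Node node = new Node(kind, "");
        for (Node child : children) {
            node.children.add(child);
        }
        return node;
    }

    // If the last child is the stub, every direct Identifier child is green.
    static List<String> greenIdentifiers(Node expression) {
        List<String> green = new ArrayList<>();
        if (expression.children.isEmpty()) {
            return green;
        }
        Node last = expression.children.get(expression.children.size() - 1);
        if (!last.kind.equals("namedMethodInvocationStub")) {
            return green;
        }
        for (Node child : expression.children) {
            if (child.kind.equals("Identifier")) {
                green.add(child.text);
            }
        }
        return green;
    }

    // foo.bar() as: expression '.' Identifier arguments namedMethodInvocationStub
    static Node expectedParse() {
        return rule("expression",
            rule("expression", rule("primary", leaf("Identifier", "foo"))),
            leaf("'.'", "."),
            leaf("Identifier", "bar"),
            rule("arguments", leaf("'('", "("), leaf("')'", ")")),
            rule("namedMethodInvocationStub"));
    }

    // foo.bar() as: (expression '.' Identifier) arguments
    static Node reportedParse() {
        return rule("expression",
            rule("expression",
                rule("expression", rule("primary", leaf("Identifier", "foo"))),
                leaf("'.'", "."),
                leaf("Identifier", "bar")),
            rule("arguments", leaf("'('", "("), leaf("')'", ")")));
    }

    public static void main(String[] args) {
        System.out.println("expected: " + greenIdentifiers(expectedParse()));
        System.out.println("reported: " + greenIdentifiers(reportedParse()));
    }
}
```

Only direct children are inspected, so foo, which sits inside the nested expression, is never colored. With the reported parse shape the top-level node ends in arguments rather than the stub, so nothing is colored at all.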

I expected this rule change to make foo.bar(), which had previously parsed as (expression '.' Identifier) arguments, now parse as expression '.' Identifier arguments namedMethodInvocationStub. However, it still parses the same way as before, whether or not I remove the namedMethodInvocationStub. Why is this?

James Ko

1 Answer

I believe you can't match an empty rule or token (such as namedMethodInvocationStub) in ANTLR (or in any other lexer). See:

ANTLR: empty condition not working

What is the equivalent for epsilon in ANTLR BNF grammar notation?

Do you see any errors or warnings during the ANTLR code generation phase?

Dmitry Zvorygin
  • It seems to be happening with or without `namedMethodInvocationStub`, though. – James Ko Jun 13 '17 at 19:21
  • I couldn't find the pattern "expression arguments" at the location you linked: https://github.com/antlr/grammars-v4/blob/master/java/Java.g4#L497-L538 – Dmitry Zvorygin Jun 13 '17 at 19:24
  • Sorry, I tweaked the code a little bit to make it neater. `arguments` is another nonterminal ([here](https://github.com/antlr/grammars-v4/blob/master/java/Java.g4#L604-L606)) synonymous with `'(' expressionList? ')'`. – James Ko Jun 13 '17 at 19:27
  • @JamesKo, you can try adding a "marker" label, #methodInvocation, to the alternative "| expression '(' expressionList? ')'" so that your visitor code can handle it in a different manner. – Dmitry Zvorygin Jun 13 '17 at 19:27
  • What does `#methodInvocation` do? I have not seen that syntax before. – James Ko Jun 13 '17 at 19:36
  • OK thanks, I just found [this answer](https://stackoverflow.com/questions/22002799/antlr4-how-to-know-which-alternative-is-chosen-given-a-context) which explains what pound signs do. However, these patterns are still not being matched-- all of the method calls are still parsing as `expression arguments`. Can you tell me why? – James Ko Jun 13 '17 at 19:40