After reading Chapter 10 of "The Definitive ANTLR 4 Reference", I tried to write a simple lexer that prints a token attribute from a lexer action, but the build fails with the error shown after the grammar. How can I get the lexical attributes?

lexer grammar TestLexer;

SPACE:                       [ \t\r\n]+ -> skip;

LINE:                        INT DOT [a-z]+ {System.out.println($INT.text);};
INT:                         [0-9]+;
DOT:                         '.';
[INFO] 
[INFO] --- antlr4-maven-plugin:4.9.2:antlr4 (antlr) @ parser ---
[INFO] ANTLR 4: Processing source directory /Users/Poison/IdeaProjects/parser/src/main/antlr4
[INFO] Processing grammar: me.tianshuang.parser/TestLexer.g4
[ERROR] error(128): me.tianshuang.parser/TestLexer.g4:5:65: attribute references not allowed in lexer actions: $INT.text
[ERROR] /Users/Poison/IdeaProjects/parser/me.tianshuang.parser/TestLexer.g4 [5:65]: attribute references not allowed in lexer actions: $INT.text

ANTLR4 version: 4.9.2.

Reference:
antlr4/actions.md at master · antlr/antlr4 · GitHub
How to get the token attributes in Antlr-4 lexer rule's action · Issue #1946 · antlr/antlr4 · GitHub

Poison

2 Answers

How can I get the lexical attributes?

You can't: labels are simply not supported in lexer rules. You might say, "well, but I'm not using any labels!". But the following:

INT DOT [a-z]+ {System.out.println($INT.text);}

is just a shorthand notation for:

some_var_name=INT DOT [a-z]+ {System.out.println($some_var_name.text);}

where some_var_name is called a label.

If you remove the embedded code (the stuff between { and }), add a label before INT and then generate a lexer, you'll see the following warning being printed to stderr:

labels in lexer rules are not supported in ANTLR 4; actions cannot reference elements of lexical rules but you can use getText() to get the entire text matched for the rule

The last part means that you can grab the entire text of the lexer rule like this:

LINE
 : INT DOT [a-z]+ {System.out.println(getText());}
 ;

But grabbing text from individual parts of a lexer rule is not possible.
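If you still need the individual parts, one workaround is to split the string returned by getText() yourself in the target language. The following is only a minimal sketch, assuming the Java target and the LINE rule above (digits, one dot, then lowercase letters); LineParts, intPart and wordPart are hypothetical names, not part of ANTLR:

```java
// Hypothetical helper (not part of ANTLR): splits the whole text matched by
// the LINE rule (INT DOT [a-z]+) back into its two parts.
final class LineParts {
    private LineParts() {}

    // Returns the digits before the first '.', e.g. "123" for "123.abc".
    static String intPart(String lineText) {
        return lineText.substring(0, dotIndex(lineText));
    }

    // Returns the letters after the first '.', e.g. "abc" for "123.abc".
    static String wordPart(String lineText) {
        return lineText.substring(dotIndex(lineText) + 1);
    }

    // Finds the separating dot; rejects text that LINE could not have matched.
    private static int dotIndex(String lineText) {
        int dot = lineText.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("no '.' in: " + lineText);
        }
        return dot;
    }
}
```

With that class visible to the generated lexer (for example in the same package), the action could be written as {System.out.println(LineParts.intPart(getText()));}. This only works because INT can never contain a dot, so the first dot in the matched text is always the DOT token.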

Bart Kiers

Try to keep the lexer's job (tokenizing) separate from output concerns: that separation is one of the main differences between ANTLR and Bison/Flex. For example, you can do the printing in a visitor or listener, as described in the other chapters of the book.

rikyeah