
This should be fairly simple. I'm working on a lexer grammar using ANTLR and want to limit variable identifiers to a maximum of 32 characters. I tried to do this with the following line (using normal regex syntax):

ID : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'){0,31};

No errors in code generation, but compilation failed due to a line in the generated code that was simply:

0,31

Apparently ANTLR is taking the text between the braces and placing it in the accept state area along with the print line. I searched the ANTLR site and found no example of, or reference to, an equivalent expression. What should the syntax of this expression be?

Mahdi Javaheri

1 Answer


ANTLR4 does not support the range quantifier syntax {a,b}. In a grammar, braces delimit embedded actions, which is why the 0,31 was copied verbatim into the generated code. Beyond that, I'm not sure the lexer is the right place for this constraint. A lexer rule decides which token a piece of input is recognized as. If ID were limited to 32 characters, a longer identifier would not be reported as too long; it would simply fail to match as a single ID token. The input could then be recognized as some other token, which would probably lead to a confusing failure in the parsing phase.

A solution is to leave the length constraint out of the grammar and enforce it in a Java ANTLR4 listener or visitor instead, throwing an exception or reporting an error when an identifier is longer than 32 characters.
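As a rough sketch of that idea, the length check itself can live in a small plain-Java helper that a listener or visitor calls for each identifier it encounters. The class and method names below (IdentifierLengthCheck, check, errors) are hypothetical and not part of ANTLR; only the 32-character limit comes from the question.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the names here are illustrative, not an ANTLR API.
final class IdentifierLengthCheck {
    // Maximum identifier length, taken from the question.
    static final int MAX_ID_LENGTH = 32;

    // Collected error messages, one per over-long identifier.
    private final List<String> errors = new ArrayList<>();

    // Intended to be called with the text and line number of each ID token,
    // for example from a listener's exit method or a visitor's visit method.
    void check(String identifier, int line) {
        if (identifier.length() > MAX_ID_LENGTH) {
            errors.add("line " + line + ": identifier '" + identifier
                    + "' is " + identifier.length()
                    + " characters long (maximum is " + MAX_ID_LENGTH + ")");
        }
    }

    // Returns the errors recorded so far, in the order they were found.
    List<String> errors() {
        return errors;
    }
}
```

Inside a generated listener, and assuming the parser rule in question references ID directly, the call would look something like check(ctx.ID().getText(), ctx.ID().getSymbol().getLine()). Collecting messages rather than throwing on the first violation lets all over-long identifiers be reported in one pass; throwing an exception instead is an equally valid choice.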

EDIT: This question has already been answered here: Range quantifier syntax in ANTLR Regex

Vincent Aranega