How to append two lexer expressions - ANTLR4

Question

I need the lexer to parse two different character expressions as one expression.

I have something like this:

rootPath : 'A' rootType SEP childPath; //my output should be AB:2 or AC:4

childPath : RESERVED_NUMBERS;

rootType : ONE_LETTER;

SEP : ':';

RESERVED_NUMBERS : [1-9];

ONE_LETTER : [A-Z];

I'm getting an error when I parse this. How can I combine 'A' and ONE_LETTER into a single string?

Mike Cargal · Accepted Answer · 2022-02-28T21:12:20.217

Based upon your comments, it appears that you want to keep the two letters of your root level and sub level as separate tokens, but have a "conflict" (in your rootPath parser rule) in that your 'A' token literal and your ONE_LETTER rule both match the "A" character. If I have this right, you're not really "appending Lexer expressions".

It's important to recognize that the 'A' in your grammar is just a syntactic shortcut for defining a Lexer rule (ANTLR will create it with a name something like T__0), so it's just another Lexer rule.

It's also important to understand that your stream of input characters is used to create a stream of Tokens to be used by the parser. There is nothing that a parser rule can do to control whether "A" matches the T__0 ('A') rule or the ONE_LETTER rule. That decision is made by the Tokenizer, and it has to pick one just by looking at the stream of input characters.
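
To make the point above concrete, here is a toy tokenizer in plain Java. It is only a sketch and not ANTLR's implementation: the class name ToyLexerDemo is made up, and it assumes the usual tie-breaking (longest match wins, ties go to the earliest rule, with the implicit rule for the literal 'A' listed ahead of the named lexer rules).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy stand-in for a lexer; this is NOT ANTLR's implementation.
class ToyLexerDemo {
    // Rules in priority order. Assumption: the implicit token for the
    // literal 'A' (T__0) comes before the explicitly named lexer rules.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("T__0", Pattern.compile("A"));
        RULES.put("SEP", Pattern.compile(":"));
        RULES.put("RESERVED_NUMBERS", Pattern.compile("[1-9]"));
        RULES.put("ONE_LETTER", Pattern.compile("[A-Z]"));
    }

    // Returns the token type names for the input. At each position the
    // longest match wins; ties go to the earliest rule.
    static List<String> tokenize(String input) {
        List<String> types = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestType = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Only a strictly longer match replaces the current best,
                // so an earlier rule keeps a tie.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestType = rule.getKey();
                }
            }
            if (bestType == null) {
                throw new IllegalArgumentException("No rule matches at index " + pos);
            }
            types.add(bestType);
            pos += bestLen;
        }
        return types;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("AB:2")); // [T__0, ONE_LETTER, SEP, RESERVED_NUMBERS]
        System.out.println(tokenize("AA:2")); // [T__0, T__0, SEP, RESERVED_NUMBERS]
    }
}
```

Note that the tokenizer never consults a parser rule. Under these assumptions, every "A" becomes T__0, so an input such as AA:2 can never supply a ONE_LETTER token for rootType.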

With that in mind, you should probably not try to fight the Lexer, but allow both characters to be recognized as ONE_LETTER tokens, and add a semantic predicate to your rootPath rule:

rootPath
    : rootLevel = ONE_LETTER {$rootLevel.text.equals("A")}? // Java target: equals(), since == compares references
      subLevel = ONE_LETTER SEP childPath
    ; //my output should be AB:2 or AC:4

Now the rootPath rule will only match if the rootLevel ONE_LETTER token is an "A", and you will have rootLevel and subLevel fields in your RootPathContext class.
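
As a rough illustration of what that rule gives you, the following plain-Java sketch imitates it without the ANTLR runtime: it accepts two letters, a colon and a digit from 1 to 9, applies the "root level must be A" check, and keeps the sub level as its own field. The names RootPathSketch, RootPath and parse are hypothetical and are not part of any generated ANTLR class.

```java
import java.util.Optional;

// Plain-Java imitation of the predicate-guarded rule; no ANTLR classes are used.
class RootPathSketch {
    // Hypothetical holder mirroring the rootLevel and subLevel labels.
    static final class RootPath {
        final String rootLevel;
        final String subLevel;
        final int section;

        RootPath(String rootLevel, String subLevel, int section) {
            this.rootLevel = rootLevel;
            this.subLevel = subLevel;
            this.section = section;
        }
    }

    static boolean isOneLetter(char c) {
        return c >= 'A' && c <= 'Z';
    }

    // Returns the parsed parts, or empty if the shape or the check fails.
    static Optional<RootPath> parse(String input) {
        // Shape: ONE_LETTER ONE_LETTER SEP RESERVED_NUMBERS
        if (input == null || input.length() != 4) {
            return Optional.empty();
        }
        char root = input.charAt(0);
        char sub = input.charAt(1);
        char sep = input.charAt(2);
        char num = input.charAt(3);
        if (!isOneLetter(root) || !isOneLetter(sub) || sep != ':' || num < '1' || num > '9') {
            return Optional.empty();
        }
        // Counterpart of the semantic predicate. Java strings are compared
        // with equals(), because == compares references.
        String rootLevel = String.valueOf(root);
        if (!rootLevel.equals("A")) {
            return Optional.empty();
        }
        return Optional.of(new RootPath(rootLevel, String.valueOf(sub), num - '0'));
    }

    public static void main(String[] args) {
        RootPath p = parse("AB:2").get();
        System.out.println(p.rootLevel + " / " + p.subLevel + " / " + p.section); // A / B / 2
        System.out.println(parse("BB:2").isPresent()); // false
    }
}
```

With a real generated parser you would read the same two values from the rootLevel and subLevel fields of the context object instead of from this hand-written holder.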

I tried it and it worked. But my goal is to parse rootType ([A-Z]) separately to maintain the business hierarchy. So I need to use 'A' rootType SEP, and inside my Java code I'll store rootType in a variable. — harshini reddy, Feb 28 '22 at 19:53
So my business expectation is: A is the root level, followed by the sub-levels [A-Z] in it and a section number. Finally, AB:1. I need to parse and store the sub-level separately. — harshini reddy, Feb 28 '22 at 19:55
Thank you Mike, it would be helpful if you could tell me a way to concatenate two lexer rules, so that they parse as one. — harshini reddy, Feb 28 '22 at 20:04
