antlr4: how to make identifier case insensitive

Question

deficiency is a keyword in my DSL, I want to make keywords case insensitive.

I have read this doc and try. https://github.com/antlr/antlr4/blob/master/doc/case-insensitive-lexing.md

In my grammer, I have two basic rules: matching_rule_not_work and matching_rule_will_work. The first rule is which I want, but it did not work. The second rule worked, but this is not case insensitive.

so, how to make the first rule work? thanks

grammar test;

matching_rule_not_work: K_DEFICIENCY;

matching_rule_will_work: 'deficiency';

WS : [ \t\r\n]+ -> skip;


K_DEFICIENCY: D E F I C I E N C Y;

fragment A : [aA];
fragment B : [bB];
fragment C : [cC];
fragment D : [dD];
fragment E : [eE];
fragment F : [fF];
fragment G : [gG];
fragment H : [hH];
fragment I : [iI];
fragment J : [jJ];
fragment K : [kK];
fragment L : [lL];
fragment M : [mM];
fragment N : [nN];
fragment O : [oO];
fragment P : [pP];
fragment Q : [qQ];
fragment R : [rR];
fragment S : [sS];
fragment T : [tT];
fragment U : [uU];
fragment V : [vV];
fragment W : [wW];
fragment X : [xX];
fragment Y : [yY];
fragment Z : [zZ];

score 2 · Accepted Answer · answered Jul 29 '18 at 09:35

In your example you're using both K_DEFICIENCY and the literal 'deficiency'. Using a literal in a parser rule is the same as defining a token rule for that literal, which comes before all other token rules.

So you have two types of tokens: K_DEFICIENCY tokens and 'deficiency' tokens. When the lexer sees an input that could match both, the one that comes first is used, which would be 'deficiency' because, as I said, literals always come first.

If you remove the matching_rule_will_work: 'deficiency'; line from your grammar, it will work like you expect it to.

`When the lexer sees an input that could match both, the one that comes first is used` inspired by your words, I fixed my problem. Many thanks. mua~ — hellojinjie, Jul 29 '18 at 10:26

antlr4: how to make identifier case insensitive

1 Answers1