ANTLR4 How can I create a regular expression that allows all characters except two selected ones?

Question

For example, I have this code in my .g4 file:

a: [A-Z][A-Z];
b: [a-z]'3';

Now I want to add one more line that recognizes all characters that do not belong to a or b.

I tried:

a: [A-Z][A-Z];
b: [a-z]'3';
ALLOTHERCHARACTERS: ~[a]|~[b]

But it didn't work.

For example, the input 84209ddjio29 should now be matched by ALLOTHERCHARACTERS, but it wasn't.

(The lexer ultimately generates a Java file, but I don't think that matters for this task.)

In your example, are you trying to write a regex that matches any character except 'a' and 'b'? – Dan, May 02 '22 at 15:47
No, any character except [A-Z][A-Z] and [a-z]'3'. For example, ALLOTHERCHARACTERS should not catch z3, AA, BB, or b3, but it should catch, for example, AAAjidej29, b7, or u9. – gekomorio, May 02 '22 at 15:50
Would it not be easier to just search for the two excluded characters? – cliff2310, May 02 '22 at 16:20

score 1 · Answer 1 · answered May 02 '22 at 20:41

There are many things going wrong here: inside parser rules, you cannot use character sets. So a: [A-Z][A-Z]; is not possible. Only a lexer rule can use character sets, so A: [A-Z][A-Z]; is valid.

So, to define a valid (lexer) grammar, you'd need to do this:

A : [A-Z] [A-Z];
B : [a-z] '3';

Now for your second problem: how to negate rules A and B? Answer: you cannot. You can only negate single characters. So negating A : [A-Z]; would be NA: ~[A-Z]; (or NA : ~A; is also valid). But you cannot negate a rule that matches 2 characters like A : [A-Z] [A-Z];.

If you want a rule that matches anything other than upper case letters, lower case letters and the digit 3, then you can do this:

ALLOTHERCHARACTERS : ~[A-Za-z3];
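
A common alternative, which is not part of the answer above, is a catch-all rule instead of a negated set. It relies on the ANTLR lexer preferring the longest match and, on a tie, the rule defined first. The sketch below assumes a hypothetical grammar name `Sketch` and a made-up rule name `OTHER`:

```antlr
// Sketch.g4 (hypothetical file and grammar name)
lexer grammar Sketch;

// Two upper-case letters, e.g. AA or BB
A : [A-Z] [A-Z];

// One lower-case letter followed by the digit 3, e.g. z3 or b3
B : [a-z] '3';

// Catch-all: any single character that neither A nor B consumed.
// Placed last by convention: the lexer prefers the longest match
// and, on a tie, the rule defined first.
OTHER : . ;
```

With this sketch, the input AAb7 should be split into an A token for AA followed by one OTHER token each for b and 7. Note that OTHER produces one token per character, so an input like AAAjidej29 would not come out as a single token.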

score -2 · Answer 2 · answered May 02 '22 at 17:15

This is the proper syntax for "anything except":

[^ab]

so that will match any character that is not a or b.

Answered by ejkeep

No, `[^ab]` is not valid ANTLR syntax – Bart Kiers May 02 '22 at 20:34
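
In ANTLR lexer rules a negated set is written with `~` in front of the brackets. A minimal sketch of the rule this answer seems to be aiming for, with a made-up rule name `NOT_A_OR_B`, might look like this:

```antlr
// Any single character other than 'a' or 'b'
NOT_A_OR_B : ~[ab];
```

This only excludes the literal characters a and b. It does not exclude the text matched by other lexer rules.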
