I defined some string tokens in ANTLR4 as shown below. The exceptions they throw are caught and handled in Java:
STRINGLIT: '"'('\\'[bfrnt\\"]|~[\n"EOF])*'"';
ILLEGAL_ESC: '"'(('\\'[bfrnt\\"]|~[\n\\"EOF]))*('\\'(~[bfrnt\\"]|EOF))
{if (true) throw new bkool.parser.IllegalEscape(getText());};
UNCLOSED_STRING: '"'('\\'[bfrnt\\"]|[\n\\"EOF])*
{if (true) throw new bkool.parser.UncloseString(getText());};
Then I tested the lexer with the following input:
"This is a string"
"String with legal escape \\"
"Legal \\n"
"Illegal \"
"Illegal \n"
No exceptions were thrown. Then I tried a second input, identical except that the first string has no closing quote:
"This is a string
"String with legal escape \\"
"Legal \\n"
"Illegal \"
"Illegal \n"
This time the output was:
Unclosed string: "
The exception handler prints the name of the exception followed by the offending string.
I have been struggling with this for a day and am now stuck. What is wrong with my ANTLR definitions?
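To see what the UNCLOSED_STRING rule can match on its own, one option is to transcribe it into a java.util.regex pattern. The sketch below is only an approximation under a few assumptions: it looks at that one rule in isolation (the real lexer picks the longest match among all rules), it reads EOF inside [...] as the three literal letters E, O and F, which is how ANTLR4 treats characters inside a set, and the class and method names are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UnclosedStringSketch {
    // Regex transcription of:  '"' ( '\\' [bfrnt\\"] | [\n\\"EOF] )*
    // Assumption: EOF inside [...] means the literal letters E, O, F.
    private static final Pattern UNCLOSED_STRING =
        Pattern.compile("\"(?:\\\\[bfrnt\\\\\"]|[\\n\\\\\"EOF])*");

    // Returns the prefix of the input matched by the transcribed rule,
    // or null if the input does not start with a double quote.
    static String matchPrefix(String input) {
        Matcher m = UNCLOSED_STRING.matcher(input);
        return m.lookingAt() ? m.group() : null;
    }

    public static void main(String[] args) {
        // First line of the second test input: no closing quote.
        String input = "\"This is a string\n";
        System.out.println("Unclosed string: " + matchPrefix(input));
    }
}
```

On that input the transcribed rule matches only the opening quote, which agrees with the output reported above. The second alternative in the rule is a plain set rather than a negated one, so ordinary characters such as T are not consumed.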