I am a bit perplexed about how to capture a quoted string in ANTLR4.

Currently, this lexer rule is not tokenizing the way I expect.

The sample string is "=\"". I've tried lots of different ways to capture this, but I am at a loss about what I am doing incorrectly. I'd really appreciate some insights on best practices for this. Thank you so much!

ESCAPED_QUOTE : '\"';
QUOTED_STRING :   '"' ( ESCAPED_QUOTE | ~('\n'|'\r') )*? '"';
Sam Harwell
Steve H.
  • I came up with this method. It seems to work, but I'm wondering if this is the best way to go about it. Thank you! ESCAPED_QUOTE : '\"'; QUOTE : '"'; QUOTED_STRING : QUOTE ( ESCAPED_QUOTE | ~('\n'| '\r' | '\"') )* QUOTE; – Steve H. Oct 08 '13 at 00:24

1 Answer

There are two problems with the above rules.

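  1. You didn't actually escape your quote like you thought. In an ANTLR literal, '\"' matches just the quote character; to match a backslash followed by a quote, you meant to use '\\"'.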
  2. Your ESCAPED_QUOTE rule doesn't form a token all by itself, so it should be a fragment rule.

The result of these two changes would be the following:

fragment ESCAPED_QUOTE : '\\"';
QUOTED_STRING :   '"' ( ESCAPED_QUOTE | ~('\n'|'\r') )*? '"';
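
As a rough illustration of the first point, the two versions of the rule can be approximated with Python regular expressions. This is only an analogy: the patterns, the `first_token` helper, and the reading of the sample as the five characters `"=\""` are assumptions made for this sketch, and ANTLR's lexer does not work like a backtracking regex engine. It shows the original rule stopping at the escaped quote, while the corrected one consumes the whole string.

```python
import re

# Regex stand-ins for the lexer rules (an analogy, not ANTLR itself).
# Original: ESCAPED_QUOTE : '\"' matches only a double quote character.
ORIGINAL = re.compile(r'"(?:"|[^\n\r])*?"')
# Corrected: ESCAPED_QUOTE : '\\"' matches a backslash followed by a quote.
CORRECTED = re.compile(r'"(?:\\"|[^\n\r])*?"')

# Assumed sample input: the five characters  " = \ " "
SAMPLE = '"=\\""'


def first_token(pattern, text):
    """Return the text matched at the start of the input, or None."""
    match = pattern.match(text)
    return match.group(0) if match else None


original_token = first_token(ORIGINAL, SAMPLE)
corrected_token = first_token(CORRECTED, SAMPLE)

print(repr(original_token))   # stops early, leaving a stray quote behind
print(repr(corrected_token))  # covers the whole sample
```

In both patterns the escaped-quote alternative is listed first, mirroring the order of the alternatives in the grammar rule.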
Sam Harwell
  • I copied the rule to my grammar and tested. It gives me an error when there are special characters such as ':', '!', ''', etc. That ~('\n'|'\r') regex is supposed to accept anything other than a newline character? – yoshi Jun 14 '17 at 03:07