
I'm trying to implement a lexer rule for Oracle's q-quoted string literal, which looks like q'$some string$'

Any character can take the place of $ except whitespace, (, {, [, and <, but the string must start and end with the same character. Examples of accepted tokens: q'!some string!' and q'ssome strings'. In the second example s is the custom delimiter, but it may also appear inside the string, because the literal ends only at s'
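To make the rule above concrete, here is a minimal plain-Java sketch that checks whether a string is exactly one q-quoted literal. It is not ANTLR code; the class and method names (QQuoteCheck, isValidDelimiter, isQQuoted) are made up for illustration, and the set of rejected delimiters is an assumption copied from the character class in the lexer rule further down (which also excludes both quote characters).

```java
class QQuoteCheck {
    // Assumption: delimiters rejected by the character class in the question's rule.
    private static final String FORBIDDEN = " ({[<'\"\t\n\r";

    static boolean isValidDelimiter(char c) {
        return FORBIDDEN.indexOf(c) < 0;
    }

    /** Returns true if token is exactly one q-quoted literal. */
    static boolean isQQuoted(String token) {
        // The shortest form is q'XX' (empty body), which is five characters.
        if (token.length() < 5 || !token.startsWith("q'")) {
            return false;
        }
        char delim = token.charAt(2);
        if (!isValidDelimiter(delim)) {
            return false;
        }
        String close = delim + "'";
        // The first delimiter-plus-quote pair after the opening delimiter
        // must be the last two characters of the token.
        return token.indexOf(close, 3) == token.length() - 2;
    }

    public static void main(String[] args) {
        System.out.println(isQQuoted("q'!some string!'")); // true
        System.out.println(isQQuoted("q'ssome strings'")); // true
        System.out.println(isQQuoted("q'(abc('"));         // false: ( is not allowed
    }
}
```

The second call shows the point about s: the inner s characters are skipped because none of them is followed by a quote.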

Here's how I was trying to implement the rule:

Q_QUOTED_LITERAL: Q_QUOTED_LITERAL_NON_TERMINATED . QUOTE-> type(QUOTED_LITERAL); 

Q_QUOTED_LITERAL_NON_TERMINATED:
    Q QUOTE ~[ ({[<'"\t\n\r] { setDelimChar( (char)_input.LA(-1) ); } 
    ( . { !isValidEndDelimChar() }? )* 
;

I have checked the value returned by !isValidEndDelimChar(), and the predicate does become false at the right place, so everything should work, but ANTLR simply ignores it. I've also tried moving the predicate around, putting that part in a separate rule, and a number of other things. After a day and a half of research I'm finally asking here.

I have also tried other approaches, but there doesn't seem to be a way to implement a string with a custom delimiter character in ANTLR 4 (the ANTLR 3 version used to work).

user3770822

1 Answer


Not sure why the { ... } action isn't invoked, but it's not needed. The following grammar worked for me; note that the predicate goes in front of the . wildcard:

grammar Test;

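@lexer::members {
  // True when the next character equals the delimiter (index 2 of the text
  // matched so far: q, ', delimiter) and the character after it is a quote.
  boolean isValidEndDelimChar() {
    return (_input.LA(1) == getText().charAt(2)) && (_input.LA(2) == '\'');
  }
}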

parse
 : .*? EOF
 ;

Q_QUOTED_LITERAL
 : 'q\'' ~[ ({[<'"\t\n\r] ( {!isValidEndDelimChar()}? . )* . '\''
 ;

SPACE
 : [ \t\f\r\n] -> skip
 ;

If you run the class:

import org.antlr.v4.runtime.*;

public class Main {

  public static void main(String[] args) {

    Lexer lexer = new TestLexer(CharStreams.fromString("q'ssome strings' q'!foo!'"));
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    tokens.fill();

    for (Token t : tokens.getTokens()) {
      System.out.printf("%-20s %s\n", TestLexer.VOCABULARY.getSymbolicName(t.getType()), t.getText());
    }
  }
}

the following output will be printed:

Q_QUOTED_LITERAL     q'ssome strings'
Q_QUOTED_LITERAL     q'!foo!'
EOF                  <EOF>
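For readers who want to step through the end-of-literal check without generating a lexer, the following is a rough hand-written scan in plain Java. It is only a sketch of the intended behaviour (stop at the first delimiter that is followed by a quote), not a model of how ANTLR matches tokens; QQuoteScan, atEndDelimiter, scan and literals are hypothetical names, and skipping unknown characters one at a time is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

class QQuoteScan {
    /** Same test as the predicate: next char is the delimiter, then a quote. */
    static boolean atEndDelimiter(String input, int pos, char delim) {
        return pos + 1 < input.length()
            && input.charAt(pos) == delim
            && input.charAt(pos + 1) == '\'';
    }

    /** Returns the index just past the literal starting at start, or -1. */
    static int scan(String input, int start) {
        if (!input.startsWith("q'", start) || start + 2 >= input.length()) {
            return -1;
        }
        // Same position that getText().charAt(2) reads in the grammar.
        char delim = input.charAt(start + 2);
        if (" ({[<'\"\t\n\r".indexOf(delim) >= 0) {
            return -1;
        }
        int pos = start + 3;
        // Consume characters until the closing delimiter-plus-quote pair.
        while (pos < input.length() && !atEndDelimiter(input, pos, delim)) {
            pos++;
        }
        return pos < input.length() ? pos + 2 : -1;
    }

    /** Collects every literal in input, skipping anything else. */
    static List<String> literals(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            int end = scan(input, pos);
            if (end < 0) {
                pos++; // not a literal here, move on
                continue;
            }
            out.add(input.substring(pos, end));
            pos = end;
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [q'ssome strings', q'!foo!']
        System.out.println(literals("q'ssome strings' q'!foo!'"));
    }
}
```

Reading the delimiter from a fixed position in the matched text, as the grammar does with getText().charAt(2), means no separate state has to be set before the check runs.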
Bart Kiers
  • You're a life saver. Turns out replacing (_input.LA(1) == delimChar) with (_input.LA(1) == getText().charAt(2)) in isValidEndDelimChar() was all I needed. It's still weird that even though I was getting the same response from both implementations of the function, my old one couldn't get the closure to end. Thanks again! – user3770822 Apr 11 '18 at 19:34
  • Good to hear it @user3770822! – Bart Kiers Apr 11 '18 at 19:37