I'm trying to create a parser for QuickBasic, and this is my attempt to match comments:

grammar QuickBasic;

options 
{
    language = 'CSharp2';
    output = AST;
}

tokens
{
    COMMENT;
}

parse
    :    .* EOF
    ;

// DOESN'T WORK
Comment
    :    R E M t=~('\n')* { Text = $t; } -> ^(COMMENT $t)
    |    Quote t=~('\n')* { Text = $t; } -> ^(COMMENT $t)
    ;

Space  
    :    (' ' | '\t' | '\r' | '\n' | '\u000C') { Skip(); }  
    ;

fragment Quote : '\'';    
fragment E     : 'E' | 'e';
fragment M     : 'M' | 'm';
fragment R     : 'R' | 'r';

Even if the rewrite produces only the COMMENT token and nothing more, I still get the same error.

// DOESN'T WORK EITHER
Comment
    :    (R E M | Quote) ~('\n')* -> ^(COMMENT)
    ;

If I drop the rewrite altogether, it works:

// THIS WORKS
Comment
    :    (R E M | Quote) ~('\n')*
    ;
Jonathas Costa

1 Answer

Rewrite rules only work in parser rules, not in lexer rules. Also, t=~('\n')* would not have worked anyway: the label is reassigned on every iteration of the loop, so t ends up holding only the last non-line-break character.

But why not skip these Comment tokens altogether? If you leave them in the token stream, you'd need to account for Comment tokens in every parser rule where a comment may occur: not something you'd want, right?

To skip them, simply call Skip() at the end of the rule:

Comment
    :    R E M ~('\r' | '\n')* { Skip(); }
    |    Quote ~('\r' | '\n')* { Skip(); }
    ;

or, more concisely:

Comment
    :    (Quote | R E M) ~('\r' | '\n')* { Skip(); }
    ;

However, if you're really keen on leaving Comment tokens in the stream and stripping either "rem" or the quote from the comment text, do it like this:

Comment
    :    (Quote | R E M) t=NonLineBreaks { Text = $t.text; }
    ;

fragment NonLineBreaks : ~('\r' | '\n')+;

You could then also add a parser rule that builds an AST with COMMENT as the root (although I don't see the benefit over simply using the Comment token itself):

comment
    :    Comment -> ^(COMMENT Comment)
    ;
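
For reference, here is a rough sketch of how these pieces might fit together in one grammar. It reuses the options, tokens, Space rule and fragments from the question; the parse rule that accepts only comments is an assumption made purely for illustration, since a real QuickBasic grammar would reference comment from its statement rules instead. Note also that, because NonLineBreaks uses +, a bare quote or REM with nothing after it would not match in this sketch.

grammar QuickBasic;

options
{
    language = 'CSharp2';
    output = AST;
}

tokens
{
    COMMENT;
}

// Assumption for illustration: the input consists of comments only.
parse
    :    comment* EOF
    ;

// The rewrite lives in a parser rule, where rewrite rules are allowed.
comment
    :    Comment -> ^(COMMENT Comment)
    ;

// The lexer rule only sets the token text; it contains no rewrite.
Comment
    :    (Quote | R E M) t=NonLineBreaks { Text = $t.text; }
    ;

Space
    :    (' ' | '\t' | '\r' | '\n' | '\u000C') { Skip(); }
    ;

fragment NonLineBreaks : ~('\r' | '\n')+;
fragment Quote : '\'';
fragment E     : 'E' | 'e';
fragment M     : 'M' | 'm';
fragment R     : 'R' | 'r';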
Bart Kiers