I want to replace a token using ANTLR.

I tried with TokenRewriteStream and replace, but it didn't work.

Any suggestions?

  ANTLRStringStream in = new ANTLRStringStream(source);
  MyLexer lexer = new MyLexer(in);
  TokenRewriteStream tokens = new TokenRewriteStream(lexer);
  for(Object obj : tokens.getTokens()) {
     CommonToken token = (CommonToken)obj;  
     tokens.replace(token, "replacement");
  }

The lexer finds all occurrences of single-line comments, and I want to replace them in the original source too.

EDIT:

This is the grammar:

grammar ANTLRTest;

options {
  language = Java;
}

@header {
  package main;
}

@lexer::header {
  package main;
}

rule: SingleLineComment+;

SingleLineComment
  :  '//' ~( '\r' | '\n' )* {setText("replacement");}
    ;

What I want to do is replace all single-line comments in, say, a file.

user1019710

1 Answer

Rewrite the text inside the lexer:

SingleLineComment
 : '//' ~('\r' | '\n')* {setText("replacement");}
 ;

EDIT

Okay, here's a quick demo of how you can filter certain tokens out of the input while keeping everything else:

SingleCommentStrip.g

grammar SingleCommentStrip;

parse returns [String str]
@init{StringBuilder builder = new StringBuilder();}
 : (t=. {builder.append($t.text);})* EOF {$str = builder.toString();}
 ;

SingleLineComment
 : '//' ~('\r' | '\n')* {skip();}
 ;

MultiLineComment
 : '/*' .* '*/'
 ;

StringLiteral
 : '"' ('\\' . | ~('"' | '\\' | '\r' | '\n'))* '"'
 ;

AnyOtherChar
 : .
 ;

Main.java

import org.antlr.runtime.*;

public class Main {
  public static void main(String[] args) throws Exception {
    SingleCommentStripLexer lexer = new SingleCommentStripLexer(new ANTLRFileStream("Test.java"));
    SingleCommentStripParser parser = new SingleCommentStripParser(new CommonTokenStream(lexer));
    String adjusted = parser.parse();
    System.out.println(adjusted);
  }
}

Test.java

// COMMENT
class Test {
  /*
  // don't remove
  */
  // COMMENT AS WELL
  String s = "/* don't // remove */ \" \\ me */ as well";
}

Now run the demo:

java -cp antlr-3.3.jar org.antlr.Tool SingleCommentStrip.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main

which will print:


class Test {
  /*
  // don't remove
  */

  String s = "/* don't // remove */ \" \\ me */ as well";
}

(i.e. the single-line comments are removed, while the block comment and the string literal are left untouched)
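
To make it clearer why the grammar also needs the MultiLineComment and StringLiteral rules, here is a rough plain-Java sketch of the same four alternatives, written by hand without ANTLR. The class name `CommentStripper` and its helper are made up for illustration, and this is not what ANTLR generates. It only mimics the rule order: a `//` that sits inside a block comment or a string is consumed by that larger token first, so it is never seen as a comment start.

```java
/** Hand-written stand-in for the four lexer rules above (hypothetical, not ANTLR output). */
final class CommentStripper {

    /** Returns src with every single-line comment removed; line breaks are kept. */
    static String strip(String src) {
        StringBuilder out = new StringBuilder();
        int n = src.length();
        int i = 0;
        while (i < n) {
            char c = src.charAt(i);
            if (src.startsWith("//", i)) {
                // SingleLineComment: skip up to, but not including, the line break.
                while (i < n && src.charAt(i) != '\r' && src.charAt(i) != '\n') {
                    i++;
                }
            } else if (src.startsWith("/*", i)) {
                // MultiLineComment: copy verbatim through the closing "*/".
                // Simplification: an unterminated comment is copied to the end of the input.
                int close = src.indexOf("*/", i + 2);
                int end = (close < 0) ? n : close + 2;
                out.append(src, i, end);
                i = end;
            } else {
                // StringLiteral if a complete one starts here, otherwise AnyOtherChar.
                int end = (c == '"') ? stringEnd(src, i) : -1;
                if (end > 0) {
                    out.append(src, i, end);
                    i = end;
                } else {
                    out.append(c);
                    i++;
                }
            }
        }
        return out.toString();
    }

    /** Index just past the closing quote of the literal opening at start, or -1 if there is none. */
    private static int stringEnd(String s, int start) {
        int j = start + 1;
        while (j < s.length()) {
            char c = s.charAt(j);
            if (c == '"') {
                return j + 1;
            }
            if (c == '\r' || c == '\n') {
                return -1;
            }
            // A backslash escapes the next character, whatever it is.
            j += (c == '\\') ? 2 : 1;
        }
        return -1;
    }
}
```

Calling `CommentStripper.strip(source)` on the contents of Test.java should leave the block comment and the string literal alone, just like the grammar does. If you removed the string and block-comment branches, the `//` inside them would be stripped as well, which is exactly what the extra lexer rules prevent.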

Bart Kiers
  • Thank you. I am new to ANTLR and didn't know about this. I'd also be grateful if you could point me to a good tutorial about ANTLR. – user1019710 Mar 27 '12 at 05:26
  • @user1019710, you're welcome. There's quite an extensive list of tutorials/resources w.r.t. ANTLR here on SO: http://stackoverflow.com/questions/278480/antlr-tutorials – Bart Kiers Mar 27 '12 at 07:08
  • I tried the code above, but I want the replacements to be made in the original string (e.g. source) as well. How do I manage this? – user1019710 Mar 27 '12 at 16:39
  • The CommonTokenStream contains only the tokens that the lexer found. I need the rest of the original source too. Would a solution be to go to the token's start index in the original source and replace it right there? EDIT: But I think this would be difficult, because as I move along with the replacements, the indexes in the original source will shift, since the replacements differ in length from the original tokens. – user1019710 Mar 27 '12 at 18:41
  • @user1019710, well, usually the lexer creates tokens for *all* the chars from the input. How is that *not* so in your case? Are you using `filter=true` in your lexer grammar? I think it's time you edited your original question and provided more information about your problem/question. – Bart Kiers Mar 27 '12 at 18:54
  • @user1019710, no, you added a combined grammar (lexer and parser), not just a lexer grammar. How you're trying to do it won't work, you have to account for the entire input. I added a quick demo of how to do that. – Bart Kiers Mar 27 '12 at 21:10
  • Thank you very much. I will try this code tonight, but I think this is exactly what I was looking for. – user1019710 Mar 28 '12 at 04:19
  • I can't test this code right now, but I was thinking of a way to replace {System.out.print($t.text);} with a StringBuilder of my own so I can append the $t.text. Should I modify the parse() method in the generated SingleCommentStripParser.java file so that it takes a StringBuilder parameter and appends the $t.text to it, or is there another way? – user1019710 Mar 28 '12 at 09:46
  • @user1019710, didn't test the edited version, but I'm fairly sure it works. – Bart Kiers Mar 28 '12 at 10:05
  • Hi again. The code above works just fine, but I need to do something else right now. For a token, let's say StringLiteral, I want to process every char of the string. I tried t= ('\\' . | ~('"' | '\\' | '\r' | '\n'))* and then using t, but it didn't work... – user1019710 Apr 20 '12 at 19:59
  • @user1019710, sorry I have no idea what you're trying to do. Consider creating a new question. – Bart Kiers Apr 20 '12 at 20:04
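
On the index-shifting worry raised in the comments above: one way around it is to never modify the original string in place. Instead, collect the offsets of the spans to replace and build a new string in a single left-to-right pass, so that every offset still refers to the untouched original. Below is a minimal sketch of that idea in plain Java. `OffsetReplacer` and `Edit` are hypothetical names, not ANTLR classes, and the inclusive `stop` offset is chosen on the assumption that the offsets come from `CommonToken.getStartIndex()` and `getStopIndex()`.

```java
import java.util.List;

/** Sketch: apply offset-based replacements without invalidating later offsets. */
final class OffsetReplacer {

    /** Hypothetical holder for one edit: replace src[start..stop] (stop inclusive) with text. */
    static final class Edit {
        final int start;
        final int stop;
        final String text;

        Edit(int start, int stop, String text) {
            this.start = start;
            this.stop = stop;
            this.text = text;
        }
    }

    /**
     * Builds a new string from src. The edits must be sorted by start offset
     * and must not overlap; all offsets refer to the original src.
     */
    static String apply(String src, List<Edit> edits) {
        StringBuilder out = new StringBuilder();
        int cursor = 0; // next not-yet-copied position in the original
        for (Edit e : edits) {
            out.append(src, cursor, e.start); // untouched text before the edit
            out.append(e.text);               // the replacement itself
            cursor = e.stop + 1;              // continue right after the replaced span
        }
        out.append(src, cursor, src.length()); // tail after the last edit
        return out.toString();
    }
}
```

Because the output is assembled in a separate StringBuilder, it does not matter that a replacement is longer or shorter than the token it replaces. Applying the edits to the original from the last one to the first would be an alternative with the same effect.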