Manipulate AST to change input text

Question

I have a fairly general question.

I'm currently working on a project using ANTLR v3.4 - here's our idea:

Parse some input and store important parts of it during the process in a Map.
Display the parsed input in a form (please see this Example Picture). The form fields are named by the keys and filled with the values of the Map, while the text input field contains the whole input.

So, if the user enters/changes the text at the bottom, the content of the form fields will also change because of the code actions in the parser grammar that build the Map (which is used as input to fill the form fields).

But we also want the other direction: If the user changes a value in one of the form fields, those changes should be reflected in the text at the bottom.

I thought of using AST creation and then manipulation to accomplish this task. I've already added tree rewrite rules to my grammar. But I'm not quite sure how to make the connection between the Map, the AST and how to create output.

Am I even on the right track? Any other ideas?

EDIT

So I followed your suggestion and put the leaf nodes directly into a data structure that can be changed, and everything is fine.

But now I have 2 other problems:

How do I add tokens? There are optional parts in the input and therefore some form fields may be empty. If I change such a form field to a given value, I'd have to create new nodes in the tree. How would I do this? I'd have to determine where to put the new node...
How do I delete tokens? So, if I change the value of some form field to 'empty', the associated node has to be deleted, but this may affect surrounding nodes (like, in a list you have commas after each token). I really have no idea how to do this.

Here's my current grammar:

parser grammar TestASTParser;

options {
  language = Java;
  tokenVocab = TestASTLexer;
  output = AST;
}

...

entry
:
lemma=phrase 

(lgrammatt=grammatts)? (lsemantic=semantics)? (lsubsemantic=subsemantics)?

SEP 

translat=phrase  

(tgrammatt=grammatts )? (tsemantic=semantics)? (tsubsemantic=subsemantics)?

EOF

  -> ^(ENTRY
       ^(LEMMA
          ^(LEMMA_TEXT $lemma) ^(LEMMA_GRAMM $lgrammatt)?
          ^(LEMMA_SEM $lsemantic)? ^(LEMMA_SUBSEM $lsubsemantic)?)
       ^(SEPARATOR SEP)
       ^(TRANSLATION
          ^(TRANSLAT_TEXT $translat) ^(TRANSLAT_GRAMM $tgrammatt)?
          ^(TRANSLAT_SEM $tsemantic)? ^(TRANSLAT_SUBSEM $tsubsemantic)?)
     )

;

phrase
:
    t=((options{ greedy = false; }:.)+)

    -> PHRASE[$phrase.text]
;

grammatts
:
  OPEN_BRACKET grammlist CLOSE_BRACKET
;

semantics:
  LEFT_CURLY phrase RIGHT_CURLY
;

subsemantics:
  D_LEFT_CURLY phrase D_RIGHT_CURLY
;

grammlist
:
   grammatt (COMMA grammatt)*
;

grammatt:
    (GENUS | GRAMMATT)
;

So the resulting tree for test input

Angeber(in) [f] {Großkotz} {{xyz}} <> baterlunza(a) [m(f), refl] {boeser buba}

looks like this (yellow nodes will always be there, everything else is optional). So, if I change "refl" to "" in the form field, the node "refl" should disappear, but also the "," before it. If I delete "f" below LEMMA_GRAMM, the whole subtree has to disappear (because the list cannot be empty). Or, if I wanna add subsemantics to TRANSLATION, I would have to create the corresponding node TRANSLAT_SUBSEM plus child nodes for the curly braces {{ and }}.

I really have no idea where to start. Is my tree structure even good for this? Do I need my own implementation of BaseTree? Or is it just plain Java?
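
One way to picture the add/delete cases described above is to keep only the meaningful values in the model and let the code that writes the text emit brackets, braces and commas. The following is a minimal plain-Java sketch of that idea, not ANTLR API: the `Side` and `EntryText` classes are hypothetical, the `<>` separator and the `[..]`, `{..}`, `{{..}}` delimiters are taken from the test input, and the exact whitespace is an assumption (the original spacing of the input is not preserved).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of one side of an entry (lemma or translation).
final class Side {
    String text;                                        // phrase, always present
    final List<String> grammatts = new ArrayList<>();   // optional, written as [a, b]
    String semantics = "";                              // optional, written as {..}
    String subsemantics = "";                           // optional, written as {{..}}

    Side(String text) { this.text = text; }

    // Delimiters and commas are produced here, so they never need to be
    // added to or removed from the model when a field changes.
    String render() {
        StringBuilder sb = new StringBuilder(text);
        if (!grammatts.isEmpty()) {
            sb.append(" [").append(String.join(", ", grammatts)).append(']');
        }
        if (!semantics.isEmpty()) {
            sb.append(" {").append(semantics).append('}');
        }
        if (!subsemantics.isEmpty()) {
            sb.append(" {{").append(subsemantics).append("}}");
        }
        return sb.toString();
    }
}

final class EntryText {
    // Assumes SEP is the literal "<>" seen in the test input.
    static String render(Side lemma, Side translation) {
        return lemma.render() + " <> " + translation.render();
    }
}

final class EntryTextDemo {
    public static void main(String[] args) {
        Side lemma = new Side("Angeber(in)");
        lemma.grammatts.add("f");
        Side translation = new Side("baterlunza(a)");
        translation.grammatts.add("m(f)");
        translation.grammatts.add("refl");

        translation.grammatts.remove("refl");   // the comma goes away with it
        lemma.grammatts.remove("f");            // empty list: no brackets at all
        translation.subsemantics = "xyz";       // new optional part appears
        System.out.println(EntryText.render(lemma, translation));
    }
}
```

With this layout, clearing a form field means removing a value or setting it to the empty string, and filling an empty field means setting the value; where the new text goes is decided by the fixed order in `render()`. The same approach should carry over to a real AST if the punctuation tokens are left out of the tree and a tree walker writes them back.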

Did you already have a look at Eclipse JDT? They don't use EMF there. So chances are good that they might do some similar things. — SpaceTrucker, Jan 04 '13 at 13:28
How about having the Map associate the key to the corresponding AST node? You could get the initial value from the node and update it directly. — user1201210, Jan 04 '13 at 18:39
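
The suggestion in the last comment could look roughly like the sketch below. It is plain Java and only illustrates the idea: `Leaf` is a hypothetical stand-in for an AST node (with ANTLR 3 that role would presumably be played by a tree node such as `CommonTree`), and `FormModel` and the key names are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an AST leaf node that carries text.
final class Leaf {
    private String text;
    Leaf(String text) { this.text = text; }
    String getText() { return text; }
    void setText(String text) { this.text = text; }
}

final class FormModel {
    // Key = form field name, value = the node that holds that field's text.
    private final Map<String, Leaf> fields = new LinkedHashMap<>();

    void bind(String key, Leaf leaf) { fields.put(key, leaf); }

    // Initial value of a form field comes from the node; unbound fields are empty.
    String get(String key) {
        Leaf leaf = fields.get(key);
        return leaf == null ? "" : leaf.getText();
    }

    // Editing the form field writes straight through to the node.
    void set(String key, String value) {
        Leaf leaf = fields.get(key);
        if (leaf == null) {
            throw new IllegalArgumentException("unknown field: " + key);
        }
        leaf.setText(value);
    }
}

final class FormModelDemo {
    public static void main(String[] args) {
        Leaf lemma = new Leaf("Angeber(in)");
        FormModel form = new FormModel();
        form.bind("LEMMA_TEXT", lemma);
        form.set("LEMMA_TEXT", "Prahler");
        System.out.println(lemma.getText());
    }
}
```

Because the Map holds references to the nodes rather than copies of their text, there is no separate step to synchronise the form with the tree; regenerating the text only requires walking the tree again.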

score 0 · Answer 1 · answered Jan 04 '13 at 13:18

I don't know if this is directly suitable to your case. But when using Xtext you also get a corresponding EMF Model. Then your user interface is basically a form for the EMF Model in the upper part of the view and an Xtext editor for the to-text-serialised part of the EMF model in the lower part of the view. Since the upper and lower part of the view are all based on the same EMF model instance, changes in either part of the view can be reflected in any other part. Xtext will provide you with the necessary parsing and serialization of the text in the lower form part.
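
To illustrate the "one shared model, two views" idea without Xtext or EMF, here is a rough plain-Java sketch using `java.beans.PropertyChangeSupport` from the JDK. It is not the EMF notification API; `SharedEntry` and the `lemma` property are hypothetical names chosen for the example.

```java
import java.beans.PropertyChangeListener;
import java.beans.PropertyChangeSupport;

// Hypothetical shared model; both the form and the text view would hold the same instance.
final class SharedEntry {
    private final PropertyChangeSupport changes = new PropertyChangeSupport(this);
    private String lemma = "";

    void addListener(PropertyChangeListener listener) {
        changes.addPropertyChangeListener(listener);
    }

    String getLemma() { return lemma; }

    void setLemma(String value) {
        String old = lemma;
        lemma = value;
        // No event is fired when old and new value are equal,
        // which keeps two views from notifying each other endlessly.
        changes.firePropertyChange("lemma", old, value);
    }
}

final class SharedEntryDemo {
    public static void main(String[] args) {
        SharedEntry entry = new SharedEntry();
        // Each view registers a listener and refreshes itself on change.
        entry.addListener(evt -> System.out.println("form field  <- " + evt.getNewValue()));
        entry.addListener(evt -> System.out.println("text editor <- " + evt.getNewValue()));
        entry.setLemma("Angeber(in)");
    }
}
```

Whichever view triggers the change calls the setter, and every registered view is notified, so neither view needs to know about the other.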

Since you are using ANTLR, you could migrate your ANTLR grammar to Xtext. This can't be done automatically, but it shouldn't be too difficult a task.

Well, thanks for your answer and the suggestion. Unfortunately, in this case it is of no help because we can't use Xtext in this project (for reasons I'm not quite able to explain). We even shifted from Xtext to ANTLR actually. ;) But thanks nevertheless. — codebat, Jan 04 '13 at 13:24
