
Hello, I am trying to implement a translator. Since it is becoming more and more complicated, I will try to explain better what I'd like to implement.

I need to specify a new Java-like language. This language must implement all the structures of a Java method: variable declarations, expressions, conditional expressions, parenthesized expressions and so on. The language will work with vectors, constants and booleans. It has various functions: log, avg, sqrt as well as sum, diff, shift and so on. This language must be translated into PL/SQL and other languages, so a method defined in it will become a stored procedure, a C++ function or whatever. I also need to consider the math constraints, such as operator precedence (+, -, *, /, <<, >> and so on).

I already got this hint: Decompose expression in base operation: ANTLR + StringTemplate

I need to know the best solution for achieving my task. I suppose I have to use all your solutions in a pipelined fashion, but I don't want to find the solution by trial and error.

I tried different (separate) solutions, but putting them all together is hard for me.

My latest problem is to distinguish an expression between a vector and a constant from an expression between two vectors. In PL/SQL I have different functions for handling these situations: e.g. an expression vector1+5 (or 5+vector1) must be translated to PKG_FUN.constant_sum(cursor1, 5), whereas vector1+vector2 must be translated to PKG_FUN.vector_sum(vector1, vector2). Moreover, I can have functions or expressions that produce a vector and others that produce a constant, and this must be considered when analyzing an expression (e.g. vector a = vector1 +((5+var2)*ln(vector2)*2)^2).

An example of this language can be:

DEFINE my_new_method(date date_from, date date_to, long variable1, long variable2){
   vector result;
   vector out1;
   vector out2; 
   int max = -5+(4);    

   out1 = GET(date_from, date_to, variable1, 20);
   out2 = GET(date_from, date_to, variable2);

   if(avg(out1) > max)
   {
       result = sqrt(ln(out2) + max)*4; 
   }else
   {
       result = out1 + ln(out1) + avg(out2);
   }

   for(int i=0; i<result.length; i++)
   {
       int num = result.get(i);
       result.set(num*5, i);
   }

   return result;

}

I should translate it into PL/SQL, C, C++ or other languages.

Any help would be appreciated.

Community
  • This is too big a question for a forum such as this. Please read the [FAQ](http://stackoverflow.com/faq) about what kind of questions you [can](http://stackoverflow.com/faq#questions) and [shouldn't](http://stackoverflow.com/faq#dontask) ask. – Some programmer dude Jan 08 '13 at 08:15
  • Hello, I just need to know how to extend the answer at http://stackoverflow.com/questions/13817046/decompose-expression-in-base-operation-antlr-stringtemplate/14198140#comment19695920_14198140 to handle vector-constant and vector-vector expressions so that they are translated differently. Moreover, I'd like to know whether ANTLR+StringTemplate is the best way to achieve what I am trying to do. – Lorenzo Camerini Jan 08 '13 at 08:23

1 Answer


What you need is "type inference". For every expression, you need to know the types of its operands and the type of the result of each operator.

You get this in a few steps:

1) by building a symbol table that records the type of each declared entity in your variable scopes

2) by walking each expression, computing the types of the leaf nodes: in your language at least, all constant values are scalars, and any identifier has a type you can look up in the symbol table. For most languages, the type of an operator result can be computed from the language rules for the operator, given its operand types. (Some languages require the types to be computed by constraint propagation.) Having computed all of these types, you need to associate each tree node with its type (or at least be able to compute the type for a node on demand).
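As a rough illustration of these two steps, here is a minimal Python sketch. The node classes, the type names, and the rule that avg yields a scalar while ln and sqrt keep their argument's type are all assumptions made for the example; they are not something ANTLR provides, and your language rules may differ.

```python
from dataclasses import dataclass

VECTOR, SCALAR = "vector", "scalar"  # hypothetical type names

@dataclass
class Num:
    value: float

@dataclass
class Var:
    name: str

@dataclass
class Call:
    func: str
    arg: object

@dataclass
class BinOp:
    op: str
    left: object
    right: object

# Assumed rules: avg reduces to a scalar; None means "same type as the argument".
FUNC_RESULT = {"avg": SCALAR, "ln": None, "sqrt": None}

def infer(node, symbols):
    """Return the type of an expression tree, given a symbol table (name -> type)."""
    if isinstance(node, Num):
        return SCALAR                      # all constants are scalars
    if isinstance(node, Var):
        if node.name not in symbols:
            raise NameError(f"undeclared identifier: {node.name}")
        return symbols[node.name]          # symbol table lookup
    if isinstance(node, Call):
        arg_type = infer(node.arg, symbols)
        return FUNC_RESULT[node.func] or arg_type
    if isinstance(node, BinOp):
        types = (infer(node.left, symbols), infer(node.right, symbols))
        # Assumed rule: any vector operand makes the result a vector.
        return VECTOR if VECTOR in types else SCALAR
    raise TypeError(f"unknown node: {node!r}")

# Step 1: the symbol table, filled in from the declarations in the method body.
symbols = {"out1": VECTOR, "out2": VECTOR, "max": SCALAR}

# Step 2: walk the tree for  out1 + ln(out1) + avg(out2)
expr = BinOp("+",
             BinOp("+", Var("out1"), Call("ln", Var("out1"))),
             Call("avg", Var("out2")))
print(infer(expr, symbols))
print(infer(Call("avg", Var("out1")), symbols))
```

A real implementation would also handle nested scopes and would cache the computed type on each tree node rather than recomputing it on every query.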

With this computed type information, you can differentiate between different operators (e.g., + on two vectors, + with a vector first operand and a scalar second, etc.) and so choose which target language construct to generate.
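For the + case from the question, the choice could look roughly like the following Python sketch. The tuple encoding of the tree and the function name emit are made up for the example; PKG_FUN.constant_sum and PKG_FUN.vector_sum are the routines named in the question, and passing the vector as their first argument is an assumption.

```python
VECTOR, SCALAR = "vector", "scalar"  # hypothetical type names

def emit(node, symbols):
    """Translate a tuple-encoded expression; return (target_code, type)."""
    kind = node[0]
    if kind == "num":
        return str(node[1]), SCALAR
    if kind == "var":
        return node[1], symbols[node[1]]
    if kind == "+":
        lcode, ltype = emit(node[1], symbols)
        rcode, rtype = emit(node[2], symbols)
        if ltype == VECTOR and rtype == VECTOR:
            return f"PKG_FUN.vector_sum({lcode}, {rcode})", VECTOR
        if ltype == VECTOR:
            return f"PKG_FUN.constant_sum({lcode}, {rcode})", VECTOR
        if rtype == VECTOR:
            # + is commutative, so 5 + vector1 can reuse the same routine
            # with the operands swapped.
            return f"PKG_FUN.constant_sum({rcode}, {lcode})", VECTOR
        return f"({lcode} + {rcode})", SCALAR
    raise ValueError(f"unsupported node kind: {kind}")

symbols = {"vector1": VECTOR, "vector2": VECTOR, "var2": SCALAR}
print(emit(("+", ("num", 5), ("var", "vector1")), symbols)[0])
print(emit(("+", ("var", "vector1"), ("var", "vector2")), symbols)[0])
print(emit(("+", ("var", "vector1"), ("+", ("num", 5), ("var", "var2"))), symbols)[0])
```

Non-commutative operators such as - or / cannot simply swap their operands, so they would need a separate routine (or an extra flag) for the scalar-first case.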

ANTLR doesn't offer you any support in building and managing symbol tables, or in computing the type information, other than offering you a tree. Once you have the tree and all the type information, you can choose which string template to use to generate code, giving you an on-the-fly style translator. So doing this is just a lot of sweat. (Doing an on-the-fly translator has a downside: you had better generate the exact code you want in that place, because you have no chance to optimize the generated result, and that likely means huge case analyses of the tree to choose what to generate.)

Our DMS Software Reengineering Toolkit does offer you such additional support for constructing symbol tables, and for computing inferences over trees with its attribute grammar evaluators, along with additional means to write explicit transformations, easily made conditional on such type lookups. The transformations map from a tree in the source language to a tree in the target language. You can then do "simpler" translations to the target language, and apply optimizations in the target language using additional explicit transforms. This can greatly simplify the translation process.
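To make the tree-to-tree idea concrete independently of any tool, here is a generic Python sketch. It is not DMS's API or notation; the tuple shapes and the names fold and render are invented for the example. The translator first produces a target-language tree, a separate pass simplifies it (here, folding scalar arithmetic on literals), and only then is text produced.

```python
# Target-language tree nodes as plain tuples (hypothetical shapes):
#   ("lit", 5)                        numeric literal
#   ("ref", "vector1")                variable reference
#   ("call", "PKG_FUN.f", [args])     function call
#   ("bin", "+", left, right)         scalar arithmetic

def fold(node):
    """Optimization pass: fold + and * on two literals, bottom-up."""
    kind = node[0]
    if kind == "call":
        return ("call", node[1], [fold(arg) for arg in node[2]])
    if kind == "bin":
        op, left, right = node[1], fold(node[2]), fold(node[3])
        if left[0] == "lit" and right[0] == "lit":
            if op == "+":
                return ("lit", left[1] + right[1])
            if op == "*":
                return ("lit", left[1] * right[1])
        return ("bin", op, left, right)
    return node  # literals and references are already as simple as possible

def render(node):
    """Final pass: print a target tree as text."""
    kind = node[0]
    if kind == "lit":
        return str(node[1])
    if kind == "ref":
        return node[1]
    if kind == "call":
        return f"{node[1]}({', '.join(render(arg) for arg in node[2])})"
    return f"({render(node[2])} {node[1]} {render(node[3])})"

# Target tree that a straightforward translation of  vector1 + (2 + 3)  might produce.
naive = ("call", "PKG_FUN.constant_sum",
         [("ref", "vector1"), ("bin", "+", ("lit", 2), ("lit", 3))])
print(render(naive))
print(render(fold(naive)))
```

Because the simplification runs on the target tree, the translation rules themselves can stay simple and do not need to anticipate every special case.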

But in any case, building a full translator for one language (let alone 3) is a lot of work, even for those who have the experience and background to do it. The fact that you have asked this question suggests you likely don't understand lots of issues related to analyzing and transforming code. I suggest you read a good compiler book (e.g., Aho/Sethi/Ullman "Compilers") before you proceed, or you are likely to run into other troubles like this.

Ira Baxter