I'm writing a Java-to-Python translator using flex and bison. The bison rules cover a restricted subset of the Java grammar. Besides the grammar rules, I also build an Abstract Syntax Tree (AST) as an intermediate representation; the AST nodes are created in the semantic actions of the bison rules.
My problem concerns how lists of elements (the left-recursive rules) are handled in the bison rules.
When I give the translator the following text file, parsing completes without syntax errors. However, when I traverse the AST in pre-order for testing, the traversal seems to stop at the first child node of each list and never visits the remaining list elements.
TEXT FILE IN INPUT:
import java.util.*;
class table {
    int a;
    int c;
}
class ball {
    int a;
}
These are the bison grammar rules involved:
Program
    : ImportStatement ClassDeclarations { set_parse_tree($$ = program_new($1,$2,2));}
    ;
ImportStatement
    : IMPORT LIBRARY SEMICOLON {$$ = import_new($2,1); printf("Import type: %d \n", $$->type);}
    | %empty {$$ = import_new(NULL,0); }
    ;
ClassDeclarations
    : ClassDeclaration { $$ = list_new(CLASS_DECLARATIONS,$1,NULL,2); }
    | ClassDeclarations ClassDeclaration { list_append( $$ = $1, list_new(CLASS_DECLARATIONS,$2,NULL,2)); }
    ;
ClassDeclaration
    : CLASS NameID LBRACE FieldDeclarations RBRACE { $$ = classDec_new($2,$4,2); }
    | PUBLIC CLASS NameID LBRACE FieldDeclarations RBRACE { $$ = classDec_new($3,$5,2);}
    ;
FieldDeclarations
    : FieldDeclaration {$$ = list_new(FIELD_DECLARATIONS,$1,NULL,2); }
    | FieldDeclarations FieldDeclaration { list_append( $$ = $1, list_new(FIELD_DECLARATIONS,$2,NULL,2)); }
    ;
FieldDeclaration
    : VariableFieldDeclaration {$$ = fieldDec_new($1,NULL,NULL,3);}
    | PUBLIC VariableFieldDeclaration {$$ = fieldDec_new($2,NULL,NULL,3);}
    | MethodFieldDeclaration {$$ = fieldDec_new(NULL,$1,NULL,3);}
    | ConstructorDeclaration {$$ = fieldDec_new(NULL,NULL,$1,3);}
    ;
VariableFieldDeclaration
    : Type VariableDeclarations SEMICOLON {$$ = variableFieldDec_new($1,$2,2);}
    ;
VariableDeclarations
    : VariableDeclaration {$$ = list_new(VARIABLE_DECLARATIONS,$1,NULL,2); }
    | VariableDeclarations COMMA VariableDeclaration { list_append( $$ = $1, list_new(VARIABLE_DECLARATIONS,$3,NULL,2)); }
    ;
VariableDeclaration
    : NameID {$$ = varDec_new($1,NULL,NULL,NULL,NULL,5);}
    | NameID ASSIGNOP ExpressionStatement {$$ = varDec_new($1,$3,NULL,NULL,NULL,5);}
    | NameID LSBRACKET RSBRACKET {$$ = varDec_new($1,NULL,NULL,NULL,NULL,5); }
    | LSBRACKET RSBRACKET NameID {$$ = varDec_new($3,NULL,NULL,NULL,NULL,5); }
    | NameID LSBRACKET RSBRACKET ASSIGNOP NEW Type LSBRACKET Dimension RSBRACKET {$$ = varDec_new($1,NULL,$6,$8,NULL,5); }
    | LSBRACKET RSBRACKET NameID ASSIGNOP NEW Type LSBRACKET Dimension RSBRACKET {$$ = varDec_new($3,NULL,$6,$8,NULL,5); }
    | NameID LSBRACKET RSBRACKET ASSIGNOP LBRACE VariableInitializers RBRACE {$$ = varDec_new($1,NULL,NULL,NULL,$6,5); }
    | LSBRACKET RSBRACKET NameID ASSIGNOP LBRACE VariableInitializers RBRACE {$$ = varDec_new($3,NULL,NULL,NULL,$6,5); }
    | NameID LSBRACKET RSBRACKET ASSIGNOP LBRACE RBRACE {$$ = varDec_new($1,NULL,NULL,NULL,NULL,5); }
    | LSBRACKET RSBRACKET NameID ASSIGNOP LBRACE RBRACE {$$ = varDec_new($3,NULL,NULL,NULL,NULL,5); }
    ;
Type
    : INT {$$ = typeId_new($1,1);}
    | CHAR {$$ = typeId_new($1,1);}
    | FLOAT {$$ = typeId_new($1,1);}
    | DOUBLE {$$ = typeId_new($1,1);}
    ;
NameID
    : ID {$$ = nameID_new($1, 1);}
    ;
The general structure of an AST node contains:
- the type of the node,
- a union holding the specific structure for each possible kind of node,
- an integer (numLeaf) giving the maximum number of children the node can have (it is passed from the bison semantic actions as the last argument of the constructor functions),
- an array of pointers (LeafVet) with numLeaf entries, where each entry points to a child, or is NULL if that child is absent.
The last two fields drive the tree traversal: for each node I loop over its LeafVet array to reach its children.
I think the problem lies mainly in the list structures (ClassDeclarations, FieldDeclarations, VariableDeclarations, ...). Each list node has the following structure, which is one member of the union of possible node structures.
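To make the description above concrete, here is a minimal sketch of the node layout. It is not my exact definition: the fixed-size array, the MAX_LEAF constant, the anonymous union with only two members, and the helpers node_new and count_children are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_LEAF 5  /* assumption: the largest numLeaf passed from the grammar actions */

typedef struct ast_node ast_node;

struct ast_node {
    int type;                      /* kind of node (AST_LIST, AST_NAMEID, ...) */
    union {                        /* one member per node kind; only two are shown */
        struct { int type; ast_node *head; ast_node *tail; } list;
        struct { char *name; } nameID;
    };
    int numLeaf;                   /* maximum number of children for this node */
    ast_node *LeafVet[MAX_LEAF];   /* child pointers, NULL where a child is absent */
};

/* Hypothetical helper: allocate a node with every LeafVet slot set to NULL. */
static ast_node *node_new(int type, int numLeaf)
{
    assert(numLeaf >= 0 && numLeaf <= MAX_LEAF);
    ast_node *node = malloc(sizeof *node);
    if (node != NULL) {
        node->type = type;
        node->numLeaf = numLeaf;
        for (int i = 0; i < MAX_LEAF; i++) {
            node->LeafVet[i] = NULL;
        }
    }
    return node;
}

/* Hypothetical helper: number of non-NULL children among the first numLeaf slots. */
static int count_children(const ast_node *node)
{
    int n = 0;
    for (int i = 0; i < node->numLeaf; i++) {
        if (node->LeafVet[i] != NULL) {
            n++;
        }
    }
    return n;
}
```

In this sketch the slots are cleared explicitly, because memory returned by malloc is not initialized and an unset slot would otherwise hold an indeterminate pointer.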
STRUCT LIST:
struct {
    int type;
    struct ast_node *head; //pointer to the head of the list
    struct ast_node *tail; //pointer to the tail of the list
} list;
The functions that refer to the creation of list nodes are the following:
static ast_node *newast(int type)
{
    ast_node *node = malloc(sizeof(ast_node));
    node->type = type;
    return node;
}

ast_list *list_new(int type, ast_node *head, ast_list *tail, int numLeaf)
{
    ast_list *l = newast(AST_LIST); //allocates memory for the AST_LIST type node
    l->list.type = type;
    l->list.head = head;
    l->list.tail = tail;
    l->numLeaf = numLeaf;
    l->LeafVet[0] = head;
    l->LeafVet[1] = tail;
    return l;
}

void list_append(ast_list *first, ast_list *second)
{
    while (first && first->list.tail)
    {
        first = first->list.tail;
    }
    if (first)
    {
        first->list.tail = second;
    }
    first->numLeaf = 2;
}
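To isolate the list handling from the parser, a reduced sketch like the following could be used. It copies list_new and list_append from above onto a simplified node. The struct layout, the MAX_LEAF constant, the placeholder values of AST_LIST and FIELD_DECLARATIONS, and the tail_in_sync check are assumptions for illustration, not my real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_LEAF 5            /* assumption: size of the child array */
#define AST_LIST 0            /* placeholder value */
#define FIELD_DECLARATIONS 1  /* placeholder value */

typedef struct ast_node ast_node;
typedef ast_node ast_list;

struct ast_node {             /* simplified: the union is reduced to the list member */
    int type;
    struct { int type; ast_node *head; ast_node *tail; } list;
    int numLeaf;
    ast_node *LeafVet[MAX_LEAF];
};

/* Same assignments as list_new above, on the simplified node. */
ast_list *list_new(int type, ast_node *head, ast_list *tail, int numLeaf)
{
    ast_list *l = malloc(sizeof *l);
    assert(l != NULL);
    l->type = AST_LIST;
    l->list.type = type;
    l->list.head = head;
    l->list.tail = tail;
    l->numLeaf = numLeaf;
    l->LeafVet[0] = head;
    l->LeafVet[1] = tail;
    return l;
}

/* Copied from list_append above; the grammar actions always pass a non-NULL first. */
void list_append(ast_list *first, ast_list *second)
{
    while (first && first->list.tail)
    {
        first = first->list.tail;
    }
    if (first)
    {
        first->list.tail = second;
    }
    first->numLeaf = 2;
}

/* Hypothetical check: does the child array agree with the list link? */
int tail_in_sync(const ast_list *l)
{
    return l->LeafVet[1] == l->list.tail;
}
```

With these copies, after list_append(a, b) on two freshly created one-element lists, a->list.tail points to b while a->LeafVet[1] is still NULL, so tail_in_sync(a) returns 0. Since the traversal only follows LeafVet, this may be related to what I am seeing.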
I think the error could be in the list_append function: when I traverse the tree in pre-order, the traversal enters the first child of each list but does not continue to the remaining ones. Specifically, for the input file above, it stops after reaching the NameID node of the first VariableDeclaration (the first variable of the first class) without reporting any error. It should then visit the second element of FieldDeclarations, since there is a second variable declaration (VariableFieldDeclaration). However, when I print the number of non-NULL children of each list, I always get 1, so it seems that appending to the lists does not work properly.
The error could also be in the traversal function, shown below:
void print_ast(ast_node *node) //ast preorder
{
    int leaf;
    leaf = node->numLeaf;
    printf("Num leaf: %d \n",leaf);
    switch(node->type)
    {
        case AST_LIST:
            break;
        case AST_PROGRAM:
            break;
        case AST_IMPORT:
            printf("Import: %s \n", node->import.namelib);
            break;
        case AST_CLASSDEC:
            printf("name class: %s\n", node->classDec.nameClass->nameID.name);
            break;
        case AST_TYPEID:
            break;
        case AST_VARFIELDDEC:
            break;
        case AST_VARDEC:
            break;
        case AST_FIELDDEC:
            break;
        case AST_NAMEID:
            printf("The value of the variable is: %s \n", node->nameID.name);
            break;
        default:
            printf("Error in node selection!\n");
            exit(1);
    }
    for (int i=0; i<leaf; i++)
    {
        if(node->LeafVet[i] == NULL ){
            continue;
        } else{
            printf("%d \n", node->LeafVet[i]->type);
            print_ast(node->LeafVet[i]);
        }
    }
}
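To check the traversal separately from the printing, the same loop can be reduced to a counter. The sketch below is a diagnostic only: the stripped-down node (just numLeaf and LeafVet), the MAX_LEAF constant, and the function count_reachable are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEAF 5  /* assumption: size of the child array */

typedef struct ast_node ast_node;

struct ast_node {   /* stripped down to the two fields the traversal uses */
    int numLeaf;
    ast_node *LeafVet[MAX_LEAF];
};

/* Hypothetical helper: counts the nodes visited by the same pre-order loop
 * as print_ast, i.e. the node itself plus everything reachable via LeafVet. */
int count_reachable(const ast_node *node)
{
    assert(node != NULL);
    int total = 1;
    for (int i = 0; i < node->numLeaf; i++) {
        if (node->LeafVet[i] != NULL) {
            total += count_reachable(node->LeafVet[i]);
        }
    }
    return total;
}
```

If the count for the root is lower than the number of nodes the grammar actions created, some LeafVet slot is not pointing at a node that exists in the tree.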
I hope you can help me, thanks a lot.