
I am trying to write a parser for a simple language that recognizes integer and float expressions using ocamlyacc. However, I want to introduce the possibility of having variables, so I defined the token VAR in my lexer.mll file, which allows it to be any alphanumeric string starting with a capital letter.

 expr:
 | INT                      { $1 }
 | VAR                      { (* some action *) }
 | expr PLUS expr           { $1 + $3 }
 | expr MINUS expr          { $1 - $3 }

 /* similar rules follow below for real expressions */

I have a similar definition for real numbers. However, when I run ocamlyacc on this file, I get 2 reduce/reduce conflicts: if I just enter a random string (identified as the token VAR), the parser cannot know whether it is a real or an integer variable, because the token VAR appears in the rules for both int and real expressions in my grammar.

Var + 12  /*means that Var has to be an integer variable*/
Var  /*Is a valid expression according to my grammar but can be of any type*/

How do I eliminate this reduce/reduce conflict without losing the generality of variable declarations, while maintaining the 2 data types available to me?

Vivek Pradhan
  • You cannot keep track of type information in a context-free grammar. You must do it at runtime. – n. m. could be an AI Feb 23 '13 at 16:01
  • Well, if you only have two types, then it's kinda possible, but you have to replicate your entire grammar to have int-expr and real-expr etc; also, your tokenizer must return either int-var or real-var by looking the symbol up in the symbol table. – n. m. could be an AI Feb 23 '13 at 16:06
  • Thanks @n.m. for your reply. I have already written a similar rule for reals. But the problem is that I have bound any alphanumeric regular expression to **VAR** in my lexer, and I use this token in both real_expr and int_expr. How do I define the tokens int_var and real_var? Are you suggesting that I need to store the variables in some kind of data structure? – Vivek Pradhan Feb 23 '13 at 16:17
  • You define INT_VAR and REAL_VAR just like VAR. The lexer must do a bit more than just a regular expression match, it has to look up the type (execute a lexer action). Yes you need to store variables somewhere, how else would you find their values? – n. m. could be an AI Feb 23 '13 at 16:27
  • How can we look up the type of a variable during lexing? I mean, both int and real vars can be any alphanumeric string, and I call that a VAR. It would help if you could give some code that does what you are suggesting. – Vivek Pradhan Feb 23 '13 at 16:44
  • How do *you* know which type it is? – n. m. could be an AI Feb 23 '13 at 16:45
  • I don't. That's precisely why I asked this question. I get a reduce/reduce conflict because when the lexer identifies a VAR in a stream of characters, it does not know whether it is an int_expr or a real_expr. Only with a lookahead of one token, i.e. a + or some similar operator that works on integers only, can the parser say for certain that the variable is an integer. I just want to get rid of the reduce/reduce conflict. – Vivek Pradhan Feb 23 '13 at 16:49
  • OK so you deduce the type of the variable from the expression. There are no declarations. This means you cannot do this in the grammar. Now to get rid of the conflict, you need to do the opposite, unify different types of expressions in one production. – n. m. could be an AI Feb 23 '13 at 17:12
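
The unification suggested in the last comment could look roughly like the sketch below. It assumes a single `expr` nonterminal (with INT, a real-literal token, VAR, PLUS and MINUS alternatives) whose semantic actions build one syntax tree type, and it moves the int/real distinction out of the grammar into a check performed after parsing. All names here (`value`, `expr`, `env`, `arith`, `eval`, `Type_error`) are hypothetical and not taken from the question's code, and the rule that both operands must have the same type is an assumption about the intended semantics.

```ocaml
(* Runtime values carry their own type tag, so the grammar no longer
   needs separate int and real expression rules. *)
type value =
  | Int of int
  | Real of float

(* One tree type for the single, unified expr nonterminal. *)
type expr =
  | IntLit of int
  | RealLit of float
  | Var of string
  | Add of expr * expr
  | Sub of expr * expr

exception Type_error of string

(* Hypothetical symbol table: variable name -> current value. *)
let env : (string, value) Hashtbl.t = Hashtbl.create 16

(* Apply the int or the real operator depending on the operand tags.
   Assumption: mixing int and real operands is an error. *)
let arith int_op real_op a b =
  match a, b with
  | Int x, Int y -> Int (int_op x y)
  | Real x, Real y -> Real (real_op x y)
  | _ -> raise (Type_error "operands must have the same type")

(* Evaluate a tree; a variable's type is whatever the table holds. *)
let rec eval = function
  | IntLit n -> Int n
  | RealLit r -> Real r
  | Var name ->
    (match Hashtbl.find_opt env name with
     | Some v -> v
     | None -> raise (Type_error ("unbound variable " ^ name)))
  | Add (a, b) -> arith ( + ) ( +. ) (eval a) (eval b)
  | Sub (a, b) -> arith ( - ) ( -. ) (eval a) (eval b)
```

With this shape, VAR appears in exactly one production, so the parser never has to choose between two reductions for it. An input such as `Var + 12` would be parsed to `Add (Var "Var", IntLit 12)`, and whether `Var` is an integer is decided only when `eval` looks it up.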

0 Answers