0

I have this grammar section in a happy parser, given on the Happy official site, but I need some deeper explanation of the meaning of the rules in brackets. Here is the token definition

%token 
      let             { TokenLet }
      in              { TokenIn }
      int             { TokenInt $$ }
      var             { TokenVar $$ }
      '='             { TokenEq }
      '+'             { TokenPlus }
      '-'             { TokenMinus }
      '*'             { TokenTimes }
      '/'             { TokenDiv }
      '('             { TokenOB }
      ')'             { TokenCB }

and here the grammar section

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }
          | Exp1                    { Exp1 $1 }

Exp1  : Exp1 '+' Term           { Plus $1 $3 }
      | Exp1 '-' Term           { Minus $1 $3 }
      | Term                    { Term $1 }

Term  : Term '*' Factor         { Times $1 $3 }
      | Term '/' Factor         { Div $1 $3 }
      | Factor                  { Factor $1 }

Factor            
      : int                     { Int $1 }
      | var                     { Var $1 }
      | '(' Exp ')'             { Brack $2 }

What I understand is that the lexer, defined below in the file, should produce tokens only of the type definined and then build the parse tree using the grammar. But what exactley mean "{Let $2 $4 $6}"? I know that $2 refers to the second rule argument and so on but if someone can give me a "human read version" of the rules I would be really happy. Hope I've been clear.

Thanks in advance.

A Coder Gamer
  • 840
  • 1
  • 10
  • 31
badvi93d
  • 133
  • 9

2 Answers2

1

This line is one rule for creating (parsing) the production Exp:

Exp   : let var '=' Exp in Exp  { Let $2 $4 $6 }

It corresponds to the rule:

if you see "let"        ($1)
followed by a variable name  ($2)
followed by "="         ($3)
followed by an Exp      ($4)
followed by "in"        ($5)
followed by another Exp ($6)

then return the value Let $2 $4 $6. The $n parameters will be replaced with the values of each sub-production. So if this rule is matched, the Let function (which is probably some data constructor) will be called with:

  • the value of the var token as the first parameter,
  • the first Exp parsed ($4) as the second parameter
  • and the second parsed Exp ($6) as the third parameter.

I believe here the value of the var token is the variable name.

ErikR
  • 51,541
  • 9
  • 73
  • 124
1

In the %token section the left column is the token names used elsewhere in the grammar, and the right is a pattern that can be used in a case statement. Where you see $$ Happy will substitute its own variable. So if the resulting parser is expecting an Integer at some point then Happy will have a case statement with a pattern including TokenInt v1234 where the v1234 bit is a variable name created by Happy.

The "Let" is the constructor for the grammar expression being recognised. If you look a little lower in the example page you will see

data Exp  
    = Let String Exp Exp
    | Exp1 Exp1
    deriving Show

So the Let constructor takes a string and two sub-expressions (of type 'Exp'). If you look at the grammar you can see that there six elements in the let rule. The first is just the constant string "let". That is used by the generated parser to figure out that its looking at a "let" clause, but the resulting parse tree doesn't need it. So $1 doesn't appear. Instead the first argument to the Let constructor has to be the name of the variable being bound, which is the second item in the grammar rule. Hence this is $2. The other things are the two sub-expressions, which are $4 and $6 by the same logic. Both of these can be arbitrarily complex expressions: Happy figures out where they begin and end and parses them by following the other rules for what constitutes expressions.

Paul Johnson
  • 17,438
  • 3
  • 42
  • 59
  • Hi thanks all for answer. For example , in the action "Plus $1 $3" wasn't enough to write $$ = $1 + $3? – badvi93d May 24 '16 at 09:30
  • The "Plus" is the constructor you are using in Haskell, while the "+" is something you find in the syntax of the parsed document. Your "$$ = $1 + $3" has nothing to link the "+" in the document with the "Plus" in the parse tree. – Paul Johnson May 24 '16 at 16:16