I am trying to implement a lambda calculus inside of Rascal but am having trouble getting the precedence and parsing to work the way I would like it to. Currently I have a grammar that looks something like:

keyword Keywords= "if" | "then" | "else"  | "end" | "fun";

lexical Ident = [a-zA-Z] !>> [a-zA-Z]+ !>> [a-zA-Z0-9] \ Keywords;
lexical Natural = [0-9]+  !>> [0-9];
lexical LAYOUT = [\t-\n\r\ ];
layout LAYOUTLIST = LAYOUT*  !>> [\t-\n\r\ ];

start syntax Prog = prog: Exp LAYOUTLIST;

syntax Exp =
    var: Ident
    | nat: Natural
    | bracket "(" Exp ")"
    > left app: Exp Exp
    > right func: "fun" Ident "-\>" Exp
    ;

When I parse a program of the form:

(fun x -> fun y -> x) 1 2

The resulting tree is:

prog(app(
    app(
      func(
        "x",
        func(
          "y",
          var("x")
      nat(1),
    nat(2))))))

Where really I am looking for something like this (I think):

prog(app(
    func(
      "x",
      app(
        func(
          "y",
          var("x")),
        nat(2))),
    nat(1)))

I've tried a number of variations of the precedence in the grammar, including wrapping the app rule in parentheses, but there seems to be something going on here that I don't understand. Any help would be most appreciated. Thanks.

josh
  • I think you are getting what you should get, and what you think you want isn't what you should get. The lambda takes an argument and returns a lambda, which is then applied to the second argument. In other words, the second apply works on the result of the first apply, so I would expect the tree to be apply(apply(...), 2) – rici Jan 14 '15 at 23:24

1 Answer

I've used the following grammar, which removes the extra LAYOUTLIST and the dead right; neither change should make a difference. It seems to work as you want when I use the generic implode function:

keyword Keywords= "if" | "then" | "else"  | "end" | "fun";

lexical Ident = [a-zA-Z] !>> [a-zA-Z]+ !>> [a-zA-Z0-9] \ Keywords;
lexical Natural = [0-9]+  !>> [0-9];
lexical LAYOUT = [\t-\n\r\ ];
layout LAYOUTLIST = LAYOUT*  !>> [\t-\n\r\ ];

start syntax Prog = prog: Exp;

syntax Exp =
    var: Ident
    | nat: Natural
    | bracket "(" Exp ")"
    > left app: Exp Exp
    > func: "fun" Ident "-\>" Exp
    ;

Then calling the parser and imploding to an untyped AST (I've removed the location annotations for readability):

rascal>import ParseTree;
ok
rascal>implode(#node, parse(#start[Prog], "(fun x -\> fun y -\> x) 1 2"))
node: "prog"("app"(
        "app"(
          "func"(
            "x",
            "func"(
              "y",
              "var"("x"))),
          "nat"("1")),
        "nat"("2")))

So, I am guessing you got the grammar right for the shape of tree you want. How do you go from concrete parse tree to abstract AST? Perhaps there is something funny going on there.
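
For reference, here is a minimal sketch of an abstract syntax that `implode` could target instead of `#node`. It assumes the constructor labels from the grammar above; the module name `LambdaAST` and all field names are made up for illustration, and the declarations would normally live in a module separate from the grammar so the type names do not clash with the nonterminals.

```rascal
module LambdaAST

// Hypothetical abstract syntax for the grammar above. The constructor names
// match the grammar's labels (prog, var, nat, app, func); the module name and
// the field names are invented for this example.
data Prog = prog(Exp exp);

data Exp
  = var(str name)             // Ident lexical imploded to a string
  | nat(int val)              // Natural lexical imploded to an integer
  | app(Exp fn, Exp arg)      // fn is any Exp, including another app
  | func(str param, Exp body)
  ;
```

Because `fn` is itself an `Exp`, a left-associative application of a function to two arguments nests as an `app` whose `fn` is another `app`, which is the shape shown in the output above.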

Jurgen Vinju
  • I am using the same technique shown here to move from concrete to abstract. The tree you have here seems identical to the first example (which is what I want to get away from) as opposed to the second example (which is what I would like my trees to look like). Unless I'm missing something :). For reference my concrete -> abstract transformation is this: `public Prog parse(loc l) = parse(#Prog, l);` and `public Prog load(loc l) = implode(#Prog, parse(l));` (Also the trailing LAYOUTLIST in Prog seems to be the only way I can successfully read from a file). – josh Jan 14 '15 at 19:45
  • Is there a reason you want your trees to look like that, though? If you fully parenthesized your expression, you would have something like `( ( (fun x -> fun y -> x) 1 ) 2)`, so you would want an application node where `(fun x -> fun y -> x)` is applied to `1`. This would then be a child of another application node, where `(fun x -> fun y -> x) 1` is applied to `2`. Your preferred tree would be for a program like `(fun x -> ( (fun y -> x) 2)) 1`. – Mark Hills Jan 15 '15 at 00:03
  • Thanks @MarkHills. I definitely agree. For some reason it was making more sense to have one child of every app node be the function and the other node be the argument. You are right though. Thanks! – josh Jan 15 '15 at 12:41
  • I must be mismatching brackets, but the second example ends with three brackets and the first with six.. :-) – Jurgen Vinju Jan 18 '15 at 14:51