4

I have this language AST

data ExprF r = Const Int
              | Var   String
              | Lambda String r
              | EList [r]
              | Apply r r
 deriving ( Show, Eq, Ord, Functor, Foldable )

And I want to convert it to string

toString = cata $ \case
  Const x -> show x
  Var x -> x
  EList x -> unwords x
  Lambda x y -> unwords [x, "=>", y]
  Apply x y -> unwords [x, "(", y, ")"]

But when lambda is used in Apply I need the parentheses

(x => x)(1)

but I cannot match inner structure with cata

toString :: Fix ExprF -> String
toString = cata $ \case
  Const x -> show x
  Var x -> x
  Lambda x y -> unwords [x, "=>", y]
  Apply (Lambda{}) y -> unwords ["(", x, ")", "(", y, ")"]
  Apply x y -> unwords [x, "(", y, ")"]

Is there any better solution than para?

toString2 :: Fix ExprF -> String
toString2 = para $ \case
  Const x -> show x
  Var x -> x
  Lambda x (_,y) -> unwords [x, "=>", y]
  EList x -> unwords (snd <$> x)
  Apply ((Fix Lambda {}),x) (_,y) -> unwords ["(", x, ")", "(", y, ")"]
  Apply (_,x) (_,y) -> unwords [x, "(", y, ")"]

It looks uglier. Even it is needed only in one place I need to remove fst tuple parameters everywhere and I guess it will be slower.

Robin Green
  • 32,079
  • 16
  • 104
  • 187
ais
  • 2,514
  • 2
  • 17
  • 24
  • 1
    In the general case, you could take a precedence argument as in `showsPrec`. Para does not look too bad to me, though. – chi Aug 23 '16 at 16:28
  • Could you elaborate? I don't see where I can put this argument. In this case yes `para` is ok, but when ast is much bigger it's just too much noise – ais Aug 23 '16 at 16:44
  • @ais I don't have time to write a full answer right now, but you can pass a `Bool` down to the recursive call by folding to a function `Bool -> a`. [Kiselyov's tagless final notes](http://okmij.org/ftp/tagless-final/course/lecture.pdf) touch on it. – Benjamin Hodgson Aug 23 '16 at 17:35
  • 1
    You could build a function using cata: something like `cata $ \case Var x -> const x ; Apply x y -> \_ -> unwords [x 5, y 5] ; Lambda x y -> (\p -> if p==5 then addParen (x 0) (y 0) else noParen (x 0) (y 0) ; ...` Adjust precedence numbers as needed, and pass an initial 0 at the top level so that the result is a string. – chi Aug 23 '16 at 17:35
  • 2
    @ais You may like the discussion in [this question on pretty-printing boolean formulae](http://stackoverflow.com/q/20406722/791604) for more on the `showsPrec` approach. – Daniel Wagner Aug 23 '16 at 18:04
  • @chi interesting. But why `Int`, Shouldn't `Bool` be enough? – ais Aug 23 '16 at 18:07
  • One option is to wrap a newtype around `Fix ExprF`, then use pattern synonyms to construct/deconstruct it. The pattern synonyms will make it less awkward, and the newtype will make the types look less scary. – dfeuer Aug 23 '16 at 19:12
  • 1
    @ais Bool is fine if you only have two levels of precedence. In the general case, you need more. The link by Daniel Wagner above is such an example. – chi Aug 23 '16 at 20:09

3 Answers3

8

As @chi, @DanielWagner and I pointed out in the comments, the way to do this sort of pretty-printing-with-parenthesisation in a structurally recursive manner is "the showsPrec approach".

The big idea is not to fold up the syntax tree into a String, but into a function Bool -> String. This gives us a degree of context-sensitivity in the fold: we'll use that extra Bool parameter to keep track of whether we're currently in the context of the left-hand side of an application.

parens x = "(" ++ x ++ ")"

ppAlg :: ExprF (Bool -> String) -> (Bool -> String)
ppAlg (Const x) isBeingApplied = show x
ppAlg (Var x) isBeingApplied = x
ppAlg (Lambda name body) isBeingApplied = p ("\\" ++ name ++ " -> " ++ body False)
    where p = if isBeingApplied then parens else id
ppAlg (EList es) isBeingApplied = unwords (sequenceA es False)
ppAlg (Apply fun arg) isBeingApplied = fun True ++ " " ++ arg False

We pass values of isBeingApplied down the recursive calls depending on where we are in the syntax tree right now. Note that the only place we're passing down True is as an argument to fun in the body of the Apply case. Then, in the Lambda case, we inspect that argument. If the current term is the left-hand part of an application we parenthesise the lambda; if not we don't.

At the top level, having folded up the whole tree into a function Bool -> String, we pass it an argument of False - we're not currently in the context of an application - to get a String out.

pp :: Expr -> String
pp ex = cata ppAlg ex False

ghci> pp $ app (lam "x" (var "x")) (cnst 2)
"(\\x -> x) 2"

By replacing the Bool with an Int, this approach can be generalised to parenthesising operators with arbitrary precedences, as covered in @DanielWagner's linked answer.

Community
  • 1
  • 1
Benjamin Hodgson
  • 42,952
  • 15
  • 108
  • 157
  • 2
    A function `Bool -> String` is just a pair `(String, String)`. So you fold it twice, with and without parentheses, then select which one you want during the recursion. That's brilliant, +1 – V. Semeria Aug 23 '16 at 21:04
  • 2
    @V.Semeria Yep! That's exactly right. The `Bool -> String` version is better though. You do less work (because you only _actually_ fold it once). It also scales better to syntaxes with more than two precedences: if you have, say, 5 precedence levels in your language then working with `Int`s (or some type with five values, if you wanna be total) is much easier than working with tuples `(String, String, String, String, String)`. – Benjamin Hodgson Aug 23 '16 at 21:17
1

One solution is to use the {-# LANGUAGE PatternSynonyms #-} extension and define unidirectional patterns like:

pattern Apply' r1 r2 <- Apply (_,r1) (_,r2)

that you could then use in your definitions like this:

toString2 :: Fix ExprF -> String
toString2 = para $ \case
  Const x -> show x
  Var x -> x
  Lambda x (_,y) -> unwords [x, "=>", y]
  EList x -> unwords (snd <$> x)
  Apply ((Fix Lambda {}),x) (_,y) -> unwords ["(", x, ")", "(", y, ")"]
  Apply' x y -> unwords [x, "(", y, ")"]

Since ExprF is a Functor, another option would be simply to write:

toString2' :: Fix ExprF -> String
toString2' = para $ \case
  Apply ((Fix Lambda {}),x) (_,y) -> unwords ["(", x, ")", "(", y, ")"]
  other -> case fmap snd other of
      Const x -> show x
      Var x -> x
      Lambda x y -> unwords [x, "=>", y]
      Apply x y -> unwords [x, "(", y, ")"]

With the pattern synonym, and compiling with -Wall, I'm having trouble convincing the exhaustivity checker that the pattern matches are exhaustive.

danidiaz
  • 26,936
  • 4
  • 45
  • 95
  • This makes the code shorter but it doesn't directly address the question. OP wants to use structural recursion to write the pretty-printer – Benjamin Hodgson Aug 23 '16 at 20:14
  • 2
    The exhaustivity checker is completely broken with regard to pattern synonyms (or at least non-trivial ones). No one has a fix yet. – dfeuer Aug 23 '16 at 22:23
0

How about straight recursion for the missing case :

toString :: Fix ExprF -> String
toString (Fix (Apply (Fix (Lambda _ x)) y)) = "(" ++ toString x ++ ")(" ++ toString y ++ ")"
toString z = (cata $ \case
  Const x -> show x
  Var x -> x
  EList x -> unwords x
  Lambda x y -> unwords [x, "=>", y]
  Apply x y -> unwords [x, "(", y, ")"]) z
V. Semeria
  • 3,128
  • 1
  • 10
  • 25
  • 2
    This will fail since cata does not recurse using the first case e.g. in `Apply (Apply (Lambda ..) ..) ..` – chi Aug 23 '16 at 17:37
  • Then a recursion for each ExprF constructor. It's not much longer than para and it will probably be faster. – V. Semeria Aug 23 '16 at 17:45
  • I think the whole point of the question is about avoiding writing explicit recursion. – Daniel Wagner Aug 23 '16 at 18:00
  • @DanielWagner not necessary. The main point is to find what could be a better solution for the problem. – ais Aug 23 '16 at 18:13