
Recently I've finally started to feel like I understand catamorphisms. I wrote some about them in a recent answer, but briefly I would say a catamorphism for a type abstracts over the process of recursively traversing a value of that type, with the pattern matches on that type reified into one function for each constructor the type has. While I would welcome any corrections on this point or on the longer version in the answer of mine linked above, I think I have this more or less down and that is not the subject of this question, just some background.

Once I realized that the functions you pass to a catamorphism correspond exactly to the type's constructors, and that the arguments of those functions likewise correspond to the types of those constructors' fields, it all suddenly felt quite mechanical, and I don't see any wiggle room for alternate implementations.

For example, I just made up this silly type, with no real concept of what its structure "means", and derived a catamorphism for it. I don't see any other way I could define a general-purpose fold over this type:

data X a b f = A Int b
             | B
             | C (f a) (X a b f)
             | D a

xCata :: (Int -> b -> r)
      -> r
      -> (f a -> r -> r)
      -> (a -> r)
      -> X a b f
      -> r
xCata a b c d v = case v of
  A i x -> a i x
  B -> b
  C f x -> c f (xCata a b c d x)
  D x -> d x
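
As a rough illustration of how such a fold gets used, here is a minimal sketch. The type and `xCata` are repeated so the block stands alone; `countC`, `firstInt`, and `sample` are hypothetical names made up for this example, and `f` is arbitrarily chosen to be `Maybe`.

```haskell
-- The type and fold from above, repeated so this block is self-contained.
data X a b f
  = A Int b
  | B
  | C (f a) (X a b f)
  | D a

xCata :: (Int -> b -> r) -> r -> (f a -> r -> r) -> (a -> r) -> X a b f -> r
xCata a b c d v = case v of
  A i x -> a i x
  B     -> b
  C f x -> c f (xCata a b c d x)
  D x   -> d x

-- Hypothetical consumer: count the C layers wrapped around the final value.
countC :: X a b f -> Int
countC = xCata (\_ _ -> 0) 0 (\_ n -> n + 1) (const 0)

-- Hypothetical consumer: the Int stored in the innermost A, if there is one.
firstInt :: X a b f -> Maybe Int
firstInt = xCata (\i _ -> Just i) Nothing (\_ r -> r) (const Nothing)

-- Hypothetical sample value, with f = Maybe.
sample :: X Char Bool Maybe
sample = C (Just 'x') (C Nothing (A 7 True))
```

Each consumer supplies one function per constructor and never mentions recursion itself, which is the "mechanical" part described above.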

My question is, does every type have a unique catamorphism (up to argument reordering)? Or are there counterexamples: types for which no catamorphism can be defined, or types for which two distinct but equally reasonable catamorphisms exist? If there are no counterexamples (i.e., the catamorphism for a type is unique and trivially derivable), is it possible to get GHC to derive some sort of typeclass for me that does this drudgework automatically?

michid
amalloy
  • Pick a piece of your type expression, apply the isomorphism `a ~ forall b. (a -> b) -> b`, voilà. – Benjamin Hodgson Oct 04 '17 at 09:17
  • I've written a generator for catamorphisms in Template Haskell: https://github.com/KommuSoft/template-fun. It does not cover all type declarations yet, but that is not a fundamental issue. So afaik catamorphisms can be derived automatically. – Willem Van Onsem Oct 04 '17 at 09:19
  • @BenjaminHodgson I'm afraid that's a little too terse for me. I don't think I understand the suggestion at all. – amalloy Oct 04 '17 at 09:20
  • Sorry, I’ll be clearer. You were asking whether the catamorphism for a given type is unique. I say no, because there are (an infinite number of) identities you can always apply to any type to get a different but isomorphic type. The classic catamorphism `foldr :: (a -> b -> b) -> b -> [a] -> b` is isomorphic to `foldr :: (a -> b -> b) -> (forall x. (b -> x) -> x) -> [a] -> b`, for example. – Benjamin Hodgson Oct 04 '17 at 09:24
  • I've also written a [fold generator](https://github.com/effectfully/Generic/blob/master/Lib/Reflection/Fold.agda) (only in Agda and for higher-order inductive families rather than simple data types). It's not a hard task unless you consider more interesting data types: nested (like `Rose` trees), mutually recursive, ones with polymorphic recursion and such. – effectfully Oct 04 '17 at 12:14
  • @user3237465, rose trees aren't what I've usually seen called "nested" datatypes. As I've seen it, that term is typically used of non-regular types. You seem to be referring to the fact that a rose tree has a list of rose trees, so the catamorphism needs to `fmap` itself down to get `cataTree :: (a -> [r] -> r) -> Tree a -> r`? – dfeuer Oct 04 '17 at 17:11
  • @BenjaminHodgson I think I now understand why I've seen people write "up to isomorphism": your proposal clearly gets around my "up to argument reordering" requirement, but is also totally unsatisfying, because of course a value of type `forall x. (b -> x) -> x` is morally no different from a value of type `b`, despite being represented differently by GHC. – amalloy Oct 04 '17 at 18:17
  • @amalloy Indeed! In category theory you usually see people talk about objects being unique “up to unique isomorphism”, because you can’t in general talk about equality of objects. And in fact `cata` _is_ unique up to unique isomorphism, because `Fix` is the initial object in the category of algebras. [I’ll leave the full explanation to Bartosz Milewski](https://bartoszmilewski.com/2017/02/28/f-algebras/), who’s better at this than me :) – Benjamin Hodgson Oct 04 '17 at 18:46
  • @dfeuer, I wasn't aware "nested" is a proper term. Yes, I meant that `fmap` thing. – effectfully Oct 05 '17 at 06:35
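
The isomorphism from the first comments, `b ~ forall x. (b -> x) -> x`, can be spelled out in a few lines. This is only a sketch: `toCPS`, `fromCPS`, and `foldrCPS` are hypothetical names, not library functions, and the block assumes GHC with `RankNTypes`.

```haskell
{-# LANGUAGE RankNTypes #-}

-- One direction: a plain value becomes "a function awaiting a continuation".
toCPS :: b -> (forall x. (b -> x) -> x)
toCPS b k = k b

-- The other direction: recover the value by passing the identity continuation.
fromCPS :: (forall x. (b -> x) -> x) -> b
fromCPS k = k id

-- foldr with its second argument replaced by the isomorphic CPS form.
-- It carries no more information than foldr; it only looks different.
foldrCPS :: (a -> b -> b) -> (forall x. (b -> x) -> x) -> [a] -> b
foldrCPS f z = foldr f (fromCPS z)
```

Since `fromCPS . toCPS` is the identity, `foldrCPS f (toCPS z)` behaves like `foldr f z`, which is why this kind of variation is usually treated as "the same" fold up to isomorphism.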

2 Answers


The catamorphism associated to a recursive type can be derived mechanically.

Suppose you have a recursively defined type, having multiple constructors, each one with its own arity. I'll borrow OP's example.

data X a b f = A Int b
             | B
             | C (f a) (X a b f)
             | D a

Then, we can rewrite the same type by forcing each arity to be one, uncurrying everything. Arity zero (B) becomes one if we add a unit type ().

data X a b f = A (Int, b)
             | B ()
             | C (f a, X a b f)
             | D a

Then, we can reduce the number of constructors to one, exploiting Either instead of multiple constructors. Below, we just write infix + instead of Either for brevity.

data X a b f = X ((Int, b) + () + (f a, X a b f) + a)

At the term level, we know we can rewrite any recursive definition in the form x = f x where f w = ..., making the fixed-point equation x = f x explicit. At the type level, we can use the same method to refactor recursive types.

data X a b f   = X (F a b f (X a b f))   -- fixed point equation
data F a b f w = F ((Int, b) + () + (f a, w) + a)
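
The `+` notation above is shorthand, so here is a sketch of the same two definitions in compilable Haskell with `Either` and tuples. The smart constructors `mkA` .. `mkD` and the function `countC` are hypothetical helpers added for illustration. The `Functor` instance is written by hand in this sketch because, as far as I know, GHC's `DeriveFunctor` only accepts the mapped type variable in the last argument of a type constructor, and here `w` sits on the `Left` side of an `Either`.

```haskell
-- The non-recursive "shape" of X: sums become Either, products become tuples.
newtype F a b f w = F (Either (Int, b) (Either () (Either (f a, w) a)))

-- Map over the single hole w, which lives in the third summand.
instance Functor (F a b f) where
  fmap g (F s) = F (fmap (fmap (either (\(fa, w) -> Left (fa, g w)) Right)) s)

-- X ties the knot by plugging itself into the hole.
newtype X a b f = X (F a b f (X a b f))

-- Hypothetical smart constructors recovering the original four constructors.
mkA :: Int -> b -> X a b f
mkA i x = X (F (Left (i, x)))

mkB :: X a b f
mkB = X (F (Right (Left ())))

mkC :: f a -> X a b f -> X a b f
mkC fa rest = X (F (Right (Right (Left (fa, rest)))))

mkD :: a -> X a b f
mkD x = X (F (Right (Right (Right x))))

-- Hypothetical example of direct recursion on the encoded type.
countC :: X a b f -> Int
countC (X (F s)) = case s of
  Right (Right (Left (_, rest))) -> 1 + countC rest
  _                              -> 0
```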

Now, we note that we can autoderive a functor instance.

deriving instance Functor (F a b f)

This is possible because in the original type each recursive reference only occurred in positive position. If this does not hold, making F a b f not a functor, then we can't have a catamorphism.
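
To make the positivity condition concrete, here is a small sketch of a type where it fails. `Bad`, `BadF`, and `contramapBadF` are hypothetical names invented for this example.

```haskell
-- A recursive type whose self-reference sits to the left of an arrow,
-- i.e. in negative position.
newtype Bad = Bad (Bad -> Int)

-- Its would-be base functor: the hole w is consumed rather than produced.
newtype BadF w = BadF (w -> Int)

-- A law-abiding fmap :: (w -> v) -> BadF w -> BadF v is not available:
-- from h :: w -> Int and g :: w -> v there is no way to build a v -> Int
-- that still depends on h. Only the contravariant direction type-checks
-- in a meaningful way:
contramapBadF :: (v -> w) -> BadF w -> BadF v
contramapBadF g (BadF h) = BadF (h . g)
```

Without an `fmap` for the base functor, the generic recipe that follows has nothing to push the recursive call through.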

Finally, we can write the type of cata as follows:

cata :: (F a b f w -> w) -> X a b f -> w

Is this the OP's xCata type? It is. We only have to apply a few type isomorphisms. We use the following algebraic laws:

1) (a,b) -> c ~= a -> b -> c          (currying)
2) (a+b) -> c ~= (a -> c, b -> c)
3) ()    -> c ~= c

By the way, it's easy to remember these isomorphisms if we write (a,b) as a product a*b, unit () as 1, and a->b as a power b^a. Indeed, they become:

  1. c^(a*b) = (c^a)^b
  2. c^(a+b) = c^a*c^b
  3. c^1 = c
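
Each of the three laws is witnessed by a pair of ordinary Haskell functions. The sketch below names them `iso1To`/`iso1From` and so on; those names are made up here, while `curry`, `uncurry`, `either`, and `const` are the usual Prelude functions.

```haskell
-- (1) currying: ((a, b) -> c) ~ (a -> b -> c)
iso1To :: ((a, b) -> c) -> (a -> b -> c)
iso1To = curry

iso1From :: (a -> b -> c) -> ((a, b) -> c)
iso1From = uncurry

-- (2) a function out of a sum is a pair of functions
iso2To :: (Either a b -> c) -> (a -> c, b -> c)
iso2To h = (h . Left, h . Right)

iso2From :: (a -> c, b -> c) -> (Either a b -> c)
iso2From (f, g) = either f g

-- (3) a function out of unit is just a value
iso3To :: (() -> c) -> c
iso3To h = h ()

iso3From :: c -> (() -> c)
iso3From = const
```

In each pair, composing the two directions gives back a function that behaves like the one you started with, which is what makes the rewriting steps below safe.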

Anyway, let's start to rewrite the F a b f w -> w part, only

   F a b f w -> w
~= (def F)
   ((Int, b) + () + (f a, w) + a) -> w
~= (2)
   ((Int, b) -> w, () -> w, (f a, w) -> w, a -> w)
~= (3)
   ((Int, b) -> w, w, (f a, w) -> w, a -> w)
~= (1)
   (Int -> b -> w, w, f a -> w -> w, a -> w)

Let's consider the full type now:

cata :: (F a b f w -> w) -> X a b f -> w
     ~= (above)
        (Int -> b -> w, w, f a -> w -> w, a -> w) -> X a b f -> w
     ~= (1)
           (Int -> b -> w)
        -> w
        -> (f a -> w -> w)
        -> (a -> w)
        -> X a b f
        -> w

This is indeed (renaming w to r) the type we wanted:

xCata :: (Int -> b -> r)
      -> r
      -> (f a -> r -> r)
      -> (a -> r)
      -> X a b f
      -> r

The "standard" implementation of cata is

cata g = g . fmap (cata g) . unwrap
   where unwrap (X y) = y

It takes some effort to understand due to its generality, but this is indeed the intended one.
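
To see the pieces fit together, here is a self-contained sketch that uses a base functor with named constructors (`XF`, `AF`, `BF`, `CF`, `DF` are hypothetical names standing in for the nested-sum `F`) and recovers the four-argument `xCata` from the one-argument `cata`.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Base functor with named constructors; w marks the recursive position.
data XF a b f w = AF Int b | BF | CF (f a) w | DF a
  deriving Functor

newtype X a b f = X (XF a b f (X a b f))

-- Generic catamorphism: unwrap one layer, fold the children, apply the algebra.
cata :: (XF a b f w -> w) -> X a b f -> w
cata g = g . fmap (cata g) . unwrap
  where unwrap (X y) = y

-- The four-argument fold, obtained by packing the four functions
-- into a single algebra.
xCata :: (Int -> b -> r) -> r -> (f a -> r -> r) -> (a -> r) -> X a b f -> r
xCata a b c d = cata alg
  where
    alg (AF i x) = a i x
    alg BF       = b
    alg (CF f r) = c f r
    alg (DF x)   = d x
```

The `alg` function is exactly the tuple of four functions from the derivation above, written as a case analysis instead of a product.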


About automation: yes, this can be automated, at least in part. The recursion-schemes package on Hackage allows one to write something like

type X a b f = Fix (F a b f)
data F a b f w = ...  -- you can use the actual constructors here
       deriving Functor

-- use cata here

Example:

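{-# LANGUAGE DeriveFunctor, LambdaCase, PatternSynonyms, ScopedTypeVariables #-}
-- also hide the library's own ListF so the local definition below does not clash with it
import Data.Functor.Foldable hiding (ListF, Nil, Cons)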

data ListF a k = NilF | ConsF a k deriving Functor
type List a = Fix (ListF a)

-- helper patterns, so that we can avoid matching on the Fix
-- newtype constructor explicitly
pattern Nil = Fix NilF
pattern Cons a as = Fix (ConsF a as)

-- normal recursion
sumList1 :: Num a => List a -> a
sumList1 Nil         = 0
sumList1 (Cons a as) = a + sumList1 as

-- with cata
sumList2 :: forall a. Num a => List a -> a
sumList2 = cata h
   where
   h :: ListF a a -> a
   h NilF        = 0
   h (ConsF a s) = a + s

-- with LambdaCase
sumList3 :: Num a => List a -> a
sumList3 = cata $ \case
   NilF      -> 0
   ConsF a s -> a + s
amalloy
chi
  • `recursion-schemes` also has some Template Haskell to let you derive a base functor associated with a recursive type and also instances of `Recursive` and `Corecursive`. I'm curious whether the `Functor`-only thing is very deep, or whether there is a weaker sort of recursion scheme that could make sense for [exponential functors](https://hackage.haskell.org/package/invariant). – dfeuer Oct 04 '17 at 17:37
  • And it turns out the answer is yes! http://comonad.com/reader/2008/rotten-bananas/ – dfeuer Oct 04 '17 at 17:50
  • @dfeuer see also https://www.schoolofhaskell.com/user/edwardk/phoas, wherein he rebuilds the HOAS/`cata` machinery using profunctors, to a much more satisfying outcome – Benjamin Hodgson Oct 04 '17 at 18:33

A catamorphism (if it exists) is unique by definition. In category theory, a catamorphism denotes the unique homomorphism from an initial algebra into some other algebra. To the best of my knowledge, in Haskell all catamorphisms exist because Haskell's types form a Cartesian closed category in which terminal objects and all products, sums and exponentials exist. See also Bartosz Milewski's blog post about F-algebras, which gives a good introduction to the topic.
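
The "homomorphism" condition can be written down as an equation between Haskell functions. The sketch below defines its own `Fix` and `cata` locally (they mirror, but are not imported from, any library), uses a hypothetical algebra `sumAlg`, and states the equation `cata alg . Fix = alg . fmap (cata alg)` as two functions `lhs` and `rhs`. Evaluating both on a sample only spot-checks the equation; it is not a proof of uniqueness.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Locally defined fixed point of a functor.
newtype Fix f = Fix (f (Fix f))

-- The catamorphism for any Functor f.
cata :: Functor f => (f r -> r) -> Fix f -> r
cata alg (Fix layer) = alg (fmap (cata alg) layer)

data ListF a k = NilF | ConsF a k
  deriving Functor

-- Hypothetical algebra: sum the elements.
sumAlg :: ListF Int Int -> Int
sumAlg NilF        = 0
sumAlg (ConsF a s) = a + s

-- Both sides of the homomorphism equation, as functions on one unwrapped layer.
lhs, rhs :: ListF Int (Fix (ListF Int)) -> Int
lhs = cata sumAlg . Fix
rhs = sumAlg . fmap (cata sumAlg)

-- Hypothetical sample layer representing the list [1, 2].
sampleLayer :: ListF Int (Fix (ListF Int))
sampleLayer = ConsF 1 (Fix (ConsF 2 (Fix NilF)))
```

Any function satisfying this equation for a given algebra has to agree with `cata alg` on every finite value, which is the sense in which the catamorphism is unique.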

michid