12

I recently learned about GADTs and their notation:

E.g.

data Maybe a where
  Nothing :: Maybe a
  Just    :: a -> Maybe a

data Either a b where
  Left  :: a -> Either a b
  Right :: b -> Either a b

data Bool where
  False :: Bool
  True  :: Bool

Now I noticed a similiarity to functions like bool, and either, which is basically just like the GADT definition:

  1. taking every line as an argument
  2. replacing the actual type with the next letter of the alphabet
  3. and finally returning a function Type -> (the letter of step 2)

E.g.

maybe  :: b -> (a -> b) -> Maybe a -> b
either :: (a -> c) -> (b -> c) -> Either a b -> c
bool   :: a -> a -> Bool -> a

This also includes foldr, but I noticed that e.g. Tuple doesn't have such a function, though you could easily define it:

tuple :: (a -> b -> c) -> (a,b) -> c
tuple f (x,y) = f x y

What is this pattern? It seems to me these functions alleviate the need for pattern matching (because they give a general way for each case) and thus every function operating on the type can be defined in terms of this function.

Philipp Claßen
  • 41,306
  • 31
  • 146
  • 239
hgiesel
  • 5,430
  • 2
  • 29
  • 56
  • 3
    You mean [uncurry](http://hackage.haskell.org/package/base-4.9.1.0/docs/Prelude.html#v:uncurry)? – bheklilr Jul 20 '17 at 21:05
  • I don't see the relation between my concept and currying – hgiesel Jul 20 '17 at 21:06
  • "*Functions like `bool`*"? – Willem Van Onsem Jul 20 '17 at 21:06
  • 2
    `bool` is a function in Haskell defined in `Data.Bool` – hgiesel Jul 20 '17 at 21:07
  • 3
    @hgiesel: ah, thanks for the reference :) This actually looks like a *catamorphism* to me :) – Willem Van Onsem Jul 20 '17 at 21:08
  • I feel more like a catamorphism is a specialized version of what I describe, because a catamorphism works of a function that is `(f a -> a)`, while this is `(f a -> b)` – hgiesel Jul 20 '17 at 21:14
  • 1
    I think @bheklilr is saying that your `tuple` function is identical to `uncurry`. – 4castle Jul 20 '17 at 21:15
  • Ok, I see, yes, he is right. – hgiesel Jul 20 '17 at 21:18
  • 6
    This is indeed a catamorphism, albeit you are using a different choice of type variables. A general cata has type `(F a -> a) -> T -> a` where `T=Fix F`. Above, `T=Maybe b= 1+b` and `F a -> a` becomes `(1+b)->a` which is isomorphic to `(a , b->a)` so the cata becomes `(a, b->a) -> Maybe b -> a` which is your `maybe`, except for renamed tyvars and isos/currying. – chi Jul 20 '17 at 21:22
  • 2
    @chi I'm sorry, I don't quite get your answer, could you maybe elaborate it in an answer? – hgiesel Jul 20 '17 at 21:37
  • 7
    This is often referred to as _Church Encoding_ but in a typed language it is more properly _Boehm-Berarducci Encoding_. – Rein Henrichs Jul 21 '17 at 01:25
  • @chi could you please clarify about the `T = Maybe b = Fix F = F a` thing? – Will Ness Jul 21 '17 at 09:10
  • @WillNess The `Maybe b` type is not recursive, unlike lists/trees. Hence, it can be seen as a "fixed point" `Fix F` of the _constant_ type function `F a = 1+b`, so that `T=Fix F=F(Fix F)=1+b` holds trivially, since `F a` does not depend on `a`. – chi Jul 21 '17 at 10:27
  • @hgiesel This is a bit broad. Some discussion can be found in the [documentation](https://stackoverflow.com/documentation/haskell/2984/recursion-schemes) but it's still preliminary, and needs more examples, IMHO. I also answered a related question on [CS.SE](https://cs.stackexchange.com/questions/78101/what-are-structural-recursion-primitive-recursion-recursion-combinator-and-rec) yesterday. – chi Jul 21 '17 at 10:45
  • @chi thank you. the "constant type function" did the trick. – Will Ness Jul 21 '17 at 15:32
  • 1
    @hgiesel also, on simpler level (and this may be obvious), this can be referred to as "case analysis" functions: `maybe b f v = case v of Nothing -> b; Just a -> f a`. Number of arguments obviously corresponds to the number of cases in the data type definition. Having such functions defined for each datatype obviates the need to have the `case` primitive as part of a language. One could imagine a language without `case` expressions, where datatype definition facilities include defining such functions. With tuples there's only one case, hence case analysis is a no-op: `tuple f (x,y) = f x y`. – Will Ness Jul 21 '17 at 17:53

1 Answers1

4

First, the types you mention are not really GADTs, they are plain ADTs, since the return type of each constructor is always T a. A proper GADT would be something like

data T a where
   K1 :: T Bool  -- not T a

Anyway, the technique you mention is a well known method to encode algebraic data types into (polymorphic) functions. It goes under many names, like Church encoding, Boehm-Berarducci encoding, endcoding as a catamorphism, etc. Sometimes the Yoneda lemma is used to justify this approach, but there's no need to understand the category-theoretic machinery to understand the method.

Basically, the idea is the following. All the ADTs can be generated by

  • product types (,) a b
  • sum types Either a b
  • arrow types a -> b
  • unit type ()
  • void type Void (rarely used in Haskell, but theoretically nice)
  • variables (if the type bing defined has parameters)
  • possibly, basic types (Integer, ...)
  • type level-recursion

Type level recursion is used when some value constructor takes the type which is being defined as an argument. The classic example is Peano-style naturals:

data Nat where
   O :: Nat
   S :: Nat -> Nat
     -- ^^^ recursive!

Lists are also common:

data List a where
   Nil :: List a
   Cons :: a -> List a -> List a
             -- ^^^^^^ recursive!

Types like Maybe a, pairs, etc. are non recursive.

Note that each ADT, recursive or not, can be reduced to a single constructor with a sigle argument, by summing (Either) over all the constructors, and multiplying all the arguments. For instance, Nat is isomorphic to

data Nat1 where
  O1 :: () -> Nat
  S1 :: Nat -> Nat

which is isomorphic to

data Nat2 where K2 :: (Either () Nat) -> Nat

Lists become

data List1 a where K1 :: (Either () (a, List a)) -> List a

The steps above make use of the algebra of types, which makes the sum and products of types obey the same rules as high school algebra, while a -> b behaves like the exponential b^a.

Hence, we can write any ADT in the form

-- pseudo code
data T where
   K :: F T -> T
type F k = .....

For instance

type F_Nat k = Either () k      -- for T = Nat
type F_List_a k = Either () (a, k) -- for T = List a

(Note that the latter type function F depends on a, but it's not important right now.)

Non recursive types will not use k:

type F_Maybe_a k = Either () a     -- for T = Maybe a

Note that constructor K above makes the type T isomorphic to F T (let's ignore the lifting / extra bottom introduced by it). Essentially, we have that

Nat ~= F Nat = Either () Nat
List a ~= F (List a) = Either () (a, List a)
Maybe a ~= F (Maybe a) = Either () a

We can even formalize this further by abstracting from F

newtype Fix f = Fix { unFix :: f (Fix f) }

By definition Fix F will now be isomorphic to F (Fix F). We could let

type Nat = Fix F_Nat

(In Haskell, we need a newtype wrapper around F_Nat, which I omit for clarity.)

Finally, the general encoding, or catamorphism, is:

cata :: (F k -> k) -> Fix F -> k

This assumes that F is a functor.

For Nat, we get

cata :: (Either () k -> k) -> Nat -> k
-- isomorphic to
cata :: (() -> k, k -> k) -> Nat -> k
-- isomorphic to
cata :: (k, k -> k) -> Nat -> k
-- isomorphic to
cata :: k -> (k -> k) -> Nat -> k

Note the "high school albegra" steps, where k^(1+k) = k^1 * k^k, hence Either () k -> k ~= (() -> k, k -> k).

Note that we get two arguments, k and k->k which correspond to O and S. This is not a coincidence -- we summed over all the constructors. This means that cata expects to be passed the value of type k which "plays the role of O" there, and then the value of type k -> k which plays the role of S.

More informally, cata is telling us that, if we want to map a natural in k, we only have to state what is the "zero" inside k and how to take the "successor" in k, and then every Nat can be mapped consequently.

For lists we get:

cata :: (Either () (a, k) -> k) -> List a -> k
-- isomorphic to
cata :: (() -> k, (a, k) -> k) -> List a -> k
-- isomorphic to
cata :: (k, a -> k -> k) -> List a -> k
-- isomorphic to
cata :: k -> (a -> k -> k) -> List a -> k

which is foldr.

Again, this is cata telling us that, if we state how to take the "empty list" in k and to "cons" a and k inside k, we can map any list in k.

Maybe a is the same:

cata :: (Either () a -> k) -> Maybe a -> k
-- isomorphic to
cata :: (() -> k, a -> k) -> Maybe a -> k
-- isomorphic to
cata :: (k, a -> k) -> Maybe a -> k
-- isomorphic to
cata :: k -> (a -> k) -> Maybe a -> k

If we can map Nothing in k, and perform Just mapping a in k, the we can map any Maybe a in k.

If we try to apply the same approach to Bool and (a,b) we reach the functions which were posted in the questions.

More advanced theoretical topics to look up:

  • (initial) F-algebras in category theory
  • eliminators / recursors / induction principles in type theory (these can be applied to GADTs as well)
chi
  • 111,837
  • 3
  • 133
  • 218