10

In Idris, there's some magical machinery to automatically create (dependent) eliminators for user-defined types. I'm wondering if it's possible to do something (perhaps less dependent) with Haskell types. For instance, given

data Foo a = No | Yes a | Perhaps (Foo a)

I want to generate

foo :: b -> (a -> b) -> (b -> b) -> Foo a -> b
foo b _ _ No = b
foo _ f _ (Yes a) = f a
foo b f g (Perhaps c) = g (foo b f g x)

I'm pretty weak on polyvariadic functions and generics, so I could use a bit of help getting started.

dfeuer
  • 48,079
  • 5
  • 63
  • 167
  • 3
    You can't do this with generics, but the template haskell for this should be straightforward. – sclv Feb 29 '16 at 21:44
  • @sclv, why can't I do it with generics? – dfeuer Feb 29 '16 at 21:59
  • 2
    Because you want to generate functions whose _type_ is determined by the shape of the data structure. Generics let you provide functions which essentially have "one hole in the type" which is the type of the data structure with an instance of Generic that they are operating on. – sclv Feb 29 '16 at 22:58
  • @sclv, right. But why is that beyond the ability of associated types and closed type families? I'm not saying you're wrong; I just can't follow your argument. – dfeuer Feb 29 '16 at 23:09
  • 1
    I think this is doable. We can compute types from generic Rep-s as we like. – András Kovács Feb 29 '16 at 23:12
  • 5
    This is a typical example of needing to find functoriality which hasn't been helpfully pointed out in a datatype declaration, i.e. the way in which a `Foo a` structure can be seen as functorial in its substructures. Recursion schemes can often be given once and for all in terms of the functors whose fixpoints are the datatypes we're trying to work with. It may be worth considering that an easy way to get your hands on this functoriality is not to let go of it in the first place. – pigworker Feb 29 '16 at 23:41
  • @pigworker, how do you think this should be done from the standpoint of language design? Should algebraic datatypes be less general? Should the compiler provide some kind of special insight into the structure? – dfeuer Mar 01 '16 at 02:30
  • 3
    @dfeuer I'd like to see datatypes being characterised by first class data descriptions, effectively making typereps the organizing principle, rather than a subsequent rationalization. One can compute lots of useful generic operations on data if you have their descriptions to hand in inspectable form. – pigworker Mar 01 '16 at 12:15

3 Answers3

7

Here's a start of doing this using GHC Generics. Adding some code to reassociate the (:+:) would make this nicer. A few more instances are required and this probably has ergonomic problems.

EDIT: Bah, I got lazy and fell back to a data family to get injectivity for my type equality dispatch. This mildly changes the interface. I suspect with enough trickery, and/or using injective type families this can be done without a data family or overlapping instances.

{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where
import Data.Function (fix)
import GHC.Generics

data Foo a = No | Yes | Perhaps (Foo a) | Extra a Int Bool
    deriving (Show, Generic1)

data Bar a = Bar (Maybe a)
    deriving (Show, Generic1)

gcata :: (GCata (f a) (Rep1 f a), Generic1 f) => Alg (f a) (Rep1 f a) r -> f a -> r
gcata f = fix(\w -> gcata' w f . from1)

ex' :: Show a => Foo a -> String
ex' = gcata (("No","Yes"),(\(Rec s) -> "Perhaps ("++s++")", \a i b -> "Extra ("++show a++") ("++show i++") ("++show b++")"))

ex1 = ex' (Perhaps (Perhaps Yes) :: Foo Int)
ex2 = ex' (Perhaps (Perhaps (Extra 'a' 2 True)) :: Foo Char)

ex3 :: Foo a -> Foo a
ex3 = gcata ((No, Yes), (Perhaps . unRec, Extra))

ex4 = gcata (\(K m) -> show m) (Bar (Just 3))

class GCata rec f where
    type Alg (rec :: *) (f :: *) (r :: *) :: *
    gcata' :: (rec -> r) -> Alg rec f r -> f -> r

instance (GCata rec (f p)) => GCata rec (M1 i c f p) where
    type Alg rec (M1 i c f p) r = Alg rec (f p) r
    gcata' w f (M1 x) = gcata' w f x

instance (GCata rec (f p), GCata rec (g p)) => GCata rec ((f :+: g) p) where
    type Alg rec ((f :+: g) p) r = (Alg rec (f p) r, Alg rec (g p) r)
    gcata' w (l,_) (L1 x) = gcata' w l x
    gcata' w (_,r) (R1 x) = gcata' w r x

instance GCata rec (U1 p) where
    type Alg rec (U1 p) r = r
    gcata' _ f U1 = f

instance (Project rec (f p), GCata rec (g p)) => GCata rec ((f :*: g) p) where
    type Alg rec ((f :*: g) p) r = Prj rec (f p) r -> Alg rec (g p) r
    gcata' w f (x :*: y) = gcata' w (f (prj w x)) y

class Project rec f where
    type Prj (rec :: *) (f :: *) (r :: *) :: *
    prj :: (rec -> r) -> f -> Prj rec f r

instance (Project rec (f p)) => Project rec (M1 i c f p) where
    type Prj rec (M1 i c f p) r = Prj rec (f p) r
    prj w (M1 x) = prj w x

instance Project rec (K1 i c p) where
    type Prj rec (K1 i c p) r = c
    prj _ (K1 x) = x

instance (RecIfEq (TEq rec (f p)) rec (f p)) => Project rec (Rec1 f p) where
    type Prj rec (Rec1 f p) r = Tgt (TEq rec (f p)) rec (f p) r
    prj w (Rec1 x) = recIfEq w x

instance Project rec (Par1 p) where
    type Prj rec (Par1 p) r = p
    prj _ (Par1 x) = x

instance GCata rec (K1 i c p) where
    type Alg rec (K1 i c p) r = c -> r
    gcata' _ f (K1 x) = f x

instance GCata rec (Par1 p) where
    type Alg rec (Par1 p) r = p -> r
    gcata' _ f (Par1 x) = f x

instance (Project rec (Rec1 f p)) => GCata rec (Rec1 f p) where
    type Alg rec (Rec1 f p) r = Prj rec (Rec1 f p) r -> r
    gcata' w f = f . prj w 

data HTrue; data HFalse

type family TEq x y where
    TEq x x = HTrue
    TEq x y = HFalse

class RecIfEq b rec t where
    data Tgt b rec t r :: *
    recIfEq :: (rec -> r) -> t -> Tgt b rec t r

instance RecIfEq HTrue rec rec where
    newtype Tgt HTrue rec rec r = Rec { unRec :: r }
    recIfEq w = Rec . w

instance RecIfEq HFalse rec t where
    newtype Tgt HFalse rec t r = K { unK :: t }
    recIfEq _ = K
Derek Elkins left SE
  • 2,079
  • 12
  • 14
  • 1
    I have the sinking feeling the pain will come with recursive vs. non-recursive instances of `Rec1` (`data X a = X a (X a)` vs. `data Y a = X (Maybe a)`), which I predict will lead to horrors of overlapping instances and such. *shiver*. I'm still going to study your code for a while, because it looks like it has a lot of neat ideas for me to think about. – dfeuer Mar 01 '16 at 04:31
5

As pigworker remarked in the question comments, using the default Generic representation leads to great ugliness, since we don't have prior information about recursion in our type, and we have to dig out recursive occurrences by manually checking for type equality. I'd like to present here alternative solutions with explicit f-algebra-style recursion. For this, we need an alternative generic Rep. Sadly, this means we can't easily tap into GHC.Generics, but I hope this will be edifying nonetheless.

In my first solution I aim for a presentation that is as simple as possible within current GHC capabilities. The second solution is a TypeApplication-heavy GHC 8-based one with more sophisticated types.

Starting out as usual:

{-# language
  TypeOperators, DataKinds, PolyKinds,
  RankNTypes, EmptyCase, ScopedTypeVariables,
  DeriveFunctor, StandaloneDeriving, GADTs,
  TypeFamilies, FlexibleContexts, FlexibleInstances #-}

My generic representation is a fixpoint of a sum-of-products. It slightly extends the basic model of generics-sop, which is also a sum-of-products but not functorial and therefore ill-equipped for recursive algorithms. I think SOP is overall a much better practical representation than arbitrarily nested types; you can find extended arguments as to why this is the case in the paper. In short, SOP removes unnecessary nesting information and lets us separate metadata from basic data.

But before anything else, we should decide on a code for generic types. In vanilla GHC.Generics there isn't a well-defined kind of codes, as the type constructors of sums, products etc. form an ad-hoc type-level grammar, and we can dispatch on them using type classes. We adhere more closely to usual presentations in dependently typed generics, and use explicit codes, interpretations and functions. Our codes shall be of kind:

[[Maybe *]]

The outer list encodes a sum of constructors, with each inner [Maybe *] encoding a constructor. A Just * is just a constructor field, while Nothing denotes a recursive field. For example, the code of [Int] is ['[], [Just Int, Nothing]].

type Rep a = Fix (SOP (Code a)) 

class Generic a where
  type Code a :: [[Maybe *]]
  to   :: a -> Rep a
  from :: Rep a -> a

data NP (ts :: [Maybe *]) (k :: *) where
  Nil  :: NP '[] k
  (:>) :: t -> NP ts k -> NP (Just t ': ts) k
  Rec  :: k -> NP ts k -> NP (Nothing ': ts) k
infixr 5 :>  

data SOP (code :: [[Maybe *]]) (k :: *) where
  Z :: NP ts k -> SOP (ts ': code) k
  S :: SOP code k -> SOP (ts ': code) k

Note that NP has different constructors for recursive and non-recursive fields. This is quite important, because we want codes to be unambiguously reflected in the type indices. In other words, we would like NP to also act as a singleton for [Maybe *] (although we remain parametric in * for good reasons).

We use a k parameter in the definitions to leave a hole for recursion. We set up recursion as usual, leaving the Functor instances to GHC:

deriving instance Functor (SOP code)
deriving instance Functor (NP code)

newtype Fix f = In {out :: f (Fix f)}

cata :: Functor f => (f a -> a) -> Fix f -> a
cata phi = go where go = phi . fmap go . out

We have two type families:

type family CurryNP (ts :: [Maybe *]) (r :: *) :: * where
  CurryNP '[]             r = r
  CurryNP (Just t  ': ts) r = t -> CurryNP ts r
  CurryNP (Nothing ': ts) r = r -> CurryNP ts r

type family Alg (code :: [[Maybe *]]) (r :: *) :: * where
  Alg '[]         r = ()
  Alg (ts ': tss) r = (CurryNP ts r, Alg tss r)

CurryNP ts r curries NP ts with result type r, and it also plugs in r in the recursive occurrences.

Alg code r computes the type of an algebra on SOP code r. It tuples together the eliminators for the individual constructors. Here we use plain nested tuples, but of course HList-s would be adequate too. We could also reuse NP here as a HList, but I find that too kludgy.

All that's left is to implement the functions:

uncurryNP :: CurryNP ts a -> NP ts a -> a
uncurryNP f Nil        = f
uncurryNP f (x :> xs)  = uncurryNP (f x) xs
uncurryNP f (Rec k xs) = uncurryNP (f k) xs

algSOP :: Alg code a -> SOP code a -> a
algSOP fs (Z np)  = uncurryNP (fst fs) np
algSOP fs (S sop) = algSOP (snd fs) sop

gcata :: Generic a => Alg (Code a) r -> a -> r
gcata f = cata (algSOP f) . to

The key point here is that we have to convert the curried eliminators in Alg into a "proper" SOP code a -> a algebra, since that is the form that can be directly used in cata.

Let's define some sugar and instances:

(<:) :: a -> b -> (a, b)
(<:) = (,)
infixr 5 <:

instance Generic (Fix (SOP code)) where
  type Code (Fix (SOP code)) = code
  to   = id
  from = id  

instance Generic [a] where
  type Code [a] = ['[], [Just a, Nothing]]  
  to   = foldr (\x xs -> In (S (Z (x :> Rec xs Nil)))) (In (Z Nil))
  from = gcata ([] <: (:) <: ()) -- note the use of "Generic (Rep [a])"

Example:

> gcata (0 <: (+) <: ()) [0..10]
55

Full code.


However, it would be nicer if we had currying and didn't have to use HList-s or tuples to store eliminators. The most convenient way is to have the same order of arguments as in standard library folds, such as foldr or maybe. In this case the return type of gcata is given by a type family that computes from the generic code of a type.

type family CurryNP (ts :: [Maybe *]) (r :: *) :: * where
  CurryNP '[]             r = r
  CurryNP (Just t  ': ts) r = t -> CurryNP ts r
  CurryNP (Nothing ': ts) r = r -> CurryNP ts r

type family Fold' code a r where
  Fold' '[]         a r = r
  Fold' (ts ': tss) a r = CurryNP ts a -> Fold' tss a r

type Fold a r = Fold' (Code a) r (a -> r)

gcata :: forall a r. Generic a => Fold a r

This gcata is highly (fully) ambiguous. We need either explicit application or Proxy, and I opted for the former, incurring a GHC 8 dependence. However, once we supply an a type, the result type reduces, and we can easily curry:

> :t gcata @[_] 
gcata @[_] :: Generic [t] => r -> (t -> r -> r) -> [t] -> r
> :t gcata @[_] 0
gcata @[_] 0 :: Num t1 => (t -> t1 -> t1) -> [t] -> t1
> gcata @[_] 0 (+) [0..10]
55

I used above a partial type signature in [_]. We can also create a shorthand for this:

gcata1 :: forall f a r. Generic (f a) => Fold (f a) r
gcata1 = gcata @(f a) @r

Which can be used as gcata1 @[].

I'd rather not elaborate the implementation of the above gcata here. It's not much longer than the simple version, but the gcata implementation is pretty hairy (embarrassingly, it's responsible for my delayed answer). Right now I couldn't explain it very well, since I wrote it with Agda aid, which entails plenty of automatic search and type tetris.

András Kovács
  • 29,931
  • 3
  • 53
  • 99
  • I haven't worked through most of this yet, but I noticed only one hole for recursion. What about mutually recursive types? – dfeuer Mar 04 '16 at 20:01
  • Mutual recursion is possible, we need [indexed functors](https://www.reddit.com/r/haskell/comments/3sm1j1/how_to_mix_the_base_functorrecursion_scheme_stuff/cwyr61h) for that. Maybe I'll do the indexed version too. – András Kovács Mar 04 '16 at 20:08
  • That's a really beautiful effort! – sclv Mar 20 '16 at 03:46
1

As has been said in the comments and other answers, it's best to start from a generic representation that has access to the recursive positions.

One library that works with such a representation is multirec (another is compdata):

{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE GADTs, TypeFamilies, MultiParamTypeClasses, RankNTypes #-}
module FooFold where

import Generics.MultiRec.FoldAlgK
import Generics.MultiRec.TH

data Foo a = No | Yes a | Perhaps (Foo a)

data FooF :: * -> * -> * where
  Foo :: FooF a (Foo a)

deriveAll ''FooF

foldFoo :: (r, (a -> r, r -> r)) -> Foo a -> r
foldFoo phi = fold (const phi) Foo

The FoldAlgK module provides a fold with a single result type and computes the algebra type as a nested pair. It would be relatively easy to additionally curry that. There are some other variants offered by the library.

kosmikus
  • 19,549
  • 3
  • 51
  • 66