6

I hear a lot about dependent types nowadays and I heard that DataKinds is somehow related to dependent typing (but I am not sure about this... just heard it on a Haskell Meetup).

Could someone illustrate with a super simple Haskell example what dependent typing is and what it is good for?

Wikipedia says that dependent types can help prevent bugs. Could you give a simple example of how dependent types in Haskell can prevent bugs?

Something that I could start using in five minutes right now to prevent bugs in my Haskell code?

Dependent types are basically functions from values to types. How can this be used in practice? Why is that good?

jhegedus
  • 6
    You won't find something you could start using in five minutes to prevent bugs, because dependent types are not a magic recipe you can apply to everything, and even when they apply it's not necessarily simple. – mb14 Sep 27 '15 at 11:16
  • Right, but it still would be nice to know what all the hype is all about and why a mere mortal should care or should he/she really care? Or is this dependent typing story all just "too academical" to be useful in practice? – jhegedus Sep 27 '15 at 11:18
  • 3
    Yes, that's it exactly. Anything that takes more than five minutes to learn is too academical to be useful in practice. – melpomene Sep 27 '15 at 11:28
  • 1
    @melpomene, yeah like walking, or talking ;-) – mb14 Sep 27 '15 at 11:32
  • [Related](http://stackoverflow.com/questions/27029564/how-do-i-build-a-list-with-a-dependently-typed-length). – effectfully Sep 27 '15 at 12:31

4 Answers

13

Late to the party, this answer is basically a shameless plug.

Sam Lindley and I wrote a paper about Hasochism, the pleasure and pain of dependently typed programming in Haskell. It gives plenty of examples of what's possible now in Haskell and draws points of comparison (favourable as well as not) with the Agda/Idris generation of dependently typed languages.

Although it is an academic paper, it is about actual programs, and you can grab the code from Sam's repo. We have lots of little examples (e.g. orderedness of mergesort output) but we end up with a text editor example, where we use indexing by width and height to manage screen geometry: we make sure that components are regular rectangles (vectors of vectors, not ragged lists of lists) and that they fit together exactly.

The key power of dependent types is to maintain consistency between separate data components (e.g., the head vector in a matrix and every vector in its tail must all have the same length). That's never more important than when writing conditional code. The situation (which will one day come to be seen as having been ridiculously naïve) is that the following are all type-preserving rewrites

  • if b then t else e => if b then e else t
  • if b then t else e => t
  • if b then t else e => e

Although we are presumably testing b because it gives us some useful insight into what would be appropriate (or even safe) to do next, none of that insight is mediated via the type system: the idea that b's truth justifies t and its falsity justifies e is missing, despite being critical.
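A minimal sketch of that point in plain Haskell, with no extensions. The function names (`viaBool`, `swapped`, `viaMatch`) are made up for illustration. Pattern matching on an ordinary list is not dependent typing, but it shows the same idea on a small scale: a test that hands back evidence instead of a bare Bool.

```haskell
-- Boolean test: nothing links the result of `null` to the safety of `head`.
viaBool :: [a] -> Maybe a
viaBool xs = if null xs then Nothing else Just (head xs)

-- The branches are swapped. This is wrong, yet it typechecks just as happily.
swapped :: [a] -> Maybe a
swapped xs = if null xs then Just (head xs) else Nothing

-- Pattern matching: the test itself hands over the element, so no partial
-- function is needed and `x` is only in scope in the branch that has it.
viaMatch :: [a] -> Maybe a
viaMatch []      = Nothing
viaMatch (x : _) = Just x

main :: IO ()
main = do
  print (viaBool  [1, 2, 3 :: Int])  -- Just 1
  print (swapped  [1, 2, 3 :: Int])  -- Nothing (wrong answer, no type error)
  print (viaMatch [1, 2, 3 :: Int])  -- Just 1
```

Calling `swapped []` would crash at run time, which is exactly the kind of mistake the Bool-based version cannot rule out.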

Plain old Hindley-Milner does give us one means to ensure some consistency. Whenever we have a polymorphic function

f :: forall a. r[a] -> s[a] -> t[a]

we must instantiate a consistently: however the first argument fixes a, the second argument must play along, and we learn something useful about the result while we are at it. Allowing data at the type level is useful because some forms of consistency (e.g. lengths of things) are more readily expressed in terms of data (numbers).

But the real breakthrough is GADT pattern matching, where the type of a pattern can refine the type of the argument it matches. You have a vector of length n; you look to see whether it's nil or cons; now you know whether n is zero or not. This is a form of testing where the type of the code in each case is more specific than the type of the whole, because in each case something which has been learned is reflected at the type level. It is learning by testing which makes a language dependently typed, at least to some extent.
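As a rough illustration (a sketch with hypothetical names `Vec`, `vzip` and `toList`, not code from the paper), here is a length-indexed vector where both ideas show up: the index `n` must be instantiated consistently across the two arguments, and matching on the first argument tells the type checker what shape the second one must have.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

-- Unary naturals; DataKinds promotes Z and S to the type level.
data Nat = Z | S Nat

-- A list whose length is recorded in its type.
data Vec (n :: Nat) (a :: Type) where
  Nil  :: Vec 'Z a
  (:>) :: a -> Vec n a -> Vec ('S n) a

infixr 5 :>

-- Both arguments share the same n. Matching Nil on the left teaches the
-- checker that n is 'Z, so the right argument can only be Nil; matching
-- (:>) teaches it that n is 'S of something, so the right is a cons too.
vzip :: Vec n a -> Vec n b -> Vec n (a, b)
vzip Nil       Nil       = Nil
vzip (x :> xs) (y :> ys) = (x, y) :> vzip xs ys

-- Forget the length again, just so the result can be printed.
toList :: Vec n a -> [a]
toList Nil       = []
toList (x :> xs) = x : toList xs

main :: IO ()
main = print (toList (vzip (1 :> 2 :> Nil) ('a' :> 'b' :> Nil)))
-- prints [(1,'a'),(2,'b')]
```

Zipping vectors of different lengths, such as `vzip (1 :> Nil) Nil`, is rejected at compile time, and `vzip` needs no clause for mismatched lengths.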

Here's a silly game to play, whatever typed language you use. Replace every type variable and every primitive type in your type expressions with 1 and evaluate types numerically (sum the sums, multiply the products, s -> t means t-to-the-s) and see what you get: if you get 0, you're a logician; if you get 1, you're a software engineer; if you get a power of 2, you're an electronic engineer; if you get infinity, you're a programmer. What's going on in this game is a crude attempt to measure the information we're managing and the choices our code must make. Our usual type systems are good at managing the "software engineering" aspects of coding: unpacking and plugging together components. But as soon as a choice has been made, there is no way for types to observe it, and as soon as there are choices to make, there is no way for types to guide us: non-dependent type systems approximate all values in a given type as the same. That's a pretty serious limitation on their use in bug prevention.
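One possible reading of the game, worked through for a few familiar Haskell types. The choice of types is an assumption made here for illustration, as is treating an empty type as the empty sum and a list as the recursive sum it is usually defined by:

```latex
\begin{align*}
(a,\ b)                 &\mapsto 1 \times 1 = 1 \\
a \to b                 &\mapsto 1^{1} = 1 \\
\texttt{Either}\ a\ b   &\mapsto 1 + 1 = 2 \\
\texttt{Void}           &\mapsto 0 \\
[a] = 1 + a \times [a]  &\mapsto 1 + 1 + 1 + \cdots = \infty
\end{align*}
```

Under this reading, pairs and functions between opaque types count as 1 (plugging components together), a two-way choice counts as 2, and anything recursive counts as infinity.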

pigworker
  • Church would be disappointed being counted as a software engineer. – effectfully Sep 28 '15 at 16:19
  • Thanks for the intuitive explanation, I've just bought the Manning book on Idris, maybe all this will be more solid in my head by next summer :) – jhegedus Sep 28 '15 at 17:27
  • So the point is that if somewhere deep down in my program I add two vectors, then with dependent typing I can be sure that I will never have a run time exception because those two vectors are of different length. Right? – jhegedus Sep 28 '15 at 17:44
  • Yes, and the compiler shouldn't even generate code to check for that exceptional condition. – pigworker Sep 28 '15 at 17:45
  • 1
    To provide a third-party endorsement of @pigworker's "shameless plug": _Hasochism_ is a fantastic paper. I'm a "regular" Haskell programmer, not a mathematician or a professor or a type-theorist, and the paper was the first I'd ever heard of dependent types. It took me a couple of reads-through to grasp the ideas in the article, but I haven't read a better introduction to the concept (even though the paper isn't pitched as an introduction!). – Benjamin Hodgson Oct 02 '15 at 15:46
  • @BenjaminHodgson Thank you! Sam and I had a great time writing that paper. And I should add that much of it was provoked by questions on this site. – pigworker Oct 02 '15 at 21:33
  • Btw, just reading the Idris book atm, it really blows my mind :) super good intro, easy for "dummies" :) – jhegedus Dec 28 '15 at 09:01
5

The common example is to encode the length of a list in its type, so you can write things like this (pseudo-code):

cons :: a -> List a n -> List a (n+1)

where n is an integer. This lets you specify that adding an element to a list increments its length by one.

You can then prevent head (which gives you the first element of a list) from being run on an empty list:

 head :: n > 0 => List a n -> a

Or do things like

to3uple :: List a 3 -> (a,a,a)

The problem with this approach is that you can't call head on an arbitrary list without first proving that the list is not empty.

Sometimes the compiler can do the proof itself, e.g.:

 head (a `cons` l)

Otherwise, you have to do things like

 if null list
    then ...
    else (head list)

Here it's safe to call head, because you are in the else branch and therefore guaranteed that the list is not empty.
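In today's Haskell, an `if null list` test does not refine any type, but the same "check once, then call head safely" pattern can be approximated with `Data.List.NonEmpty` from base. This is a sketch of that idea, not dependent typing proper; `firstOf` and `describe` are hypothetical names.

```haskell
import Data.List.NonEmpty (NonEmpty, nonEmpty)
import qualified Data.List.NonEmpty as NE

-- Total: the argument type already rules out the empty list.
firstOf :: NonEmpty a -> a
firstOf = NE.head

-- nonEmpty performs the emptiness check once and returns the evidence
-- as a value of type NonEmpty, which is what firstOf asks for.
describe :: [Int] -> String
describe xs = case nonEmpty xs of
  Nothing -> "empty input"
  Just ne -> "first element: " ++ show (firstOf ne)

main :: IO ()
main = do
  putStrLn (describe [])         -- empty input
  putStrLn (describe [3, 1, 2])  -- first element: 3
```

Calling `firstOf` directly on a plain list is a type error, so the check cannot be forgotten.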

However, Haskell doesn't have full dependent types at the moment, so the examples above won't work as nicely as written. You should still be able to declare this kind of list using DataKinds, because it lets you promote an integer literal to the type level, which allows instantiating List a n as, say, List Int 1 (n is a phantom type parameter taking a literal).

If you are interested in this kind of safety, you can have a look at Liquid Haskell.

Here is an example of such code:

{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}

import GHC.TypeLits


data List a (n:: Nat) = List [a] deriving Show

cons :: a -> List a n -> List a (n + 1)
cons x (List xs) = List (x:xs)

singleton :: a -> List a 1
singleton x = List [x]

data NonEmpty
data EmptyList

type family ListLength a where
  ListLength (List a 0) =  EmptyList
  ListLength (List a n) = NonEmpty

head' :: (ListLength (List a n) ~ NonEmpty) => List a n -> a
head' (List xs) = head xs

tail' :: (ListLength (List a n) ~ NonEmpty) => List a n -> List a (n-1)
tail' (List xs) = List (tail xs)

list :: List String 1
list = singleton "a"

-- In GHCi, head' list returns "a"

Trying to do head' (tail' list) doesn't compile and gives:

Couldn't match type ‘EmptyList’ with ‘NonEmpty’
Expected type: NonEmpty
  Actual type: ListLength (List [Char] 0)
In the expression: head' (tail' list)
In an equation for ‘it’: it = head' (tail' list)
mb14
  • This is cool, how can I try this right now ? :) Could you give a few line example that I can compile and run ? Or get a type error if want to take the head of an empty list? – jhegedus Sep 27 '15 at 11:47
  • So basically you run this program (checking List Empty-ess) at compile time. I still don't see how this differs from a simple test. I mean is dependent typing going to help to prevent me from giving bad input to a program? Or what is the point of dependent typing? Should the input into the program also already be present at compile time? Is dependent typing useful for programs when the input is not known until runtime? – jhegedus Sep 27 '15 at 13:04
  • 1
    The difference from a simple test is that you (or your library's users) CAN NOT forget to do it: it is enforced by the compiler. There is indeed a problem with runtime values; it's still doable but far beyond this "5 minute" example, which is one of the reasons dependent types are not mainstream yet ;-). – mb14 Sep 27 '15 at 13:24
  • 3
    @jhegedus You don't need to know the input at compile time. You can make sure the input is valid (through normal means), then dependent types can help prevent some incorrect manipulations of the data *inside* the program after you know the input is ok. Those dependent type errors are caught at compile time, so you never need to worry about missing an edge case, like you do with tests. Tests are still important, but dependent types provide an additional, useful (and different) tool. – David Young Sep 27 '15 at 13:28
  • @mb14 Ok, then maybe one day somehow I will understand how the "longer than 5 min example" will work. But I get the point. The useful stuff is not very simple stuff. – jhegedus Sep 27 '15 at 13:42
  • If someone can recommend a good and easy source on a "longer than 5 min" example, let me know. – jhegedus Sep 27 '15 at 17:36
  • I compiled the snippet, it needed TypeOperators too with GHC 7.10.2. Thanks for the working code example ! – jhegedus Sep 27 '15 at 17:45
  • Here is an answer for my comment that I posted on Sep 27 '15 at 13:04 http://docs.idris-lang.org/en/latest/effects/simpleeff.html#exceptions in that webpage it is described that dependent types enforce that user data must be validated before it enters into the "inner, pure, total and proved to be correct" part of the program. – jhegedus Jan 26 '16 at 16:00
  • This is the precise link http://docs.idris-lang.org/en/latest/effects/simpleeff.html#vadd-revisited – jhegedus Jan 26 '16 at 16:06
4

Adding to @mb14's example, here's some simpler working code.

First, we need DataKinds, GADTs, and KindSignatures to really make it clear:

{-# LANGUAGE DataKinds      #-}
{-# LANGUAGE GADTs          #-}
{-# LANGUAGE KindSignatures #-}

Now let's define a Nat type, and a Vector type based on it:

data Nat :: * where
    Z :: Nat
    S :: Nat -> Nat

data Vector :: Nat -> * -> * where
    Nil   :: Vector Z a
    (:-:) :: a -> Vector n a -> Vector (S n) a

And voilà: lists using dependent types that can be called safe in certain circumstances.

Here are the head and tail functions:

head' :: Vector (S n) a -> a
head' (a :-: _) = a
-- The other constructor, Nil, doesn't apply here because of the type signature!

tail' :: Vector (S n) a -> Vector n a
tail' (_ :-: xs) = xs
-- Ditto here.

This is a more concrete and understandable example than the one above, but it does the same sort of thing.

Note that in Haskell, types can influence values, but values cannot influence types in the same dependent way. There are languages such as Idris that are similar to Haskell but also support value-to-type dependent typing, which I would recommend looking into.
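For data that only arrives at run time, one common workaround is to hide the length behind an existential wrapper and recover it by pattern matching. The sketch below assumes the Vector definition from this answer, written with `Type` from `Data.Kind` instead of `*`; `SomeVector`, `fromList` and `firstOrDefault` are hypothetical names.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

import Data.Kind (Type)

data Nat = Z | S Nat

data Vector :: Nat -> Type -> Type where
  Nil   :: Vector 'Z a
  (:-:) :: a -> Vector n a -> Vector ('S n) a

infixr 5 :-:

-- A vector of some length that is not known statically.
data SomeVector a where
  SomeVector :: Vector n a -> SomeVector a

-- Build a vector from an ordinary (run-time) list.
fromList :: [a] -> SomeVector a
fromList []       = SomeVector Nil
fromList (x : xs) = case fromList xs of
  SomeVector v -> SomeVector (x :-: v)

head' :: Vector ('S n) a -> a
head' (a :-: _) = a

-- Matching on the constructor reveals whether the hidden length is
-- 'Z or 'S n; only in the second clause does head' v typecheck.
firstOrDefault :: a -> SomeVector a -> a
firstOrDefault d (SomeVector Nil)         = d
firstOrDefault _ (SomeVector v@(_ :-: _)) = head' v

main :: IO ()
main = do
  print (firstOrDefault 0 (fromList []))         -- 0
  print (firstOrDefault 0 (fromList [7, 8, 9]))  -- 7
```

The length is still not computed from a value at the type level; the program simply learns it, one constructor at a time, by testing.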

AJF
  • What do you mean by types can influence values in Haskell ? What do you mean by influence ? Influence how ? – jhegedus Sep 27 '15 at 17:57
  • @jhegedus I.e., typeclasses and GADTs allow values to be restricted by types, but you can't determine a type via a value like you can in Idris. [This](https://gist.github.com/paulkoerbitz/11281614) example in Idris may explain. See line 76, the type signature refers to a value. – AJF Sep 27 '15 at 20:02
3

The machines package lets users define machines that can request values. Many machines request only one type of value, but it's also possible to define machines that sometimes ask for one type and sometimes ask for another type. The requests are values of a GADT type, which allows the value of the request to determine the type of the response.

Step k o r = ...
           | forall t . Await (t -> r) (k t) r

The machine provides a request of type k t for some unspecified type t, and a function to deal with the result. By pattern matching on the request, the machine runner learns what type it must supply the machine. The machine's response handler doesn't need to check that it got the right sort of response.
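The idea can be sketched without the machines package. Everything below is a hypothetical, cut-down stand-in (`Request`, `Machine`, `run`, `example`), not the library's actual API; it only keeps the part where the request's constructor fixes the type of the response.

```haskell
{-# LANGUAGE GADTs, ExistentialQuantification #-}

-- A request whose constructor determines the type of the answer.
data Request t where
  AskInt    :: Request Int
  AskString :: Request String

-- A machine is either finished, or it awaits an answer of some type t
-- that is hidden from the outside and tied to the request it carries.
data Machine r
  = Done r
  | forall t. Await (t -> Machine r) (Request t)

-- The runner learns which type to supply by matching on the request.
run :: Machine r -> r
run (Done r)      = r
run (Await k req) = case req of
  AskInt    -> run (k 42)       -- here t is Int
  AskString -> run (k "hello")  -- here t is String

-- The handlers use n as an Int and s as a String with no run-time checks.
example :: Machine String
example =
  Await (\n -> Await (\s -> Done (s ++ " " ++ show (n + 1))) AskString) AskInt

main :: IO ()
main = putStrLn (run example)  -- hello 43
```

Answering `AskInt` with a String, or handling its response as a String, is rejected by the type checker.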

dfeuer