This is probably a very basic question, but ...

A function that's defined as, say

foo :: a -> Integer

denotes a function from any type to an Integer. If so, then in theory one should be able to define it for any type, like so

foo 1 = 10
foo 5.3 = 100
foo (x:xs) = -1
foo  _     = 0

But Haskell only allows a general definition, like foo a = 0.

And even if you restrict a to be one of a certain class of types, such as an instance of the Show typeclass:

foo :: (Show a) => a -> Integer

you still can't do something like

foo "hello" = 10
foo   _     = 0

even though "hello" :: [Char] is an instance of Show.

Why is there such a restriction?

Matvey Aksenov
user1425230

5 Answers

32

It's a feature, and actually a very fundamental one. It boils down to a property known as parametricity in programming language theory. Roughly, it means that evaluation must never depend on types that are still variables at compile time: you cannot inspect a value whose concrete type you do not know statically.

Why is that good? It gives much stronger invariants about programs. For example, you know from the type alone that a -> a has to be the identity function (or diverge). Similar "free theorems" apply to many other polymorphic functions. Parametricity is also the basis for more advanced type-based abstraction techniques. For example, the type ST s a in Haskell (the state monad), and the type of the corresponding runST function, rely on s being parametric. That ensures that the computation being run has no way of messing with the internal representation of the state.
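
As a small illustration of that first point (myId is just an illustrative name; the runST signature quoted in the comment is the one from Control.Monad.ST):

-- A total function of type a -> a cannot inspect its argument,
-- so returning it unchanged is essentially the only thing it can do:
myId :: a -> a
myId x = x

-- runST exploits the same idea: its rank-2 type keeps the state
-- thread 's' abstract, so callers can never observe the internal
-- representation of the state.
--
--   runST :: (forall s. ST s a) -> a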

It is also important for efficient implementation. A program does not have to pass around costly type information at run time (type erasure), and the compiler can choose overlapping representations for different types. As an example of the latter, 0 and False and () and [] are all represented by 0 at runtime. This wouldn't be possible if a function like yours were allowed.

Andreas Rossberg
21

Haskell enjoys an implementation strategy known as "type erasure": types have no computational significance, so the code that you emit doesn't need to track them. This is a significant boon for performance.

The price you pay for this performance benefit is that types have no computational significance: a function can't change its behavior based on the type of an argument it is passed. If you were to allow something like

f () = "foo"
f [] = "bar"

then that property would not be true: the behavior of f would, indeed, depend on the type of its first argument.
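
If you do want this kind of behaviour in Haskell, the usual move is to push the dispatch back to compile time with a type class. A minimal sketch, assuming a made-up class name F:

class F a where
    f :: a -> String

instance F () where
    f () = "foo"

instance F [b] where
    f _ = "bar"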

There are certainly languages that allow this kind of thing, especially dependently typed languages, where types generally can't be erased anyway.

Daniel Wagner
  • It might be theoretically possible to just break the definition of `f` up into all possible types at compile time, but for the existence of existential types. – Louis Wasserman May 30 '12 at 07:52
  • This is interesting. When I programmed in Java, I was frequently annoyed by type erasure. It meant that a `List<T>` didn't know what `T` was at runtime, so you couldn't write code like `if (x instanceof T) {...}`. But I've never even noticed that Haskell uses type erasure. I wonder why that is. – Chris Taylor May 30 '12 at 08:23
  • Part of it is that since Haskell doesn't have "subtypes," you almost always know the _exact_ type of each variable at compile time, whereas in Java you might receive different implementations of the same interface. In the rare exceptions, like existentially quantified types, you've _deliberately_ thrown away that type information, so you must've known you wouldn't need that information in the future. – Louis Wasserman May 30 '12 at 09:40
  • @ChrisTaylor: Good point, I've had the same experience. Thinking about it a bit, I suspect the reason is sub-typing. Almost any time I wanted/needed to use `instanceof` was because of sub-typing. Also, sum types let you write code that works basically like that, except you have to wrap everything in an ADT. It's like using `instanceof` except with all the possibilities made explicit. – Tikhon Jelvis May 30 '12 at 09:40
  • @ChrisTaylor Probably because GHC uses type erasure as an optimization and Java uses type erasure as a guarantee. `typeOf x` in Haskell will call the correct instance of `Typeable` for x. Though there is a price to pay for generic functions... they're certainly not free. The generic tax will be paid at compile time, runtime or both. – Nathan Howell May 30 '12 at 09:47
  • @ChrisTaylor: You never notice in Haskell because you never use reflection in Haskell. Checked downcasts and `instanceof` are reflective features -- and that's the only place where you want to look at runtime types (Java also has to do it for array assignment, because of its broken subtyping rules, but that's a different story). In Haskell, you don't usually need reflection because you have more adequate language constructs for expressing typical program patterns, in particular, algebraic datatypes. – Andreas Rossberg May 30 '12 at 09:59
  • Actually, though, in the presence of a `Typeable` context, Haskell _does_ support the features the OP expects! – Louis Wasserman May 30 '12 at 10:04
  • While type erasure is fast compared to dynamic typing, it is typically not as fast as using explicitly specialised code for each possible type (like C++ templates), because then you also don't need dictionaries / vtable pointers. Fortunately, GHC can do that as well, at least for certain local functions. Unlike Java, I think. – leftaroundabout May 30 '12 at 10:15
  • @LouisWasserman: Yeah, well, if only Typeable wasn't such a terrible, entirely unsound hack... The way it actually works, I cannot seriously regard it as reflection but just as an abomination. :) – Andreas Rossberg May 30 '12 at 10:44
  • That's a fair position, and frankly, I do my best to avoid it. But it is the primary way to approximate reflection in Haskell. – Louis Wasserman May 30 '12 at 10:57
20

For a function a -> Integer there's only one behaviour which is allowed - return a constant integer. Why? Because you have no idea what type a is. With no constraints specified, it could be absolutely anything at all, and because Haskell is statically typed you need to resolve everything to do with types at compile time. At runtime the type information no longer exists and thus cannot be consulted - all choices of which functions to use have already been made.

The closest Haskell allows to this kind of behaviour is the use of typeclasses - if you made a typeclass called Foo with one function:

class Foo a where
    foo :: a -> Integer

Then you could define instances of it for different types

instance Foo [a] where
    foo [] = 0
    foo (x:xs) = 1 + foo xs

instance Foo Float where
    foo 5.2 = 10
    foo _ = 100

Then, as long as you can guarantee that some parameter x is a Foo, you can call foo on it. You still need that guarantee, though - you can't then write a function

bar :: a -> Integer
bar x = 1 + foo x

Because the compiler doesn't know that a is an instance of Foo. You have to tell it, or leave out the type signature and let it figure it out for itself.

bar :: Foo a => a -> Integer
bar x = 1 + foo x
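
With those instances in scope, a GHCi session would be expected to go roughly like this (illustrative output):

*Main> bar "abc"
4
*Main> bar (5.2 :: Float)
11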

Haskell can only operate with as much information as the compiler has about the type of something. This may sound restrictive, but in practice typeclasses and parametric polymorphism are so powerful I never miss dynamic typing. In fact I usually find dynamic typing annoying, because I'm never entirely sure what anything actually is.

Matthew Walton
16

The type a -> Integer does not really mean "function from any type to Integer" as you're describing it. When a definition or expression has type a -> Integer, it means that for any type T, it is possible to specialize or instantiate this definition or expression into a function of type T -> Integer.

Switching notation slightly, one way to think of this is that foo :: forall a. a -> Integer is really a function of two arguments: a type a and a value of that type a. Or, in terms of currying, foo :: forall a. a -> Integer is a function that takes a type T as its argument, and produces a specialized function of type T -> Integer for that T. Using the identity function as an example (the function that produces its argument as its result), we can demonstrate this as follows:

-- | The polymorphic identity function (not valid Haskell!)
id :: forall a. a -> a
id = \t -> \(x :: t) -> x

This idea of implementing polymorphism as a type argument to a polymorphic function comes from a mathematical framework called System F, which Haskell actually uses as one of its theoretical sources. Haskell completely hides the idea of passing type parameters as arguments to functions, however.
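
In fact, GHC will let you supply these hidden type arguments yourself if you enable the TypeApplications extension (a later addition to GHC, mentioned here only to make the "type as an extra argument" reading concrete):

{-# LANGUAGE TypeApplications #-}

-- 'id' is instantiated at the type Int explicitly, and only then
-- applied to an ordinary value argument:
five :: Int
five = id @Int 5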

Luis Casillas
  • +1 while the other answers say some great stuff, I believe this explanation of implicit type passing is The Correct Answer. Consider the difference in Typed Racket between the type `(Any -> Integer)` and `(All (A) (A -> Integer))`. Since Haskell does not have subtyping, the former is impossible. – Dan Burton May 30 '12 at 19:15
  • +1, but there's one thing I'd like to add: universal quantification can be viewed as a function taking an extra type argument only when the set of all types (`*` in Haskell) is abstract (i.e. you cannot pattern match on it). If it wasn't, you could very well have a function `\(t :: *) -> case t of Int -> ...`. As an example, the `id` function in Agda has type `(A : Set) → A → A`; because we aren't allowed to pattern match on `Set`, `id` must have form `id A x = x`. – Vitus May 30 '12 at 23:20
12

This question is based on a mistaken premise: Haskell can do that! (Although it's usually only done in very specific circumstances.)

{-# LANGUAGE ScopedTypeVariables, NoMonomorphismRestriction #-}

import Data.Generics

q1 :: Typeable a => a -> Int
q1 = mkQ 0 (\s -> if s == "aString" then 100 else 0)

q2 :: Typeable a => a -> Int
q2 = extQ q1 (\(f :: Float) -> round f)

Load this and experiment with it:

Prelude Data.Generics> q2 "foo"
0
Prelude Data.Generics> q2 "aString"
100
Prelude Data.Generics> q2 (10.1 :: Float)
10

This doesn't necessarily conflict with the answers that claim types have no computational significance. That's because these examples require the Typeable constraint, which reifies types into data values that are accessible at runtime.
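
To give a rough idea of what's going on, mkQ can be thought of as attempting a type-safe cast at runtime and falling back to a default result when the cast fails. A sketch (myMkQ is an illustrative reimplementation, not the library source):

import Data.Typeable (Typeable, cast)

-- Try to view the argument at the more specific type 'b';
-- if the runtime types don't match, return the default result.
myMkQ :: (Typeable a, Typeable b) => r -> (b -> r) -> a -> r
myMkQ def f x = maybe def f (cast x)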

Most so-called generic functions (e.g. SYB) rely on either a Typeable or a Data constraint. Some packages introduce their own alternative functions to serve essentially the same purpose. Without something like these classes, it's not possible to do this.

John L