23

Many statically typed languages have parametric polymorphism. For example in C# one can define:

T Foo<T>(T x){ return x; }

In a call site you can do:

int y = Foo<int>(3);

These types are also sometimes written like this:

Foo :: forall T. T -> T

I have heard people say "forall is like lambda-abstraction at the type level". So Foo is a function that takes a type (for example int), and produces a value (for example a function of type int -> int). Many languages infer the type parameter, so that you can write Foo(3) instead of Foo<int>(3).
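The same distinction between explicit and inferred type arguments can be sketched in Haskell. This is a minimal illustration, assuming GHC with the `TypeApplications` extension; the names `foo`, `explicitCall` and `inferredCall` are made up for the example:

```haskell
{-# LANGUAGE ExplicitForAll, TypeApplications #-}

-- The polymorphic identity, with the quantifier written out.
foo :: forall t. t -> t
foo x = x

-- Explicit instantiation, analogous to Foo<int>(3) in C#.
explicitCall :: Int
explicitCall = foo @Int 3

-- Here the type argument is inferred from the result type.
inferredCall :: Int
inferredCall = foo 3
```

Both definitions evaluate to 3; the `@Int` only makes visible a choice that the type checker would otherwise make silently.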

Suppose we have an object f of type forall T. T -> T. What we can do with this object is first pass it a type Q by writing f<Q>. Then we get back a value with type Q -> Q. However, certain f's are invalid. For example this f:

f<int> = (x => x+1)
f<T> = (x => x)

So if we "call" f<int> then we get back a value with type int -> int, and in general if we "call" f<Q> then we get back a value with type Q -> Q, so that's good. However, it is generally understood that this f is not a valid thing of type forall T. T -> T, because it does something different depending on which type you pass it. The idea of forall is that this is explicitly not allowed. Also, if forall is lambda for the type level, then what is exists? (i.e. existential quantification). For these reasons it seems that forall and exists are not really "lambda at the type level". But then what are they? I realize this question is rather vague, but can somebody clear this up for me?
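For comparison, here is a rough Haskell sketch of such a type-dispatching `f`. It assumes GHC's `Data.Typeable`, and it cannot be written at the plain type `forall a. a -> a` (at least not in the safe fragment of the language): the extra `Typeable a` constraint is what gives the function access to runtime type information, so the signature itself reveals that `f` may behave differently at different types.

```haskell
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Behaves like (+ 1) when a is Int and like the identity otherwise.
-- The Typeable constraint is an assumption of this sketch: it is the
-- price paid for being allowed to inspect the type at runtime.
f :: Typeable a => a -> a
f x = case cast x :: Maybe Int of
  -- x is an Int: add one, then cast the result back to a.
  Just n  -> fromMaybe x (cast (n + 1))
  -- Any other type: return the argument unchanged.
  Nothing -> x
```

With these definitions, `f (3 :: Int)` evaluates to 4, while `f "abc"` returns `"abc"` unchanged.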


A possible explanation is the following:

If we look at logic, quantifiers and lambda are two different things. An example of a quantified expression is:

forall n in Integers: P(n)

So there are two parts to forall: a set to quantify over (e.g. Integers), and a predicate (e.g. P). Forall can be viewed as a higher order function:

forall n in Integers: P(n) == forall(Integers,P)

With type:

forall :: Set<T> -> (T -> bool) -> bool

Exists has the same type. Forall is like an infinite conjunction, where S[n] is the n-th element of the set S:

forall(S,P) = P(S[0]) ∧ P(S[1]) ∧ P(S[2]) ...

Exists is like an infinite disjunction:

exists(S,P) = P(S[0]) ∨ P(S[1]) ∨ P(S[2]) ...
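These two definitions can be sketched directly in Haskell if the set is represented as a plain list, which is an assumption of this example. The names `forAll` and `thereExists` are made up; they behave like the Prelude's `all` and `any` with the arguments flipped:

```haskell
-- forall(S,P) = P(S[0]) ∧ P(S[1]) ∧ P(S[2]) ...
forAll :: [t] -> (t -> Bool) -> Bool
forAll s p = foldr (\x acc -> p x && acc) True s

-- exists(S,P) = P(S[0]) ∨ P(S[1]) ∨ P(S[2]) ...
thereExists :: [t] -> (t -> Bool) -> Bool
thereExists s p = foldr (\x acc -> p x || acc) False s
```

Because `||` is lazy in its second argument, `thereExists` can even return `True` on an infinite list as soon as it finds a witness, whereas `forAll` can only ever return `False` on one.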

If we do an analogy with types, we could say that the type analogue of ∧ is computing the intersection type ∩, and the type analogue of ∨ computing the union type ∪. We could then define forall and exists on types as follows:

forall(S,P) = P(S[0]) ∩ P(S[1]) ∩ P(S[2]) ...
exists(S,P) = P(S[0]) ∪ P(S[1]) ∪ P(S[2]) ...

So forall is an infinite intersection, and exists is an infinite union. Their types would be:

forall, exists :: Set<T> -> (T -> Type) -> Type

For example, the type of the polymorphic identity function is written as follows. Here Types is the set of all types, -> is the type constructor for functions, and => is lambda abstraction:

forall(Types, t => (t -> t))

Now a thing of type forall T:Type. T -> T is a value, not a function from types to values. It is a value whose type is the intersection of all types T -> T where T ranges over all types. When we use such a value, we do not have to apply it to a type. Instead, we use a subtype judgement:

id :: forall T:Type. T -> T
id = (x => x)

id2 = id :: int -> int

This downcasts id to have type int -> int. This is valid because int -> int also appears in the infinite intersection.

This works out nicely I think, and it clearly explains what forall is and how it is different from lambda, but this model is incompatible with what I have seen in languages like ML, F#, C#, etc. For example in F# you do id<int> to get the identity function on ints, which does not make sense in this model: id is a function on values, not a function on types that returns a function on values.


Can somebody with knowledge of type theory explain what exactly are forall and exists? And to what extent is it true that "forall is lambda at the type level"?

Jules
  • If you are into this kind of stuff, you should be *very* interested in [Types and Programming Languages](http://www.cis.upenn.edu/~bcpierce/tapl/). – Dan Burton Apr 09 '12 at 05:54

3 Answers

17

Let me address your questions separately.

  • Calling forall "a lambda at the type level" is inaccurate for two reasons. First, it is the type of a lambda, not the lambda itself. Second, that lambda lives on the term level, even though it abstracts over types (lambdas on the type level exist as well, they provide what is often called generic types).

  • Universal quantification does not necessarily imply "same behaviour" for all instantiations. That is a particular property called "parametricity" that may or may not be present. The plain polymorphic lambda calculus is parametric, because you simply cannot express any non-parametric behaviour. But if you add constructs like typecase (a.k.a. intensional type analysis) or checked casts as a weaker form of that, then you lose parametricity. Parametricity implies nice properties, e.g. it allows a language to be implemented without any runtime representation of types. And it induces very strong reasoning principles, see e.g. Wadler's paper "Theorems for free!". But it's a trade-off: sometimes you want dispatch on types.

  • Existential types essentially denote pairs of a type (the so-called witness) and a term, sometimes called packages. One common way to view these is as implementation of abstract data types. Here is a simple example:

    pack (Int, (λx. x, λx. x)) : ∃ T. (Int → T) × (T → Int)

    This is a simple ADT whose representation is Int and that only provides two operations (as a nested tuple), for converting ints in and out of the abstract type T. This is the basis of type theories for modules, for example.

  • In summary, universal quantification provides client-side data abstraction, while existential types dually provide implementor-side data abstraction.

  • As an additional remark, in the so-called lambda cube, forall and arrow are generalised to the unified notion of Π-type (where T1→T2 = Π(x:T1).T2 and ∀A.T = Π(A:*).T) and likewise exists and tupling can be generalised to Σ-types (where T1×T2 = Σ(x:T1).T2 and ∃A.T = Σ(A:*).T). Here, the type * is the "type of types".
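The existential package from the third point can be approximated in Haskell. This is only a sketch, assuming GHC's `ExistentialQuantification` extension; the names `Package`, `Pack`, `intPackage`, `stringPackage` and `roundTrip` are invented for illustration:

```haskell
{-# LANGUAGE ExistentialQuantification #-}

-- Roughly ∃ T. (Int → T) × (T → Int): the constructor hides the
-- witness type t and exposes only the two operations.
data Package = forall t. Pack (Int -> t) (t -> Int)

-- The package from the text: the representation type is Int.
intPackage :: Package
intPackage = Pack id id

-- A second implementation of the same interface, with String as
-- the hidden representation.
stringPackage :: Package
stringPackage = Pack show read

-- A client can only combine the operations; it cannot learn which
-- representation a given package uses.
roundTrip :: Package -> Int -> Int
roundTrip (Pack inject project) n = project (inject n)
```

Here `roundTrip intPackage 42` and `roundTrip stringPackage 42` both give 42, and nothing in the type of `roundTrip` lets the client tell the two packages apart.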

Andreas Rossberg
  • So a value of type `forall T. Q` is a function of type `* -> Q`, and in principle it can be any such function, except that some languages only allow you to express a certain subset that happens to obey parametricity? And the model that I described in "A possible explanation..." is invalid, or at least not used in type theory? The Π and Σ seem very similar to the forall and exists that I described there, except that they do product & tagged union rather than union & intersection? – Jules Apr 08 '12 at 13:43
  • Yes, pretty much. In type theory, you usually work with fully explicit proof terms, i.e. all products and sums, whether finite or infinite, come with explicit introduction and elimination forms. That said, people have also looked into notions of union and intersection types without such explicit forms, but those tend to have rather hairy meta-theoretical properties even in the finite case (e.g. type checking often is undecidable). – Andreas Rossberg Apr 08 '12 at 14:04
  • So just like you can violate parametricity if you can dispatch based on the type passed in (like `f<int>` and `f<T>` above), you can also violate the information hiding of existentials in the same way? For example if you have a package `(T, (a,b))`, you can analyze the first component, and if it is a type you know about you can violate information hiding. For example if you checked whether `T` is an Int, and if it is, you can now read the values returned by `a`? – Jules Apr 08 '12 at 14:14
  • Parametricity is a property of values, right? For example if we have a value `f`, whether `f` is a parametric identity function is an observable property. Isn't saying that `f` is a parametric identity function exactly the same as saying that `f` has BOTH the type `int -> int`, `string -> string`, etc. for all types? (i.e. not that for every type we have a different `f`, we have a single value `f`). So the type that does guarantee parametricity regardless of what language features the language has, is the infinite intersection of `t -> t` for all types `t`? – Jules Apr 08 '12 at 14:22
  • Yes, exactly, lack of parametricity breaks data abstraction through existentials. To remedy, you can introduce a mechanism for generating new type names at runtime (you can find a couple of papers on this very topic on my homepage if you are interested). -- Not sure I fully follow your second question. That a single function has several (or even infinitely many) types does not imply that it behaves the same on all. It depends on what primitives the language has. – Andreas Rossberg Apr 08 '12 at 14:46
  • Sorry, I should have explained it better. First, let's define the set of types as the powerset of the set of values, or equivalently, a type is any predicate on values (could be a non-computable one, and in fact most are not computable). What does it *mean* for the function `id : 'a -> 'a` to be parametric? I think it means that if we apply it to any value of type T, then the return value will also be in type T. So even with typecase, `f x = if x is int then x+1 else x` is not parametric, because take the singleton type T = {2}. We have 2 in T, but f 2 is not in T, so f is not parametric. – Jules Apr 08 '12 at 18:43
  • So we could introduce a new construct along with forall, that explicitly ensures parametricity in the type system regardless of what features the language has. Let's call it `intersect`. For example `intersect a:Type. a -> a` takes the infinite intersection type with a ranging over all types (i.e. any predicate on values). I think this ensures parametricity. So now a language can have both values that take types as arguments (and possibly dispatch on them), *and* you can have parametricity. Has something like this been explored? Or do you see obvious reasons why it can't work? – Jules Apr 08 '12 at 18:48
  • Yes, different solutions along these lines have been used. The simplest one perhaps is to distinguish between analyzable and non-analyzable types in the kind system. So you can have either ∀A:*.T or ∀A:$.T, where $ is the kind of analyzable types (a subkind of *). Typecase requires its type arguments to be analyzable. A somewhat different solution is used in Haskell/GHC, where analyzable types are expressed with the `Typeable` type class. – Andreas Rossberg Apr 08 '12 at 19:37
  • Thanks for your help. Your work on non parametric parametricity was especially helpful. I'm still digesting all this information. Do you have a book recommendation? I do have TAPL but it does not explain all these things in detail. – Jules Apr 10 '12 at 18:21
  • You're welcome. Unfortunately, I'm not aware of any decent book discussing this matter in any detail beyond TAPL. You probably have to look at actual research papers at that point. What exactly are you looking for? – Andreas Rossberg Apr 10 '12 at 19:05
  • Can you recommend some must read papers? I'm interested in any interesting topics related to types. For example one issue is that all stuff in types comes in pairs: products & sums, universals & existentials, intersections & unions, etc. This seems somehow related to continuations. Also related is that the meaning of the type `f : int -> string` is not only saying something about f "when called with an int, f returns a string", but it's also saying something about the continuation: "the continuation will only call f with an int, and it will be happy with a string". – Jules Apr 11 '12 at 17:58
  • Do you have a recommendation that clears up this confusion? Another topic that I'd like to learn more about is this. With dynamic checking there is a notion of completeness. When you put an assertion in your program, there is a complete procedure that can tell you if the assertion can fail: just run the program on all possible inputs. Because there are countably many, if your program can fail the assertion, then it will find the problem. Is there a similar notion of types/type inference in stronger type systems? You could of course enumerate all possible types and check if they are valid. – Jules Apr 11 '12 at 18:01
  • However, in the assertion problem finding case, there is a lot of work to actually make it practical. For example MSR's Pex is a tool that employs a SMT solver to efficiently find assertion failures instead of brute force iteration over all possible inputs. Is there similar work to make *practical* complete type inference algorithms for expressive type systems? – Jules Apr 11 '12 at 18:03
  • @Jules, I don't believe SO comments are a good technical medium and have adequate visibility for the interesting technical questions you have. Why not create a "forum topic" on [Lambda The Ultimate](http://lambda-the-ultimate.org/)? – gasche Apr 11 '12 at 20:16
  • I'm afraid that "any interesting topic related to types" is a bit too broad, cause there are thousands of papers. And I don't understand your other question -- enumerating countably many possibilities doesn't give you a complete algorithm. It's semi-complete at best. – Andreas Rossberg Apr 11 '12 at 22:58
  • _universal quantification provides client-side data abstraction_ - why do you refer to data? Aren't we abstracting from types? Otherwise a very educational answer. Thank you for sharing your knowledge! –  Oct 21 '17 at 11:50
  • @ftor, "data abstraction" as in abstract data types. You are right that this is abstracting types, but with them the representation of associated data, which is where the term comes from. – Andreas Rossberg Oct 22 '17 at 16:37
8

A few remarks to complement the two already-excellent answers.

First, one cannot say that forall is lambda at the type level because there already is a notion of lambda at the type level, and it is different from forall. It appears in System F_omega, an extension of System F with type-level computation, which is useful to explain ML module systems for example (F-ing modules, by Andreas Rossberg, Claudio Russo and Derek Dreyer, 2010).

In (a syntax for) System F_omega you can write for example:

type prod =
  lambda (a : *). lambda (b : *).
    forall (c : *). (a -> b -> c) -> c

This is a definition of the "type constructor" prod, such that prod a b is the type of the Church encoding of the product type (a, b). If there is computation at the type level, then you need to control it if you want to ensure termination of type-checking (otherwise you could define the type (lambda t. t t) (lambda t. t t)). This is done by using a "type system at the type level", or a kind system. prod would be of kind * -> * -> *. Only the types at kind * can be inhabited by values; types at higher kinds can only be applied at the type level. lambda (c : k) . .... is a type-level abstraction that cannot be the type of a value, and may live at any kind of the form k -> ..., while forall (c : k) . .... classifies values that are polymorphic in some type c : k and is necessarily of ground kind *.
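For readers who want to experiment, the `prod` definition can be approximated in Haskell. This is a sketch assuming GHC's `RankNTypes` extension; the parameters of the type synonym play the role of the type-level lambdas, and the helper names `pair`, `first` and `second` are made up for illustration:

```haskell
{-# LANGUAGE RankNTypes #-}

-- Type-level abstraction over a and b (kind * -> * -> *), whose body
-- is a forall-quantified type of kind *.
type Prod a b = forall c. (a -> b -> c) -> c

-- Build a Church-encoded pair: hand both components to the consumer k.
pair :: a -> b -> Prod a b
pair x y = \k -> k x y

-- Project a component by instantiating c and passing a selector.
first :: Prod a b -> a
first p = p (\x _ -> x)

second :: Prod a b -> b
second p = p (\_ y -> y)
```

For instance, `first (pair 1 "one")` yields 1 and `second (pair 1 "one")` yields "one". Note that `Prod` on its own is not the type of any value; only fully applied forms such as `Prod Int String` are.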

Second, there is an important difference between the forall of System F and the Pi-types of Martin-Löf type theory. In System F, polymorphic values do the same thing on all types. As a first approximation, you could say that a value of type forall a . a -> a will (implicitly) take a type t as input and return a value of type t -> t. But that suggests that there may be some computation happening in the process, which is not the case. Morally, when you instantiate a value of type forall a. a -> a into a value of type t -> t, the value does not change. There are three (related) ways to think about it:

  • System F quantification has type erasure: you can forget about the types and you will still know what the dynamic semantics of the program is. When we use ML type inference to leave the polymorphism abstraction and instantiation implicit in our programs, we don't really let the inference engine "fill holes in our program", if you think of "program" as the dynamic object that will be run and compute.

  • A forall a . foo is not something that "produces an instance of foo for each type a", but a single type foo that is "generic in an unknown type a".

  • You can explain universal quantification as an infinite conjunction, but there is a uniformity condition that all conjuncts have the same structure, and in particular that their proofs are all alike.

By contrast, Pi-types in Martin-Löf type theory are really more like function types that take something and return something. That's one of the reasons why they can easily be used not only to depend on types, but also to depend on terms (dependent types).

This has very important implications once you're concerned about the soundness of those formal theories. System F is impredicative (a forall-quantified type quantifies over all types, itself included), and the reason why it's still sound is this uniformity of universal quantification. While introducing non-parametric constructs is reasonable from a programmer's point of view (and we can still reason about parametricity in a generally non-parametric language), it very quickly destroys the logical consistency of the underlying static reasoning system. Martin-Löf's predicative theory is much simpler to prove correct and to extend in a correct way.

For a high-level description of this uniformity/genericity aspect of System F, see Fruchart and Longo's 97 article Carnap's remarks on Impredicative Definitions and the Genericity Theorem. For a more technical study of System F failure in presence of non-parametric constructs, see Parametricity and variants of Girard's J operator by Robert Harper and John Mitchell (1999). Finally, for a description, from a language design point of view, on how to abandon global parametricity to introduce non-parametric constructs but still be able to locally discuss parametricity, see Non-Parametric Parametricity by Georg Neis, Derek Dreyer and Andreas Rossberg, 2011.

This discussion of the difference between "computational abstraction" and "uniform abstraction" has been revived by the large amount of work on representing variable binders. A binding construction feels like an abstraction (and can be modeled by a lambda-abstraction in HOAS style) but has a uniform structure that makes it more like a data skeleton than a family of results. This has been much discussed, for example in the LF community, "representational arrows" in Twelf, "positive arrows" in Licata&Harper's work, etc.

Recently there have been several people working on the related notion of "irrelevance" (lambda-abstractions where the result "does not depend" on the argument), but it's still not totally clear how closely this is related to parametric polymorphism. One example is the work of Nathan Mishra-Linger with Tim Sheard (eg. Erasure and Polymorphism in Pure Type Systems).

gasche
  • Yes! That is exactly what I mean. In ML a type `forall a. a -> a` means that there is a *single* value that has both type `int -> int`, `string -> string`, etc. That is what made me confused and still feels a little dirty about modeling that as a function that takes a type and returns the same value every time. The only thing that guarantees that it's the same value is the absence of certain features from ML. Having an explicit way of saying "this *single* value has *all* the types `a -> a`" seems cleaner (e.g. the intersection of `a -> a` where `a` ranges over types). But this doesn't exist? – Jules Apr 08 '12 at 19:08
  • "The only thing that guarantees that it's the same value is the absence of certain features from ML." I don't think this is an accurate description, or rather I don't think that this is a *useful* description. You can always lose a good property by messing things up, it does not mean that the property is not well-justified. System F is at the edge of inconsistency because it is very powerful logically and computationally, so it is to be expected that small changes break the system. – gasche Apr 08 '12 at 22:04
  • You have the impression that System F parametricity is not robust because you have a syntactic point of view. If you instead studied the system as a Curry-style system complemented with type derivations -- untyped terms plus proofs of well-typing -- the type-erasability property would jump in your face. Similarly if you defined typings semantically, using for example a logical relation model (see for example neelk's [Adding Equations to System F Types](www.cs.cmu.edu/~neelk/esop12.pdf)), you would define `forall` as an intersection. Yet it's the exact same language you would be talking about. – gasche Apr 08 '12 at 22:09
  • Right, I have no doubts about System F on its own, as I see that no construct depends on the types. However, you do not always want parametricity for everything. It is often useful to write things that *do* depend on a type parameter. On the other hand, you do not want to lose parametricity for functions like id, map, etc. I guess my question is: how do you combine parametricity and parametrization? Part of this is: if a function like id does not depend on the type, then why is it a function of said type in the first place? – Jules Apr 08 '12 at 22:28
  • Part of the motivation does not come from types, but from contracts. Values are easily parametrized by contracts, as contracts are values. However, if you write `id` in this way: `id c x = x`, and you apply the dependent contract `(c:Contract) -> (c -> c)` to it to make sure that it behaves correctly, this does not give you parametricity. Rather than relying on the language not being able to dispatch on types/contracts, some other construct is needed that lets you explicitly say "I want parametricity here". There is a paper on parametric contracts, but the same reasons to dislike it apply. – Jules Apr 08 '12 at 22:39
  • [Why it does not satisfy parametricity: `id c x = if c==Int then x+1 else x` is a valid value for that contract] – Jules Apr 08 '12 at 22:42
  • So I was hoping that there was something like this in the type theory literature that can be transposed to contracts. – Jules Apr 08 '12 at 22:51
  • @Jules, I disagree with your idea that you want to lose parametricity in the manner you describe. Type-erasability is a good property that you want to preserve; it can be abandoned for some reasons (for example type-classes), but you want to explain those constructs by translating them to an evidence-passing language with type erasure again. But you should have a look at Neis, Dreyer and Rossberg's [Non-Parametric Parametricity](http://www.mpi-sws.org/~dreyer/papers/npp/jfp.pdf). I'll add this reference, and more goodness from Andreas, in my answer. – gasche Apr 09 '12 at 07:24
  • Regarding contracts, I think that you still want them to not affect the computation they're guarding. You need to formalize a notion of "contract erasability" and show that *none* of the contracts affect the returned result (besides precipitating contract-failure which is distinct from all other computation results). `choose c x y = if c x then x else y` should be disallowed if `c` has a contract type. That's not only an opinion on the type system, but on your language design. Now you *could* use type-system features to control erasability of a contract, but I claim that you *always* want it. – gasche Apr 09 '12 at 07:27
  • With contracts that's a little difficult because you do want to be able to use contracts. Another problem is that if you apply the literal translation of parametric types to contracts then some errors will not be caught. For example if you write id c x = x+1 with dependent contract (c:Contract) -> (c -> c) then if you only use id Int then you will never discover that id was not parametric (though you will discover this if you use for example id NegativeInt (-1)). Another problem is that with contracts you will have to explicitly pass in the contracts to the functions, since you can't infer it. – Jules Apr 10 '12 at 18:25
  • So I think there is still something to be discovered about parametric contracts. Thanks for your help. The links you provided were very helpful, especially Andreas' work on non parametric parametricity, because a similar sealing mechanism can be applied to contracts. – Jules Apr 10 '12 at 18:25
  • I'm not convinced by your example: the contract `(c : Contract) -> c -> c` does, I think, enforce parametricity. Your `id Int` example only says that dynamically checking a single instance of this contract does not tell you if the contract was respected (hence you could pass a non-parametric function unnoticed). That's an orthogonal question, dynamic or static checking of contract satisfaction, and this already happens without polymorphism considerations: you could define a function at contract `True -> False` if you don't use it. – gasche Apr 11 '12 at 04:33
  • Well, with other contracts you will always get either a run-time error or your program will run as if the value was valid for the contracts. In other words you can never observe an invalid value passing a contract. For example if your function passes the contract `int -> string`, then you can never observe it returning something that's not a string. However with the `(c : Contract) -> c -> c` applied to the evil id function, you *can* observe non-parametricity. So you lose the essential property of contracts. – Jules Apr 11 '12 at 18:08
  • While you can fix that with dynamic sealing, that does not solve the problem that it's simply ugly to pass in contracts to every polymorphic value. – Jules Apr 11 '12 at 18:16
  • Thanks for the "observing invalid" argument; indeed you consider parametricity an inherent part of a "parametric contract", and you are right it should fail directly. We should look into contract inference. – gasche Apr 11 '12 at 20:13
7

if forall is lambda ..., then what is exists

Why, tuple of course!

In Martin-Löf type theory you have Π types, corresponding to functions/universal quantification and Σ-types, corresponding to tuples/existential quantification.

Their types are very similar to what you have proposed (I am using Agda notation here):

Π : (A : Set) -> (A -> Set) -> Set
Σ : (A : Set) -> (A -> Set) -> Set

Indeed, Π is an infinite product and Σ is an infinite sum. Note that they are not "intersection" and "union" as you proposed, though, because you can't define those without additionally specifying where the types intersect (that is, which values of one type correspond to which values of the other type).

From these two type constructors you can have all of normal, polymorphic and dependent functions, normal and dependent tuples, as well as existentially and universally-quantified statements:

-- Normal function, corresponding to "Integer -> Integer" in Haskell
factorial : Π ℕ (λ _ → ℕ)

-- Polymorphic function corresponding to "forall a . a -> a"
id : Π Set (λ A → Π A (λ _ → A))

-- A universally-quantified logical statement: all natural numbers n are equal to themselves
refl : Π ℕ (λ n → n ≡ n)


-- (Integer, Integer)
twoNats : Σ ℕ (λ _ → ℕ)

-- exists a. Show a => a
someShowable : Σ Set (λ A → Σ A (λ _ → Showable A))

-- There are prime numbers
aPrime : Σ ℕ IsPrime

However, this does not address parametricity at all and AFAIK parametricity and Martin-Löf type theory are independent.

For parametricity, people usually refer to Philip Wadler's work.

Rotsor
  • Thanks! Another excellent answer. Indeed, as you say, the infinite sums and products are different in exactly the way I was asking about, namely that they result in non-parametricity. They illustrate my concern with forall in ML precisely, because they are a generalization that *does* let you create non-parametricity if you can dispatch on the types. The sum and product look like very general and beautiful ideas, I'll read about them some more :) – Jules Apr 08 '12 at 18:59