Can we have type variables in constructor position in the Hindley Milner type system?

Question

In Haskell we can write the following data type:

data Fix f = Fix { unFix :: f (Fix f) }

The type variable f has the kind * -> * (i.e. it is an unknown type constructor). Hence, Fix has the kind (* -> *) -> *. I was wondering whether Fix was a valid type constructor in the Hindley Milner type system.
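To make the kinds concrete, here is a minimal sketch of how Fix is typically used in GHC Haskell. ListF, IntList, fromList, cata and sumList are hypothetical names introduced only for this illustration; they are not part of the question.

```haskell
-- The definition from the question. GHC infers f :: * -> *,
-- so Fix :: (* -> *) -> *.
data Fix f = Fix { unFix :: f (Fix f) }

-- Hypothetical base functor for lists of Int; ListF :: * -> *.
data ListF r = NilF | ConsF Int r

instance Functor ListF where
  fmap _ NilF        = NilF
  fmap g (ConsF x r) = ConsF x (g r)

-- Applying Fix to ListF yields an ordinary type of kind *.
type IntList = Fix ListF

-- Build a fixed-point list from a built-in list.
fromList :: [Int] -> IntList
fromList = foldr (\x r -> Fix (ConsF x r)) (Fix NilF)

-- Generic fold over Fix f for any Functor f.
cata :: Functor f => (f a -> a) -> Fix f -> a
cata alg = alg . fmap (cata alg) . unFix

-- Sum the elements by folding one ListF layer at a time.
sumList :: IntList -> Int
sumList = cata alg
  where
    alg NilF        = 0
    alg (ConsF x s) = x + s
```

Here f is instantiated with the constructor ListF, never with a type-level function, which is the situation the question asks about.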

From what I read on Wikipedia, it seems that Fix is not a valid type constructor in the Hindley Milner type system because all type variables in HM must be concrete (i.e. must have the kind *). Is this indeed the case? If type variables in HM were not always concrete then would HM become undecidable?

score 6 · Accepted Answer · edited May 06 '16 at 19:09

What matters is whether type constructors form a first-order term language (no reduction behavior of type constructor expressions) or a higher-order one (with lambdas or similar constructs at type level).

In the former case, constraints arising from Fix are always unifiable in a most general way (assuming we stick to HM). In each c a b ~ t equation, t must be resolved to a concrete type application expression with the same shape as c a b, since c a b cannot possibly reduce to some other expression. Higher-kinded parameters aren't a problem, since they too just sit there in a static manner, for instance c [] ~ c f is solved by f = [].

In the latter case, c a b ~ t may or may not be solvable. In some cases it's solved by c = \a b -> t, in other cases there's no most general unifier.
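The loss of most general unifiers can be illustrated with the problem c Int ~ Int once type-level lambdas are allowed. The sketch below uses a hypothetical miniature type language (Ty, subst, normalize, solves are names made up for this example) and only checks that two different closed terms both solve the equation; it is not a unification algorithm.

```haskell
-- Hypothetical type-level lambda terms, just enough to state the example.
data Ty
  = TInt
  | TVar String
  | TLam String Ty
  | TApp Ty Ty
  deriving (Eq, Show)

-- Replace variable x by t. This is NOT capture-avoiding; it is only safe
-- here because the example substitutes closed terms.
subst :: String -> Ty -> Ty -> Ty
subst _ _ TInt = TInt
subst x t (TVar y)
  | x == y    = t
  | otherwise = TVar y
subst x t (TLam y b)
  | x == y    = TLam y b
  | otherwise = TLam y (subst x t b)
subst x t (TApp f a) = TApp (subst x t f) (subst x t a)

-- Beta-normalise a term (may loop on ill-kinded input; fine for the example).
normalize :: Ty -> Ty
normalize (TApp f a) = case normalize f of
  TLam x b -> normalize (subst x a b)
  f'       -> TApp f' (normalize a)
normalize (TLam x b) = TLam x (normalize b)
normalize t = t

-- Two candidate solutions for c.
identity, constant :: Ty
identity = TLam "a" (TVar "a")   -- \a -> a
constant = TLam "a" TInt         -- \a -> Int

-- Does the candidate solve  c Int ~ Int ?
solves :: Ty -> Bool
solves candidate =
  normalize (subst "c" candidate (TApp (TVar "c") TInt)) == TInt
```

Both solves identity and solves constant evaluate to True, yet the two terms are distinct and closed, so neither is a substitution instance of the other.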

chi · Answer 2 · 2016-05-06T14:40:38.227

2

Higher kinds go beyond the basic Hindley-Milner type system, but they can be handled in the same way.

Very roughly, HM parses the syntax tree of an expression, associates a free type variable to every subexpression, and generates a set of equational constraints over type-terms involving type variables according to the typing rules. The same can be done using higher kinds.

Then, the constraints are solved through unification. A typical step in the unification algorithm is (pseudo-Haskell follows):

(F t1 ... tn := G s1 ... sk) =
  | n/=k || F/=G  -> fail
  | otherwise     -> { t1 := s1 , ... , tn := sn }

(Note that this is only a part of the unification algorithm.)
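The step above can be sketched as part of a small first-order unifier. This is an illustration under assumptions, not the answer's own code: the Ty representation, the association-list substitution, and the names apply, occurs, unify and unifyAll are all hypothetical. Constructors are always fully applied, so a mismatch in name or arity fails.

```haskell
-- Hypothetical first-order type terms: a variable, or a constructor symbol
-- applied to all of its arguments (Int is TCon "Int" []).
data Ty = TVar String | TCon String [Ty]
  deriving (Eq, Show)

-- Substitution as an association list from variable names to types.
type Subst = [(String, Ty)]

-- Resolve all bound variables in a type.
apply :: Subst -> Ty -> Ty
apply s (TVar v)    = maybe (TVar v) (apply s) (lookup v s)
apply s (TCon c ts) = TCon c (map (apply s) ts)

-- Occurs check: does variable v appear in the type?
occurs :: String -> Ty -> Bool
occurs v (TVar w)    = v == w
occurs v (TCon _ ts) = any (occurs v) ts

-- Unify two types, extending the given substitution.
unify :: Ty -> Ty -> Subst -> Maybe Subst
unify a b s = go (apply s a) (apply s b)
  where
    go (TVar v) t
      | t == TVar v = Just s
      | occurs v t  = Nothing
      | otherwise   = Just ((v, t) : s)
    go t (TVar v) = go (TVar v) t
    -- The step from the text: same symbol and same arity, then
    -- unify the arguments pairwise.
    go (TCon f ts) (TCon g us)
      | f /= g || length ts /= length us = Nothing
      | otherwise                        = unifyAll (zip ts us) s

-- Unify a list of equations left to right, threading the substitution.
unifyAll :: [(Ty, Ty)] -> Subst -> Maybe Subst
unifyAll []            s = Just s
unifyAll ((a, b) : cs) s = unify a b s >>= unifyAll cs
```

For example, unifying a -> Int with Bool -> b (encoded with a binary constructor named "->") binds a to Bool and b to Int, while List a against Maybe a fails because the constructor symbols differ.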

Above, F and G are type constructor symbols, not variables. In higher-kinded unification, we need to account for F and G being variables as well. We could try the following rule:

(f t1 ... tn := g s1 ... sk) =
  | n/=k          -> fail
  | otherwise     -> { f := g , t1 := s1 , ... , tn := sn }

But wait! The above is not correct, since e.g. f Int ~ Either Bool Int must unify when f ~ Either Bool. So, we need to also consider the case where n/=k. In general, a simple rule set is

(f t := g s) =
  { f := g , t := s }
(F := G) =      -- rule for atomic terms
  | F /= G    -> fail
  | otherwise -> {}

(Again, this is only a part of the unification algorithm. Other cases must also be handled, as Andreas Rossberg points out below.)
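The curried rule set above can be sketched as a runnable unifier in which every application is binary and variables may stand for constructors. As before, this is a hedged illustration: Ty, Subst, apply, occurs and unify are hypothetical names, the terms are assumed to be well-kinded (no kind checking is done), and there are no type-level lambdas or synonyms, so the problem stays first-order.

```haskell
-- Hypothetical type terms in curried form.
data Ty
  = TVar String   -- unification variable of any kind, e.g. f
  | TCon String   -- atomic constructor constant, e.g. Either, Int
  | TApp Ty Ty    -- binary application, e.g. TApp (TCon "Maybe") (TCon "Int")
  deriving (Eq, Show)

type Subst = [(String, Ty)]

-- Resolve all bound variables in a type.
apply :: Subst -> Ty -> Ty
apply s (TVar v)   = maybe (TVar v) (apply s) (lookup v s)
apply _ (TCon c)   = TCon c
apply s (TApp f a) = TApp (apply s f) (apply s a)

-- Occurs check.
occurs :: String -> Ty -> Bool
occurs v (TVar w)   = v == w
occurs _ (TCon _)   = False
occurs v (TApp f a) = occurs v f || occurs v a

unify :: Ty -> Ty -> Subst -> Maybe Subst
unify a b s = go (apply s a) (apply s b)
  where
    go (TVar v) t
      | t == TVar v = Just s
      | occurs v t  = Nothing
      | otherwise   = Just ((v, t) : s)
    go t (TVar v) = go (TVar v) t
    -- Rule for atomic terms: equal constants unify, others fail.
    go (TCon c) (TCon d)
      | c == d    = Just s
      | otherwise = Nothing
    -- Rule for applications: unify heads, then arguments.
    go (TApp f x) (TApp g y) = unify f g s >>= unify x y
    -- A constant never unifies with an application.
    go _ _ = Nothing
```

With these definitions, unify (TApp (TVar "f") (TCon "Int")) (TApp (TApp (TCon "Either") (TCon "Bool")) (TCon "Int")) [] binds f to Either Bool, while c Int against Int fails, because without type-level functions an application can never equal a constant.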

edited May 06 '16 at 14:40

answered May 06 '16 at 07:50

chi


This answer is misleading, I think. Higher-order unification is more complicated than that. For example, I don't see your rules handling base cases like `(F := g t)`. In general, there is no unique solution, nor is the problem decidable. You need to severely restrict the problem space to avoid that (which Haskell does, e.g. by constraining the use of type synonyms). – Andreas Rossberg May 06 '16 at 12:47

@AndreasRossberg I explicitly avoided mentioning higher-order unification in my answer, since higher kinds do not require it. Also, the above unification rule is not meant to provide a complete algorithm, but only a step ("A typical step in the unification algorithm is..."). I will try to make these points clear, to prevent confusion. – chi May 06 '16 at 14:38
Well, in general, higher kinds _do_ require higher-order unification. So the answer should probably clarify what restrictions you assume on the language. – Andreas Rossberg May 06 '16 at 15:38
@AndreasRossberg I cannot help but think that you're wrong. Firstly, type synonyms are entirely unrelated to higher-kinded type variables. You can have a language with type synonyms but without higher-kinded type variables or (worse!) one with higher-kinded type variables and no type synonyms. (Worst is Java, which has no type synonyms and only *-kinded type variables.) Also, AFAIK, type synonyms are resolved long before type checking in GHC, so they don't matter in type/kind-checking at all. – Ingo May 06 '16 at 18:11
@Ingo, consider the type definitions `type T a = a; type U a = Int`, and the unification problem `(c Int == Int)`. Both `T` and `U` would be solutions for `c` in an unconstrained language. But Haskell requires type synonym constructors to always be fully applied(!), and furthermore, does not consider them (or more generally, type lambdas) further during unification, which are the kind of restrictions I was mentioning. – Andreas Rossberg May 06 '16 at 18:48
@AndreasRossberg, there is no such thing as an unconstrained language. In Haskell `T` and `U` are not types, so naturally they are not possible values for the type variable `c`. A language lacking `\a -> a` as a type is no more a "restriction" than lacking dependent types or linear types or higher inductive types or any other type system feature. – Reid Barton May 06 '16 at 21:46
@AndreasRossberg As I said before, type aliases don't exist anymore at type checking time. Hence, if such a unification problem arises in type checking, it is a clear indication of a type error. It can arise, for example, with something like `u :: c Int -> c Bool; u = undefined; u 42`, but type synonyms are never consulted for a possible solution. (I'm not so sure about type families, however. But this is beyond HM anyway.) – Ingo May 06 '16 at 21:59
@Ingo, that the language defines synonyms to "not exist anymore" or "not be types" is just another way of saying that the language severely restricts type lambdas, i.e., higher kinds. At higher kind it only allows _nominal_ types (i.e., constants) to instantiate variables. Compare that to the full generality of System F-omega (the lambda calculus with higher kinds), where such restrictions do not exist. – Andreas Rossberg May 06 '16 at 23:20
@ReidBarton, I disagree with the analogy. Haskell _does_ introduce higher kinds and type lambdas (in the form of synonyms), but makes them second-class citizens. Again, compare with F-omega. The fact that you can partially apply _nominal_ type constructors but not _structural_ ones is clearly a restriction. And its whole purpose is to keep type inference decidable. – Andreas Rossberg May 06 '16 at 23:24
@AndreasRossberg Yes, I understand, but the question was not about System F, but specifically about HM. Now the answer is completely adequate in the HM context, and yet you call it misleading because "higher order unification" (that doesn't happen in HM, for good reason) "is more complicated". Makes no sense to me. – Ingo May 07 '16 at 08:10
@Ingo, the point is: you cannot just generalise HM to "higher kinds" and expect it to work. You have to be very careful _how_ and to what extent to add them. For example, restrict them to nominal types. Only then does higher-order unification "not happen"; there is no automatism. – Andreas Rossberg May 07 '16 at 11:59
