I'm trying to understand the rules around type inference because I'd like to incorporate it into my own language. In that spirit I've been playing around with F#'s type inference, and the following struck me as odd.

This compiles, and id is 'a -> 'a, which (if I am not mistaken) means that every invocation uses a "fresh" type.

let id x = x

let id1 = id 1
let id2 = id "two"

But when using an operator, it seems that the first invocation determines the signature of that function going forward.

Here, mul is reported as being int -> int -> int:

let mul x y = x * y

let mul1 = mul 1 2
let mul2 = mul 1.1 2.2 // fails here

If I reorder them, then mul is float -> float -> float:

let mul x y = x * y

let mul2 = mul 1.1 2.2
let mul1 = mul 1 2 // fails here

Could you explain (preferably in non-academic terms) what the rules are, and perhaps how this works from the perspective of the type-checking implementation? Does it walk over the functions to check their types every time they are referenced? Or is there some other approach?

Jeff
  • Possible duplicate of [Hindley Milner Type Inference in F#](https://stackoverflow.com/questions/8396582/hindley-milner-type-inference-in-f) – glennsl May 25 '19 at 12:26
  • @glennsl I read that one before asking and felt the question and answer was quite a bit more intricate than what I am asking. – Jeff May 25 '19 at 13:06
  • Yes. This kind of question is asked frequently, but the link provided is only related to the title of the question. See for instance this: https://stackoverflow.com/questions/6285493/type-of-addition-in-f – Gus May 25 '19 at 16:36

2 Answers

First note that this will not happen if we declare mul as an inline function:

let inline mul x y = x * y

let mul1 = mul 1 2  // works
let mul2 = mul 1.1 2.2 // also works

Here the inferred type of mul will be as follows:

x: ^a -> y: ^b ->  ^c
    when ( ^a or  ^b) : (static member ( * ) :  ^a *  ^b ->  ^c)

This type means that the parameters x and y can have any types (they don't even have to be the same type) as long as at least one of those types has a static member named * that takes arguments of the same types as x and y. The return type of mul will be the same as that of the * member.
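
To see the member constraint in action, here is a small sketch. Vec2 is a hypothetical type introduced only for this example; it defines a static * member that takes a Vec2 and a float, so the two arguments of mul have different types.

```fsharp
// Hypothetical record type, not part of the question: it provides
// a static ( * ) member taking a Vec2 and a float.
type Vec2 =
    { X: float; Y: float }
    static member ( * ) (v: Vec2, k: float) = { X = v.X * k; Y = v.Y * k }

// Same definition as in the answer above.
let inline mul x y = x * y

let a = mul 2 3                         // resolved to int * int
let b = mul 1.5 2.0                     // resolved to float * float
let c = mul { X = 1.0; Y = 2.0 } 2.0    // resolved to Vec2.( * ), returns a Vec2
```

Each call is resolved separately against the concrete argument types at that call site, which is why all three can coexist.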

So why don't you get the same behavior when mul isn't inline? Because member constraints (i.e. type constraints that say that a type must have specific members) are only allowed on inline functions. That's also why the type variables have a ^ in front instead of the usual ': it signifies that we're dealing with a different, less limited kind of type variable (a statically resolved type parameter).

So why does this limitation on non-inline functions exist? Because of what .NET supports. Type constraints like "T implements the interface I" are expressible in .NET bytecode and thus allowed in all functions. Type constraints like "T must have a specific member named X with type U" are not expressible and therefore not allowed on ordinary functions. Calls to inline functions are expanded at the call site, where the concrete types are known, so their type does not need to be expressible in .NET bytecode and the limitation doesn't apply to them.

sepp2k
  • I didn't even know about inline functions. I am not an F# dev, so my experience with it is limited to some experimentation with the type system. Does this mean that inline functions use structural typing for their parameters? Do you have any ideas as to how the constraint collection and solving is implemented in broad strokes? – Jeff May 25 '19 at 14:33
  • @Jeff It means they *can* use member constraints. This isn't quite as general as structural typing though: Something like `let inline f x = x.o` will still complain that `x` needs an explicit type before you can access members on it (to be honest, I'm not sure exactly what the limitations of member constraints are; I've only ever seen them used with operators like `+`, `*` etc.). – sepp2k May 25 '19 at 14:45
  • In my own implementation, unification is pretty simple ("if left is a function and right is a function, unify their parameters and return type"), but I haven't yet figured out how to define a function that uses operators like `+-/*` on its parameters and have those solved whenever the type variables are replaced, because by the time they've been replaced the analyzer has already analyzed the function body. Do you have any tips on how the type checking for such constraints is done? – Jeff May 25 '19 at 14:51
  • @Jeff I'm not quite sure I follow. Knowing which concrete types will eventually be substituted for `^a` and `^b` isn't necessary to typecheck the function body. You "simply" add the constraint "there must be an `*` operator for ^a and ^b" when you see the `*` operator being used on values of types `^a` and `^b`. PS If your own language doesn't heavily feature OOP, I'd suggest maybe looking more in the direction of Haskell's type classes than how F# does it. ... – sepp2k May 25 '19 at 15:00
  • ... F# works the way it does (with operators being static members) because that's how .NET works. Without that constraint, other designs make more sense for a functional language. – sepp2k May 25 '19 at 15:00
  • my current (broken) approach is whenever I walk over a BinaryExpression, I prune both operand types and check whether they are compatible based on the operator (no operator overloading yet), but if one (or both of) the types are type variables, then I don't know what to do in that case and so I just return a "not compatible" error right now. So I guess I am trying to solve the constraint immediately whereas what you seem to be suggesting is to solve them later somehow? – Jeff May 25 '19 at 15:06
  • All the implementation samples I've found don't cover this. – Jeff May 25 '19 at 15:08
  • @Jeff Again, I'm not sure I follow. If you don't have operator overloading, that means that something like `*` has a fixed type, such as `*: Int -> Int -> Int`, right? Then an application of the `*` operator would not be any different than an application of any other function with that type and would simply result in the given type variables being assigned the type `Int`. – sepp2k May 25 '19 at 15:27
  • Sorry, I should've been more specific. For example, in my language, I want to allow `string + string = string`, and `number + number = number`, *as well as* `string + number = string`, but if I don't know both types up front, then I can't make that check. – Jeff May 25 '19 at 15:29
  • @Jeff Once you do support overloading (be it via type classes à la Haskell or members à la F#), seeing `*` should add the given typeclass/member constraint to the list of constraints on the involved type variables. Then when a concrete type is assigned to the type variable, you check that it implements all the typeclasses/has all the members, or else you produce a type error. If the type variable is never assigned (i.e. if you infer a polymorphic/generic type), the constraints become part of the inferred type (e.g. `Num a => a -> a`) and then you make sure ... – sepp2k May 25 '19 at 15:32
  • ... that they're satisfied when the function is called. – sepp2k May 25 '19 at 15:32
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/193926/discussion-between-sepp2k-and-jeff). – sepp2k May 25 '19 at 15:33
  • Marking this as the answer because `inline` was the reason that the 2nd example didn't work. – Jeff May 25 '19 at 17:51

This aspect of F# type inference is not particularly academically elegant, but it works great in practice. The way the F# type inference works is that the compiler initially treats everything as a type variable (generic type) and collects constraints on those. It then tries to solve those constraints.

For example, if you have:

let callWithTen f = f 10   

Then initially, the compiler assigns types such that callWithTen has type 'a and f has type 'b. It also collects the following constraints:

  • 'a = 'a0 -> 'a1 because callWithTen is syntactically defined as a function
  • 'a0 = 'b because the variable f is the argument of the function
  • 'b = 'b0 -> 'b1 because the variable f is used as a function
  • 'b0 = int because the argument of f is an int.
  • 'b1 = 'a1 because the result of calling f is the result of callWithTen.

Solving these constraints, the compiler then infers that callWithTen has the type (int -> 'b1) -> 'b1.
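
The five constraints above can be solved mechanically by unification. The following is a minimal sketch of that step under simplifying assumptions, not a description of how the F# compiler is actually implemented; Ty, apply, occurs and unify are hypothetical names for a toy type representation.

```fsharp
// Toy type representation (hypothetical, for illustration only).
type Ty =
    | TVar of string      // a type variable such as 'a
    | TInt                // the concrete type int
    | TFun of Ty * Ty     // a function type: argument -> result

// A substitution maps type-variable names to the types they were solved to.
type Subst = Map<string, Ty>

// Replace solved variables inside a type, following chains of bindings.
let rec apply (s: Subst) (ty: Ty) : Ty =
    match ty with
    | TVar v -> Map.tryFind v s |> Option.map (apply s) |> Option.defaultValue ty
    | TInt -> TInt
    | TFun (a, r) -> TFun (apply s a, apply s r)

// True if variable v occurs inside ty (this rules out infinite types).
let rec occurs (v: string) (ty: Ty) : bool =
    match ty with
    | TVar w -> v = w
    | TInt -> false
    | TFun (a, r) -> occurs v a || occurs v r

// Make two types equal by extending the substitution, or fail.
let rec unify (s: Subst) (t1: Ty) (t2: Ty) : Subst =
    match apply s t1, apply s t2 with
    | TVar a, TVar b when a = b -> s
    | TVar a, t | t, TVar a ->
        if occurs a t then failwithf "infinite type involving '%s" a
        else Map.add a t s
    | TInt, TInt -> s
    | TFun (a1, r1), TFun (a2, r2) -> unify (unify s a1 a2) r1 r2
    | x, y -> failwithf "cannot unify %A with %A" x y

// The five constraints from the list above, as pairs of types that must be equal.
let constraints =
    [ TVar "a", TFun (TVar "a0", TVar "a1")    // callWithTen is a function
      TVar "a0", TVar "b"                      // f is its argument
      TVar "b", TFun (TVar "b0", TVar "b1")    // f is used as a function
      TVar "b0", TInt                          // f is applied to an int
      TVar "b1", TVar "a1" ]                   // result of f is the result of callWithTen

let solved = constraints |> List.fold (fun s (l, r) -> unify s l r) Map.empty

// The type of callWithTen after solving: (int -> 'a1) -> 'a1
let callWithTenType = apply solved (TVar "a")
```

The sketch ends up naming the remaining variable 'a1 rather than 'b1, but the last constraint makes the two equal, so this is the same type as (int -> 'b1) -> 'b1.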

When you have + in your code, you cannot quite decide what exactly the numerical type is. Some other ML languages solve this by having + for integers and +. for floating-point numbers, but this is very ugly, so F# takes a different approach, which is somewhat ad-hoc.

As far as I know, F# has a constraint along the lines of "'a supports (+)", and likewise for the other operators. So, what happens in your case (in a slightly simplified description) is that mul is a function 'a0 -> 'a0 -> 'a0 where 'a0 supports (*).

When processing the rest of the code, the compiler also collects the constraints 'a0 = int (from the first call) and 'a0 = float (from the second call). It resolves the first one, which is fine (because int supports *), but then it fails on the second constraint because int != float, and it reports an error there.
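
The simplified description above can be modelled in a few lines. This is only a toy sketch under the assumption that a pending "supports (*)" constraint is stored next to the type variable and checked when the variable is solved; it is not the F# compiler's actual algorithm, and Ty, State, supportsMul and bind are hypothetical names.

```fsharp
// Concrete types in the toy model (hypothetical).
type Ty =
    | TInt
    | TFloat
    | TString

// Assumed table of the concrete types that provide a ( * ) operator.
let supportsMul (ty: Ty) : bool =
    match ty with
    | TInt | TFloat -> true
    | TString -> false

// Solver state: variables that are already solved, plus variables that
// still carry a pending "supports ( * )" constraint.
type State =
    { Bindings: Map<string, Ty>
      NeedsMul: Set<string> }

// Record the equation 'v = ty, checking it against what is already known.
let bind (v: string) (ty: Ty) (st: State) : Result<State, string> =
    match Map.tryFind v st.Bindings with
    | Some existing when existing <> ty ->
        Error (sprintf "'%s is already %A, cannot also be %A" v existing ty)
    | Some _ -> Ok st
    | None ->
        if Set.contains v st.NeedsMul && not (supportsMul ty) then
            Error (sprintf "%A does not support ( * )" ty)
        else
            Ok { st with Bindings = Map.add v ty st.Bindings }

// State after checking the body of mul: 'a0 is unsolved but must support ( * ).
let afterBody = { Bindings = Map.empty; NeedsMul = Set.ofList [ "a0" ] }

// mul 1 2 adds 'a0 = int, which succeeds because int supports ( * ).
let firstCall = bind "a0" TInt afterBody

// mul 1.1 2.2 adds 'a0 = float, which fails because 'a0 is already int.
let secondCall = firstCall |> Result.bind (bind "a0" TFloat)
```

Swapping the order of the two calls would bind 'a0 to float first and make the int call fail instead, which mirrors the reordering behaviour in the question.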

Tomas Petricek
  • So constraints are collected *once* per function and then solved (and cached) *once* on the first unification? I'm trying to figure out how it's implemented. :D – Jeff May 25 '19 at 14:31
  • Generally, generalisation happens at the function level (so further constraints will not affect the type of the function), but I think the case with numerical type constraints is an exception: I think the compiler still works with a generic function (with a `+` constraint) and resolves the constraint based on how the function is called later. – Tomas Petricek May 25 '19 at 15:51
  • To be honest, if I were trying to implement a language with type inference, I would probably ignore this part of the problem, at least initially, because that's where F# does a bit of an ugly hack to make the inference work nicely in practice. – Tomas Petricek May 25 '19 at 15:52
  • "resolves the constraint based on how the function is called later." Do you have any tips on how this is implemented in practice? – Jeff May 25 '19 at 16:10