I'm making a strongly typed toy functional programming language. It uses the Hindley-Milner algorithm for type inference.

While implementing the algorithm, I ran into a question about how to infer the types of mutually recursive functions.

let rec f n = if n == 0 then 0 else g (n - 1)
let rec g n = if n == 0 then 0 else f (n - 1)

f and g are mutually recursive functions. When the type checker infers the type of f, it also needs the type of g, since g is called in the body of f.

But at that moment g is not defined yet, so the type checker knows neither that g exists nor what its type is.

What solutions do real-world compilers and interpreters use?

suhdonghwi
  • This problem (family of problems?) in general is called [unification](https://en.wikipedia.org/wiki/Unification_(computer_science)), and there are several known algorithms for solving it. This applies to both types and logical formulas because the two [are quite similar](https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspondence) – Cubic Mar 06 '18 at 16:19
  • @Cubic I didn't know this problem was related to the unification algorithm. Actually, I have already implemented a `unify` function, maybe without deeply understanding it. – suhdonghwi Mar 06 '18 at 16:31

1 Answer

In OCaml, mutually recursive definitions are joined with the keyword `and` rather than introduced by a second `let rec`. When the type checker reaches a recursive definition, it first adds all of the recursive names to the environment and then continues pretty much as usual.
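
As a small illustration, here are the two functions from the question written as a single recursive group. The only changes are the `and` keyword and OCaml's `=` for integer comparison (the question's `==` belongs to the asker's own language).

```ocaml
(* Both names are in scope in both bodies because they form one group. *)
let rec f n = if n = 0 then 0 else g (n - 1)
and g n = if n = 0 then 0 else f (n - 1)

(* OCaml infers int -> int for both functions. *)
let () = Printf.printf "%d %d\n" (f 3) (g 4)
```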

UPDATE (thanks to K.A. Buhr):

It is entirely possible to give each name a new variable of type 'a (with 'a being fresh) and unify it later with the type inferred for its body. Be sure to generalize your variable at the right place (usually, after the definition).
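
The fresh-variable approach can be sketched roughly as follows. This is a minimal sketch, not how any particular compiler does it: the tiny expression type, `infer_group`, `If0`, `Sub1` and every other name are made up for illustration, and generalization is only marked by a comment, so the result is monomorphic.

```ocaml
(* Hypothetical mini-language; all names are illustrative assumptions. *)
type ty =
  | TInt
  | TArrow of ty * ty
  | TVar of tvar ref
and tvar = Unbound of int | Link of ty

type expr =
  | Int of int
  | Var of string
  | Lam of string * expr
  | App of expr * expr
  | If0 of expr * expr * expr  (* if c = 0 then a else b *)
  | Sub1 of expr               (* e - 1 *)

let counter = ref 0
let fresh () = incr counter; TVar (ref (Unbound !counter))

(* Follow links until reaching a concrete type or an unbound variable. *)
let rec repr t = match t with
  | TVar { contents = Link t' } -> repr t'
  | _ -> t

let rec occurs r t = match repr t with
  | TVar r' -> r == r'
  | TArrow (a, b) -> occurs r a || occurs r b
  | TInt -> false

let rec unify a b =
  match repr a, repr b with
  | TInt, TInt -> ()
  | TVar r, TVar r' when r == r' -> ()
  | TVar r, t | t, TVar r ->
    if occurs r t then failwith "occurs check" else r := Link t
  | TArrow (a1, b1), TArrow (a2, b2) -> unify a1 a2; unify b1 b2
  | _ -> failwith "type mismatch"

let rec infer env e =
  match e with
  | Int _ -> TInt
  | Var x -> List.assoc x env
  | Lam (x, body) ->
    let a = fresh () in
    TArrow (a, infer ((x, a) :: env) body)
  | App (fn, arg) ->
    let res = fresh () in
    unify (infer env fn) (TArrow (infer env arg, res));
    res
  | If0 (c, a, b) ->
    unify (infer env c) TInt;
    let ta = infer env a in
    unify ta (infer env b);
    ta
  | Sub1 e1 -> unify (infer env e1) TInt; TInt

let infer_group env defs =
  (* Step 1: bind every name in the group to a fresh type variable. *)
  let env' = List.map (fun (name, _) -> (name, fresh ())) defs @ env in
  (* Step 2: infer each body and unify it with its placeholder. *)
  List.iter
    (fun (name, body) -> unify (List.assoc name env') (infer env' body))
    defs;
  (* Step 3: a full implementation would generalize the types here. *)
  env'

let rec show t = match repr t with
  | TInt -> "int"
  | TArrow (a, b) -> "(" ^ show a ^ " -> " ^ show b ^ ")"
  | TVar r ->
    (match !r with Unbound n -> "'t" ^ string_of_int n | Link _ -> "?")

(* The f and g from the question, encoded in the toy AST. *)
let f_body = Lam ("n", If0 (Var "n", Int 0, App (Var "g", Sub1 (Var "n"))))
let g_body = Lam ("n", If0 (Var "n", Int 0, App (Var "f", Sub1 (Var "n"))))
let env = infer_group [] [ ("f", f_body); ("g", g_body) ]
let () = print_endline (show (List.assoc "f" env))
```

While f is being inferred, g is just an unknown type variable. The call `g (n - 1)` constrains it to a function from int, and inferring g's own body afterwards pins down the rest.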

PatJ
  • Thank you very much. However, how does the type system add the recursive names to the environment when it doesn't know the actual types of the functions yet? – suhdonghwi Mar 06 '18 at 16:07
  • Typically, it assigns them a fresh type variable and lets the regular unification process take care of the rest. This is the same way you'd handle a single recursive function. – K. A. Buhr Mar 06 '18 at 16:11
  • @K.A.Buhr Thank you. Do I have to assign fresh type variables to all the mutually recursive functions except the function which is directly inferred? – suhdonghwi Mar 06 '18 at 16:24
  • Ah, but start with the simpler problem of inferring the type of a single recursive function (think `factorial`). How will you "directly" infer its type when it depends on a function (namely itself) whose type is unknown to the compiler at the time of definition? Again, the answer is to assign it a fresh type variable while it's being defined. So, for a block of mutually recursive functions (including the special case of a single recursive function), you actually want to assign fresh type variables to all of the functions and then start inferring their types. – K. A. Buhr Mar 06 '18 at 17:17
  • @K.A.Buhr I really appreciate your kind help. Thank you very much. Perhaps the last question: what does the word "generalize" mean in the answer? I can find the definition by searching for the term (adding free type variables that do not appear in the current type environment to the universal quantification), but I don't see what this *actually* does (why doesn't it add type variables that appear in the current type environment?) or why it is needed here. – suhdonghwi Mar 06 '18 at 18:11
  • This answer may help: https://stackoverflow.com/a/904715/7203016. Note that even though the comments say that answer is wrong, they mean that it's not the correct answer to the question being asked, but it does address your question (and explains @PatJ's note about getting the timing of generalisation right). However, this is getting too complicated for comments. You might want to ask this as a separate question. – K. A. Buhr Mar 06 '18 at 18:55