sylvio July 29, 2011 at 23:42

Just about Hindley Milner

Transfer

Introduction

If you've ever been interested in not-so-popular languages, you must have heard of the Hindley Milner. This type inference algorithm is used in F # and Haskell and OCaml, as well as in their predecessor ML. Some researchers even try to use XM to optimize dynamic languages like Ruby, JavaScript, and Clojure.

And despite its prevalence, so far there has not been a simple and understandable explanation of what it is. How does this magic work? Will inferred types always be true? Or is Hindley Milner better than, say, Java? And while those who really know what XM will recover from the next mental strain, we will try to figure it out ourselves.

First step

Essentially, Hindley-Milner (or Damas-Milner) is an algorithm for deriving value types based on how they are used. Literally, he formalizes the intuitive idea that a type can be inferred from the operations it supports. Consider the following pseudo-Scala code ^[4] :

def foo(s: String) = s.length
// заметьте: без указания типов
def bar(x, y) = foo(x) + y

Just by looking at the definition of a function bar, we can say what its type should be (String, Int)=>Int. It is not difficult to deduce. We just look at the body of the function and see two ways to use the xand parameters y. xpassed in foo, which is waiting String. Therefore, it xmust have a type Stringin order for this code to compile. Moreover, it fooreturns a type value Int. The +class method Intexpects a parameter Int, therefore it ymust also be of type Int. Finally, we know what +returns Int, which means that this is the type returned bar.

And this is exactly what Hindley-Milner does: looking at the definition of a function, he calculates a lot of restrictions imposed on how the value is used. And this is exactly what we did when we noticed that it footakes a type parameter String. As many constraints are created, the algorithm completes the type recovery by unifying these constraints. If the expression is well-typed, then in the end the restrictions will lead to an unambiguous type. Otherwise, one (or several) restrictions will be mutually exclusive or unsolvable with this set of types.

Informal algorithm

Simple example

In order to make it easier to understand the algorithm, we perform each step manually. First, there are a lot of restrictions to be deduced. To begin with assignment of each variable ( xand y) of a new type (i.e., "non-existing"). If we wrote barwith these types, something like ^[1] would have come out :

def bar(x: X, y: Y) = foo(x) + y

It does not matter what the types are called, the main thing is that they do not intersect with the “real” types. They are needed so that the algorithm can explicitly refer to the not yet known type of each value. Without which it is impossible to create many restrictions.

Next, we plunge into the body of a function in search of operations that impose some type restrictions. And so, the first operation is a method call foo. ^[2] We know what type fooit is String=>Intand this allows us to write down a constraint: The next operation will be related to the value . Scala represents all statements as a method call, i.e. this expression really means " ". We already know that this is an expression of type (from type returned

X ↦ String

+yfoo(x).+(y)foo(x)Intfoo), as well as what is +defined as a class method Intwith a type Int=>Int(of course, it is not entirely correct to use the magic inherent only to Scala). This allows us to add one more restriction to our set: The last step in the restoration of types is the unification of all these restrictions in order to obtain real types, which we substitute for the variables and . Unification, literally, is the process of going through all the restrictions in an attempt to find the only type that satisfies them all. For example, imagine you were given the following facts:

X ↦ String
Y ↦ Int

XY

Masha is tall
Vasya tall
Masha loves red
Vasya loves blue

Now, look at these limitations:

Someone is tall
Someone loves red

Hmm, what could Someone be? This process of combining many constraints with many facts can be mathematically formalized in the form of unification. In the case of type recovery, just rename “types” to “facts”.

In this example, unification of the set of constraints is trivial. We had only one constraint for each value of ( xand y) both of them are displayed in real types. The only thing that was required was to substitute " String" instead of " X" and " Int" instead of " Y".

Complex example

To feel the power of unification, consider an example more complicated. Suppose we have the following function ^[3] :

def baz(a, b) = a(b) :: b

Here a function is declared baz, which of the parameters takes a function and something else, calling the function from the second parameter passed to it and then attaching the result to it itself. It is easy to infer many restrictions for this function. As before, we will start by creating new types for each value.
Note that in this case we annotate not only the parameters, but also the return value.

def baz(a: X, b: Y): Z = a(b) :: b

The first limitation that we derive will be that the parameter amust be a function that takes a type value Yand returns the value of the new type Y'. Moreover, we know that ::this is a function over the class List[A]that takes a new element Aand returns a new list List[A]. Thus, we know that Yand Zmust be of type . Limitations recorded in a formal form: Now unification is not so much trivial. It is important that the variable depends on , that is, at least one additional step is required: This is the same set of restrictions, only this time we substituted the well-known map for into the map forList[Y']

X ↦ (Y=>Y')
Y ↦ List[Y']
Z ↦ List[Y']

XY

X ↦ (List[Y']=>Y')
Y ↦ List[Y']
Z ↦ List[Y']

YX. With the help of this substitution, we got rid of X, Yand Zfrom the types we deduced, having received the following, typed function baz:

def baz(a: List[Y`]=>Y`, b: List[Y`]): List[Y`] = a(b) :: b

Of course, this is still not the end result. Even if the type name Y'were syntactically correct, the absence of such a type would be an error. This happens quite often when working with the Hindley-Milner type recovery algorithm. Somehow, after all the restrictions and unification are deduced, we have “remnants” of variable types for which there are no more restrictions.
The solution can be considered unlimited variables like type parameters . In the end, if the parameter is not limited by anything, then we can simply substitute any type, including the generic. And so, the final version of the function baz, where we added an unlimited parameter of type " A" and replaced it with all the occurrences Y'in the output types:

def baz[A](a: List[A]=>A, b: List[A]): List[A] = a(b) :: b

Conclusion

And it's all! Hindley-Milner is no more complicated than what we described here. It’s easy to imagine how this algorithm can be used to reconstruct types of expressions that are much more complex than we examined.

I hope this article has helped you better understand how the Hindley-Milner algorithm works. This type of type inference can be very useful, reducing the amount of code needed for static typing to the minimum necessary. The example " bar" was specifically cited in a Ruby-like syntax to show that this information is still sufficient for type checking. This may be useful the next time someone says that all statically typed languages are too expressive.

Footnotes and comments

The return type is omitted for ease of reading. Technically, types are always inferred for expression in general.
This is a search in depth by SDA , which means that we primarily look at operations with high priority. Although the workaround is not technically important, it is thus easier to think about the algorithm.
" ::" Is the so-called cons-operator in Scala. He is responsible for attaching the left element to the right list.
About the correctness of using Scala for code examples

Tags: