35

I often hear that F# lacks support for OCaml's row types, and that this makes OCaml more powerful than F#.

What are they? Are they algebraic data types, such as sum types (discriminated unions) or product types (tuples, records)? And is it possible to write row types in other dialects, such as F#?

MiP
  • what are they : https://caml.inria.fr/pub/docs/manual-ocaml/extn.html#s-private-rows – Pierre G. Jan 04 '18 at 10:08
  • @PierreG. that's a description of how the `private` keyword behaves in the case of row types. Not a description of row types. – PatJ Jan 04 '18 at 10:44
  • I don't know OCaml, but row polymorphism is an alternative to subtyping in conjunction with record types. Row polymorphism allows you to preserve "unused" type information (part of the record's structure) within a row variable (such data types are polymorphic in their row variable). With subtyping, however, you lose this type information. –  Jan 04 '18 at 11:09
  • @AndreyTyukin I see you added a bounty to the question. Is there anything specific you are interested in that's missing in the existing answers? – Tomas Petricek Jul 18 '18 at 20:08
  • @TomasPetricek No. As the bounty message says, it's there because I found one of the answers exemplary and worthy of an additional bounty. I couldn't award it immediately, the system told me to wait 24 hours. – Andrey Tyukin Jul 18 '18 at 20:13
  • @AndreyTyukin Ah, I missed the message. I completely agree, the existing answer is very thorough and deserves extra credit! – Tomas Petricek Jul 18 '18 at 20:21

3 Answers

54

First of all, let's make sure that we use terminology that is consistent with the OCaml type system and the corresponding papers. There is no such thing as a "row type" in the type system of OCaml; it does, however, have "row polymorphism", and we will discuss it below [0].

Row polymorphism is a form of polymorphism. OCaml provides two kinds of polymorphism - parametric and row - and lacks the other two - ad hoc and inclusion (aka subtyping) [1].

First of all, what is polymorphism? In the context of type systems, polymorphism allows a single term to have several types. The problem here is that the word "type" itself is heavily overloaded in the computer science and programming language community. So, to minimize confusion, let's reintroduce it here so that we are on the same page [2]. The type of a term usually denotes some approximation of the term's semantics, where the semantics could be as simple as a set of values equipped with a set of operations, or something more complex, like effects, annotations, and arbitrary theories. In general, semantics denotes the set of all possible behaviors of a term. A type system is a set of rules that allows some language constructs and disallows others based on their types, i.e., it verifies that compositions of terms behave correctly. For example, if a language has a function application construct, the type system will allow an application only to those arguments whose types match the types of the parameters. And that's where polymorphism comes into play. In monomorphic type systems, this match can only be one-to-one, i.e., literal. Polymorphic type systems provide mechanisms to specify a kind of regular expression that matches a family of types. So, different kinds of polymorphism are simply different kinds of regular expressions that you may use to denote a family of types.

Now let's look at the different kinds of polymorphism from this perspective. For example, parametric polymorphism is like a dot in regular expressions. E.g., `'a list` is `. list`, which means that we match literally with `list`, and the parameter of the list type could be any type. Row polymorphism is the star operator, e.g., `<quacks : unit; ..>` is the same as `<quacks : unit; .*>`, and it means that it matches any type that quacks and does whatever else [3]. Speaking of nominal subtyping, in this case we have nominal classes (aka character classes in regexps), and we specify a family of types with the name of their base class. E.g., `duck` is like `[:duck:]`, and any value that is properly registered as a member of the class matches this type (via class inheritance and the `new` operator) [4]. Finally, ad-hoc polymorphism is in fact also nominal and maps to character classes in regular expressions. The main difference here is that the notion of type in ad-hoc polymorphism is applied not to a value, but rather to a name. So a name, like a function name or the `+` operator, may have multiple definitions (implementations) that must be statically registered using some language mechanism (e.g., overloading an operator, implementing a method, etc.). So, ad-hoc polymorphism is just a special case of nominal subtyping.
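To make the "dot" versus "star" analogy concrete, here is a minimal OCaml sketch. The names `length`, `describe`, `duck` and `robot` are made up for illustration, and `quacks` returns a `string` here (rather than `unit` as in the text) only so that the result can be checked.

```ocaml
(* Parametric polymorphism: the element type is the "dot", it can be anything. *)
let rec length = function
  | [] -> 0
  | _ :: rest -> 1 + length rest

(* Row polymorphism: the ".." is the "star", any further methods are allowed. *)
let describe (d : < quacks : string; .. >) : string = "it says " ^ d#quacks

(* Two unrelated objects; neither is registered with any class or interface. *)
let duck = object
  method quacks = "quack"
  method swims = true
end

let robot = object
  method quacks = "beep"
  method battery = 87
end

let () =
  assert (length [1; 2; 3] = 3);
  assert (length ["a"; "b"] = 2);
  assert (describe duck = "it says quack");
  assert (describe robot = "it says beep")
```

Both calls to `describe` type check because each object has at least a `quacks : string` method; the remaining methods are absorbed by the `..` part.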

Now that the terminology is clear, we can discuss what row polymorphism gives us. Row polymorphism is a feature of structural type systems (also known as duck typing in dynamically typed languages), as contrasted with nominal type systems, which provide subtyping polymorphism. In general, as we discussed above, it allows us to specify a type as "anything that quacks" as opposed to "anything that implements the IDuck interface". So yes, you can, of course, do the same with nominal typing by defining the duck interface and explicitly registering all implementations as instances of this interface using some inherit or implements mechanism. But the main problem here is that your hierarchy is sealed, i.e., you need to change your code to register an implementation with a newly created interface. That breaks the open/closed principle and hampers code reuse. Another problem with nominal subtyping is that unless your hierarchy forms a lattice (i.e., any two classes always have a least upper bound), you can't implement type inference on it [5].
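As a rough illustration of the "sealed hierarchy" point, the sketch below defines a class once and then adds new consumers later without touching it. All names (`duck`, `speak`, `can_cross_pond`, `driftwood`) are hypothetical.

```ocaml
(* Defined once; it declares no IDuck or IFloating interface. *)
class duck = object
  method quacks = "quack"
  method floats = true
end

(* Written later, possibly in another module. Each function only states
   the methods it needs, so [duck] does not have to be edited or re-registered. *)
let speak o = o#quacks
let can_cross_pond o = o#floats

(* An unrelated object that happens to float is accepted as well. *)
let driftwood = object method floats = true end

let () =
  let d = new duck in
  assert (speak d = "quack");
  assert (can_cross_pond d);
  assert (can_cross_pond driftwood)
```

With nominal typing, the equivalent of `can_cross_pond` would require an `IFloating` interface, and both `duck` and `driftwood` would have to be changed to declare that they implement it.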

Further Reading

----

0) As was pointed out in the comments by @nekketsuuu, I was using the terminology a little loosely, as my intention was to give an easy-to-understand, high-level idea without going deep into details. I've since revised the post to make it a little more strict.

1) Although OCaml provides classes with inheritance and a notion of subtype, this is still not subtyping polymorphism according to the common definition, as it's not nominal. This should become clearer from the rest of the answer.

2) I'm just fixing the terminology, I'm not claiming that my definition is right. Many people think that type denotes a representation of a value, and historically this is correct.

3) Perhaps a better regexp would be <.*; quacks : unit; .*> but I think you got the idea.

4) Thus OCaml doesn't have subtyping polymorphism, although it has a notion of subtype. When you specify a type, it will not match a subtype; it will only match literally, and you need to use an explicit upcasting operator to make a value of type T applicable in a context where super(T) is expected. So although there is subtyping in OCaml, it is not about polymorphism.

5) And although the lattice requirement doesn't look impossible, it is hard in real life to impose this restriction on hierarchies, and if it is imposed, the precision of the type inference will always be bounded by the precision of the type hierarchy. So in practice, it doesn't work, cf. Scala.

(skip this note on a first read) Though in OCaml there exist row variables that are used to embed row polymorphism into OCaml type inference that has only parametric polymorphism.

‡) Often the word typing is used interchangeably with the type system to refer to a particular set of rules in the overall type system. For example, sometimes we say "OCaml has row typing" to denote the fact, that the OCaml type system provides rules for "row polymorphism".
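Note 4 above can be illustrated with a small sketch of OCaml's explicit coercion operator `:>`. The type name `quacker` and the values below are hypothetical and only serve the example.

```ocaml
(* A closed object type: exactly one method, no ".." row. *)
type quacker = < quacks : string >

let duck = object
  method quacks = "quack"
  method swims = true
end

let robot = object method quacks = "beep" end

(* The parameter type is closed, so it matches literally. *)
let speak (q : quacker) : string = q#quacks

(* Writing [speak duck] is rejected by the type checker, because the type
   of [duck] has an extra method. An explicit upcast is required. *)
let said = speak (duck :> quacker)

(* The same applies when storing differently shaped objects in one list. *)
let all : quacker list = [ (duck :> quacker); robot ]

let () =
  assert (said = "quack");
  assert (List.map speak all = ["quack"; "beep"])
```

So the subtype relation exists, but it is never applied implicitly: the coercion has to be written out.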

ivg
  • Very thorough answer - especially the lattice structure for hierarchies is very enlightening! However, I couldn't follow you on _So, ad-hoc polymorphism is just a special case of nominal subtyping_. I picture ad-hoc as two types that are equivalent for a specific purpose. I can't see a subtype relation between these two types, though. –  Jan 05 '18 at 15:20
  • Yeah, I was trying to be brief, because I don't want to go into the peculiarities of ad-hoc typing, as that is a completely different question. An ad-hoc polymorphic type denotes a family of types which nominally (i.e., via binding to some name, not via the structure) place themselves into the class of applicable types for this name. This is basically a special case of nominal subtyping, except that the interface is specified via the name of the overloaded function, and the hierarchy height is one. – ivg Jan 05 '18 at 15:31
  • For example let's consider C++. It doesn't provide a mechanism to define an interface of an ad-hoc polymorphic function without at least one implementation (something like `defgeneric` in CLOS). So when we define `void f(int x) {...}` we implicitly define an interface `f(x)`, that makes all values that implement method `f`, e.g., `void f(float x) {...}` a part of the family of types that function `f` accepts as an argument. Btw, CLOS is a good example, where nominal subtyping is implemented actually via overloading. – ivg Jan 05 '18 at 15:36
  • @ivg Precisely, row polymorphism and subtyping are different things. See https://cs.stackexchange.com/q/53998/58774 or [this another post](https://news.ycombinator.com/item?id=13047934). – nekketsuuu Apr 26 '18 at 18:17
  • @nekketsuuu of course, and never did I say that they are not. In fact I'm even claiming that OCaml doesn't provide subtyping polymorphism. – ivg Apr 27 '18 at 10:16
  • @ivg I care about this sentence: "Row polymorphism is also known as structural subtyping and duck typing." I think this might be misleading. How about changing the sentence a bit? – nekketsuuu Apr 27 '18 at 13:36
  • well, let me admit that this is a little bit voluntaristic, as I mixed oranges with apples and bananas. So, this sentence sort of contains a type error. I will revise the post to remove the confusion. But the "another post" that you've referenced is not really correct. In fact, there is no such thing as "Row typing". There exist "row polymorphism" and "structural polymorphism", and indeed those two polymorphisms differ. But both polymorphisms operate on structural types and are forms of structural subtyping. Anyway, I will clarify this a little more in the post. – ivg Apr 27 '18 at 14:12
  • @nekketsuuu, I've tried to fix the post as much as possible (without rewriting it from scratch) and also added links to the literature. If you have any questions or disagreements, feel free to ask (ideally in the SO post) or drop by the OCaml Discuss forum or the Discord server for discussion. – ivg Apr 27 '18 at 15:09
  • nice and refreshing perspective! With nominal types, I imagine one would have, as a translation, a class `HasQuack` and instances of it (written or deduced) for each quacking structure, such as `Duck`. – nicolas Apr 14 '21 at 04:17
  • @ivg great answer, but I'm struggling with this statement: "But the main problem here is that your hierarchy is sealed, i.e., you need to change your code to register an implementation in a newly created interface" Why would that be the case? Any logic written against a duck interface would keep working without any modifications to code containing that logic. New implementations of duck can be implemented in another unit-of-build such as a library. I cannot see what is sealed in this case. On the contrary, isn't the ease of adding types the point of nominal subtyping? – GrumpyRodriguez May 30 '21 at 14:24
  • I was talking about cases where you add a new interface, e.g., IFloating, and want to express that your duck is also floating, so you have to change the implementation of Duck and also register it as an implementation of IFloating. – ivg May 31 '21 at 15:26
16

Row types are weird. And very powerful.

Row types are used to implement objects and polymorphic variants in OCaml.

But first, here's what we cannot do without row types:

type t1 = { a : int; b : string; }
type t2 = { a : int; c : bool; }

let print_a x = print_int x.a

let ab = { a = 42; b = "foo"; }
let ac = { a = 123; c = false; }

let () =
 print_a ab;
 print_a ac

This code will of course refuse to compile, because print_a must have a unique type: either t1 or t2, but not both. However, in some cases we may want exactly that behavior. That's what row types are for: they give us a more "flexible" type.

In OCaml, there are two main uses of row types: objects and polymorphic variants. In terms of algebra, objects give you "row product" and polymorphic variants "row sum".
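For the "row sum" side, here is a small sketch with polymorphic variants. The tags and function names are invented for the example; the types in the comments are the ones I would expect the compiler to infer.

```ocaml
(* Expected type: [< `Circle of float | `Square of float ] -> float
   The row is bounded from above: only these two tags are accepted. *)
let area = function
  | `Circle r -> Float.pi *. r *. r
  | `Square s -> s *. s

(* Expected type: [> `Circle of 'a ] -> string
   The row is open: thanks to the wildcard case, any other tag is accepted. *)
let name = function
  | `Circle _ -> "circle"
  | _ -> "something else"

let () =
  assert (area (`Square 2.0) = 4.0);
  assert (name (`Circle 1.0) = "circle");
  (* `Triangle was never declared anywhere, yet it fits the open row. *)
  assert (name (`Triangle (1.0, 2.0, 3.0)) = "something else")
```

Where objects say "at least these methods", open polymorphic variant types say "at least these tags", and no type declaration is needed up front in either case.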

One thing to note about row types is that you can end up with some subtyping to declare, and with very counterintuitive typing and semantics (notably in the case of classes).

You can check this paper for more details.

Lhooq
PatJ
  • your answer started with so much promise, and then it was over. you essentially give two links. well, three. :( – Will Ness Jan 04 '18 at 17:18
  • I don't see anything weird with row polymorphism. Moreover, it is most natural and intuitive, that's just duck typing, and is happily used by python and javascript programmers, without any doubts. I think that it is the first kind of polymorphism that comes to mind. – ivg Jan 05 '18 at 15:03
  • @ivg That was more of a note of humour than anything else. In the matter of dynamic typing row polymorphism is indeed the first thing that comes to mind, but coming from the rest of the OCaml world, row types seem weird to me. I mean, GADTs and modules fit in way more than row polymorphism in my mind. I guess that must be a question of how you think. – PatJ Jan 05 '18 at 18:35
  • hehe, yeah, I might be too boring sometimes :) In fact, you're using row typing all the time in OCaml without even noticing it. When you're using a functor and passing it an argument that has more fields than necessary, you're using structural subtyping. And functors are polymorphic in their variables, as they accept any modules whose module types are structural subtypes of the parameter type. The only problem is that functors do not allow capturing the row type variable, so the rest of the structure is forgotten. But this is not a part of row polymorphism, this is another story. – ivg Jan 05 '18 at 18:40
  • @ivg Functors cheat, you have to name your types. I think the ad-hoc part really bugs me. Actually, all of the semantics around the objects, classes and (to a lesser extent) polymorphic variants bug me. – PatJ Jan 05 '18 at 18:52
  • Not necessarily, you don't need to name your type (I know a name is the most valuable asset to a programmer). You can just write `module F(A : sig type t end)`. Besides, objects, classes, and polymorphic variants are not ad-hoc. So far, there is no ad-hoc polymorphism in OCaml, until modular implicits are added. – ivg Jan 05 '18 at 18:58
  • @ivg wow I never even noticed. How come I keep discovering stuff about this typing system? (also I didn't mean ad hoc but nameless, I'm always getting lost with those terms, thanks for the refreshing). – PatJ Jan 06 '18 at 07:48
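The functor remark from the comments above can be sketched as follows. The module names `Describe`, `Point` and `D` are hypothetical; the signature is written inline, in the style of ivg's `module F(A : sig type t end)`.

```ocaml
(* The parameter signature is anonymous: it is not declared under any name. *)
module Describe (A : sig type t val name : string end) = struct
  let banner = "<" ^ A.name ^ ">"
end

(* [Point] provides more than the functor asks for. *)
module Point = struct
  type t = { x : int; y : int }
  let name = "point"
  let origin = { x = 0; y = 0 } (* extra item, not required by Describe *)
end

(* Accepted: the module type of [Point] is a structural subtype of the
   parameter signature. Inside the functor the extra items are not visible. *)
module D = Describe (Point)

let () = assert (D.banner = "<point>")
```

Unlike with `..` in object types, nothing here stands for "the rest of `Point`", which is the limitation mentioned in the comment.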
9

I'll complete PatJ's excellent answer with his example, written using classes.

Given the classes below:

class t1 = object
  method a = 42
  method b = "Hello world"
end

class t2 = object
  method a = 1337
  method b = false
end

And the objects below:

let o1 = new t1
let o2 = new t2

You can write the following:

let print_a t = print_int t#a;;
val print_a : < a : int; .. > -> unit = <fun>

print_a o1;;

42
- : unit = ()

print_a o2;;

1337
- : unit = ()

You can see the row type in print_a's signature. The < a : int; .. > is a type that literally means "any object that has at least a method a of type int".

Richard-Degenne
  • What I don't understand with the ocaml syntax is why `..` instead of a type variable is used. With purescript I can define `forall r. { label :: String | r } -> { label :: String | r } -> String`. Now both records must have the same row type. But I could also replace the second `r` with another type var to denote that both row types may be different. How does this work with ocaml? –  Jan 04 '18 at 14:25
  • could you add an example for "polymorphic variants" mentioned in the other answer, as well? – Will Ness Jan 04 '18 at 17:20
  • @ftor: This is an excellent question. I must say I have no idea, since you don't come across these kind of types too often, but I'll look into it. – Richard-Degenne Jan 05 '18 at 09:04
  • @WillNess: This example won't work well with polymorphic variants, because it uses **row products** and, as Patj has pointed out, polymorphic variants are used to represent **row sums**. – Richard-Degenne Jan 05 '18 at 09:39
  • @RichouHunter I guess ocaml's approach is just slightly less strict and still sufficient for most cases. Anyway, in purescript row polymorphism (for product types) is common, because there is no subtyping and it works well with type inference/unification. Moreover, row polymorphism doesn't entail contra-/co-/bivariants as subtyping does. It's a pity that it is not supported by more languages. –  Jan 05 '18 at 10:08
  • > What I don't understand with the ocaml syntax is why .. instead of a type variable is used. Because `..` is an implicit type variable. You can make it explicit with `as`, e.g., your example in OCaml would be: `(< label : string; .. > as 'a) -> 'a` – ivg Jan 05 '18 at 14:22
  • Is the issue able to be resolved by structural typing? – ca9163d9 Jan 06 '18 at 05:57
  • @ivg but you're wrong, because `'a` in your snippet is the type of the whole record, not the type of the record without `label`, which is actually what is most useful. – Yttrill Jul 15 '22 at 14:05
  • @Yttrill, the `'a` type variable indeed corresponds to the whole type and in OCaml it is not possible to capture the `..` part as a type variable. I am sorry that I didn't make it explicit. I don't know typescript very well, so I am not sure what meaning is ascribed to the type variable `r` there. Also, it would be interesting to learn of use cases, where you have to capture the `..` part of the row type. If you have one, consider asking a question. – ivg Jul 15 '22 at 16:45
  • @ivg: I don't really know typescript either, I'm an Ocaml programmer. But in my language Felix I can write: `fun[T] (x:(a:int | r:T))=>x.r` but it is actually equivalent to `fun[T](x:(a:int | T))=> (x without a)` so you can actually consistently define `r`, but the annotated sugar is more convenient. – Yttrill Jul 21 '22 at 04:16
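The `as 'a` form discussed in the comments above can be tried out with a short sketch. The function names `longer_label` and `join` and the objects are made up; as noted in the comments, `'a` names the whole object type, not just the `..` part.

```ocaml
(* Both arguments and the result share the same type 'a, so whatever other
   methods the objects have are preserved in the result type. *)
let longer_label (x : (< label : string; .. > as 'a)) (y : 'a) : 'a =
  if String.length x#label >= String.length y#label then x else y

(* Two separate ".." rows: the arguments may have different shapes. *)
let join (x : < label : string; .. >) (y : < label : string; .. >) : string =
  x#label ^ "/" ^ y#label

let p = object method label = "short" method id = 1 end
let q = object method label = "much longer" method id = 2 end
let r = object method label = "r" method extra = true end

let () =
  let winner = longer_label p q in
  (* [id] is still callable: the result type kept the full row. *)
  assert (winner#id = 2);
  assert (join p r = "short/r")
```

A call such as `longer_label p r` should be rejected, since `p` and `r` have different object types and cannot both be `'a`.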