5

There is a Tuple as a Product of any number of types, and there is an Either as a Sum of two types. What is the name for a Sum of any number of types, something like this?

data Thing a b c d ... = Thing1 a | Thing2 b | Thing3 c | Thing4 d | ...

Is there any standard implementation?

Erik Kaplun
ais
  • Related (though I think not a duplicate): [I write about why in most cases one should avoid tuples and `Either`-alikes](http://stackoverflow.com/a/19073039/791604). New data types are syntactically quite cheap in Haskell compared to other languages -- use that fact to your advantage! – Daniel Wagner Jan 25 '16 at 21:26

4 Answers

6

Before I make the suggestion against using such types, let me explain some background.

Either is a sum type, and a pair or 2-tuple is a product type. Sums and products can exist over arbitrarily many underlying types (sets). However, in Haskell, only tuples come in a variety of sizes out of the box. Either, on the other hand, has to be nested (to arbitrary depth) to achieve that: Either Foo (Either Bar Baz).
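As a minimal sketch of that nesting, here is a three-way sum built from Either alone. The names Sum3, inj1, inj2, inj3, sum3 and describe are made up for illustration and do not come from any library.

```haskell
-- Sketch: a three-way sum built by nesting Either.
-- All names below are hypothetical, chosen for this example only.
type Sum3 a b c = Either a (Either b c)

-- One injection per alternative.
inj1 :: a -> Sum3 a b c
inj1 = Left

inj2 :: b -> Sum3 a b c
inj2 = Right . Left

inj3 :: c -> Sum3 a b c
inj3 = Right . Right

-- Eliminator: one handler per alternative, analogous to Prelude's 'either'.
sum3 :: (a -> r) -> (b -> r) -> (c -> r) -> Sum3 a b c -> r
sum3 f g h = either f (either g h)

-- Example use at a concrete type.
describe :: Sum3 Int String Bool -> String
describe = sum3 (\n -> "int " ++ show n)
                (\s -> "string " ++ s)
                (\b -> "bool " ++ show b)
```

The nesting works for any number of alternatives, but the Left/Right paths get noisy quickly, which is part of why a dedicated type tends to read better.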

Of course it's easy to instead define e.g. the types Either3 and Either4 etc, in the spirit of 3-tuples, 4-tuples and so on.

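-- Constructor names must be unique within a module, and Left/Right are
-- already taken by Prelude's Either, hence the numeric suffixes.
data Either3 a b c   = Left3 a     | Middle3 b | Right3 c
data Either4 a b c d = LeftMost4 a | Left4 b   | Right4 c | RightMost4 d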

...if you really want. Or you can find a library that does this, but I doubt you could call it "standard" by any standards...

However, if you do define your own generic sum and product types, they will be completely isomorphic to any type that is structurally equivalent, regardless of where it is defined. This means that you can, with relative ease, nicely adapt your code to interface with any other code that uses an alternative definition.
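To make the isomorphism concrete, here is a small sketch that converts between a custom three-way sum and the nested Either encoding. The constructor names Left3, Middle3 and Right3 and the functions toNested and fromNested are assumptions for this example (the suffixes avoid a clash with Prelude's Left and Right).

```haskell
-- Sketch: a custom three-way sum and its isomorphism with nested Either.
-- Type, constructor and function names are hypothetical.
data Either3 a b c = Left3 a | Middle3 b | Right3 c
  deriving (Eq, Show)

-- Forward direction of the isomorphism.
toNested :: Either3 a b c -> Either a (Either b c)
toNested (Left3 a)   = Left a
toNested (Middle3 b) = Right (Left b)
toNested (Right3 c)  = Right (Right c)

-- Backward direction; composing the two in either order gives the identity.
fromNested :: Either a (Either b c) -> Either3 a b c
fromNested (Left a)          = Left3 a
fromNested (Right (Left b))  = Middle3 b
fromNested (Right (Right c)) = Right3 c
```

A pair of functions like this is all the adapter code needed to talk to another library's structurally equivalent type.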

Furthermore, defining your own types is very likely to be beneficial, because that way you can give more meaningful, descriptive names to your sum and product types instead of going with the generic tuple and Either. In fact, some people advise using custom types because doing so essentially adds static type safety. This also applies to non-sum/product types, e.g.:

employment :: Bool  -- so which one is unemployed and which one is employed?

data Empl = Employed | Unemployed
employment' :: Empl  -- no ambiguity

or

person :: (Name, Age)  -- yeah but when you see ("Erik", 29), is it just some random pair of name and age, or does it represent a person?

data Person = Person { name :: Name, age :: Age }
person' :: Person  -- no ambiguity

Above, Person really encodes a product type, but with more meaning attached to it. You can also write newtype Person = Person (Name, Age), which is essentially equivalent. So I always prefer a nice, intention-revealing custom type. The same goes for Either and custom sum types.

So basically, Haskell gives you all the tools necessary to quickly build your own custom types with very clean and readable syntax, so it's best to use them rather than resort to primitive types like tuples and Either. However, it's nice to know about this isomorphism, for example in the context of generic programming. If you want to know more about that, you can google "scrap your boilerplate", "template your boilerplate" and just "(datatype) generic programming".


P.S. The reason they are called sum and product types respectively is that they correspond to the disjoint union (sum) and the Cartesian product of sets. Therefore, the number of values (or unique instances, if you will) in the set described by the product type (a, b) is the product of the number of values in a and the number of values in b. For example, (Bool, Bool) has exactly 2*2 = 4 values: (True, True), (False, False), (True, False), (False, True).

Either Bool Bool, however, has 2+2 = 4 values: Left True, Left False, Right True, Right False. So it happens to be the same number here, but that's obviously not the case in general.

But of course this can also be said about our custom Person product type, so again, there is little reason to use Either and tuples.
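The counting argument in the P.S. can be checked by listing the values directly. This is only a sketch: universe, pairsOf and eithersOf are hypothetical helpers, and it works for small types with Bounded and Enum instances such as Bool and Ordering.

```haskell
-- Sketch: enumerate the inhabitants of small product and sum types.
-- Helper names are hypothetical; only Prelude is used.
universe :: (Bounded a, Enum a) => [a]
universe = [minBound .. maxBound]

-- All values of the product type (a, b), given all values of a and b.
pairsOf :: [a] -> [b] -> [(a, b)]
pairsOf xs ys = [(x, y) | x <- xs, y <- ys]

-- All values of the sum type Either a b, given all values of a and b.
eithersOf :: [a] -> [b] -> [Either a b]
eithersOf xs ys = map Left xs ++ map Right ys

boolBoolPairs :: [(Bool, Bool)]
boolBoolPairs = pairsOf universe universe          -- 2 * 2 values

boolBoolEithers :: [Either Bool Bool]
boolBoolEithers = eithersOf universe universe      -- 2 + 2 values

boolOrdPairs :: [(Bool, Ordering)]
boolOrdPairs = pairsOf universe universe           -- 2 * 3 values

boolOrdEithers :: [Either Bool Ordering]
boolOrdEithers = eithersOf universe universe       -- 2 + 3 values
```

With Bool and Ordering (three values) the two counts differ, 6 for the pair and 5 for the Either, which shows that the 4 = 4 coincidence above is specific to Bool.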

Erik Kaplun
  • "so it's best if we use it not resort to primitive types like tuples and either": Sometimes you just need a sum of types you already have, e.g. data Foo = Foo Int and data Bar = Bar Int, so you need to duplicate constructors: data FooBar = Foo2 Foo | Bar2 Bar – ais Jan 25 '16 at 17:06
  • @ais: Depending on the context, `type FooBar = Either Foo Bar` is suitable. But again, that depends on the actual context. – Zeta Jan 25 '16 at 17:21
  • @Zeta Yes, the main problem it only works for two types. – ais Jan 25 '16 at 17:36
  • `FooBar = Foo2 Foo | Bar2 Bar` isn't an excruciatingly recurring pattern though. OCaml expresses it more concisely than Haskell, so some people argue for adding it to the language, but it wouldn't fit well without effort and cascading changes. Also, note that the type class mechanism fits quite well in many scenarios like this, and the ML family does not have type classes. So all in all, @ais, I'd say look at the entire picture, not just its individual constituents. – Erik Kaplun Jan 25 '16 at 17:59
4

There are some predefined versions in the HaXml package, with OneOfN, TwoOfN, ... constructors.

karakfa
  • I immediately thought of generalizing it on `NofM`... so the type `TwoOfThree` would be a sum of 3 products (`a*b + b*c + a*c`); but that would require renaming the existing HaXml data constructors to e.g. `FirstOf3`/`SecondOf3` instead of `OneOf3`/`TwoOf3` (and such a rename wouldn't hurt in any case). Also, I think without dependent types, it would be quite ugly and hard to generalize over as well. – Erik Kaplun Jan 26 '16 at 19:40
2

In a generic context, this is usually done inductively, using Either or

data (:+:) f g a = L1 (f a) | R1 (g a)

The latter is defined in GHC.Generics to match the funny way it handles things.

In fact, the generic approach is to break every algebraic datatype down into (:+:) and

data (:*:) f g a = f a :*: g a

along with some extra stuff. That is, it turns everything into binary sums and binary products.
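As a hedged illustration of working with that binary encoding, the sketch below walks the (:+:) structure of a derived representation to recover a constructor's name. Generic, from, M1, L1, R1, Constructor and conName come from GHC.Generics; the Shape type, the GConName class and conNameOf are hypothetical names for this example.

```haskell
{-# LANGUAGE DeriveGeneric, FlexibleContexts, FlexibleInstances, TypeOperators #-}
import GHC.Generics

-- Hypothetical three-constructor type; GHC derives its binary-sum representation.
data Shape = Circle Double | Square Double | Blank
  deriving (Show, Generic)

-- Hypothetical class: find the name of the constructor a value was built with.
class GConName f where
  gconName :: f p -> String

-- Datatype metadata wrapper: just look inside.
instance GConName f => GConName (M1 D c f) where
  gconName (M1 x) = gconName x

-- Binary sum: follow whichever branch the value took.
instance (GConName f, GConName g) => GConName (f :+: g) where
  gconName (L1 x) = gconName x
  gconName (R1 x) = gconName x

-- Constructor metadata wrapper: read the name from the metadata.
instance Constructor c => GConName (M1 C c f) where
  gconName m = conName m

conNameOf :: (Generic a, GConName (Rep a)) => a -> String
conNameOf = gconName . from
```

Because the sum instance only ever deals with two branches, the same three instances cover a type with any number of constructors, however GHC chooses to nest them.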

In a more concrete context, you're almost always better off using a custom algebraic datatype for things bigger than pairs or with more options than Either, as others have discussed. Slightly larger tuples (triples and maybe 4-tuples) can be useful for local one-off constructs, but it's hard to see how you'd use larger general sum types as one-offs.

dfeuer
0

Such a type is usually called a sum, variant, union, or tagged union type. Because the capability is a built-in feature of data types in Haskell, there's no name for it widely used in Haskell code. The Report only calls them "algebraic datatypes" (usually abbreviated to ADT), so that's the name you'll see most often in comments, but this name includes types with only one data constructor, which are only sum types in the trivial sense.

Dan Hulme
  • I don't think the question is asking generally about ADTs but about a family of types with N type parameters and N corresponding constructors which indicate a value of the associated type. – Lee Jan 25 '16 at 15:55
  • @Lee I agree. All the names I gave in italics are terms for that family of types, and I added a note about why you don't see those names much in Haskell. As for the second part of the question (is there a standard implementation), I can't give a better answer than karakfa's. – Dan Hulme Jan 25 '16 at 16:17