Why isn't every type part of Eq in Haskell?

Question

Or rather, why isn't (==) usable on every data type? Why do we have to derive Eq ourselves? In other languages, such as Python, C++, and surely others, == has a default implementation for everything! I can't think of types that can't be compared.

Note that in most languages that do have a default implementation of ==, == compares object identity/pointer values, which in my experience is not what you want in the vast majority of cases. So you'd still need to define your own == if you want it to behave usefully. — sepp2k, Jun 08 '12 at 22:56
One thing to note: `==` means different things in those languages. Particularly, it is *reference* equality rather than *semantic* equality (except in special cases like primitives where it *is* semantic). So in Python or Java, `x == y` just tells you that `x` and `y` point to the same place in memory. (Try comparing equivalent lambdas.) In Haskell, reference equality does not make sense, so `==` *always* represents *semantic* equality, which is undecidable for certain types like functions. For example, in Java, you have to define `.equals()` yourself to get similar behavior. — Tikhon Jelvis, Jun 08 '12 at 22:56
@TikhonJelvis Except that in Python, only the default implementation compares reference equality. Nearly all overloads, which are used far more frequently than the default implementation, check "semantic equality". A minor point, and I fully agree otherwise, but you seem to imply something very wrong about Python. — , Jun 09 '12 at 11:50

Ben · Accepted Answer · 2012-06-09T00:43:06.150

In Python the default equality implementation compares identity, not value. This is useful for user-defined classes, which by default are mutable and do not have to have a well-defined notion of "value". But even in that setting, it is more normal to use the is operator to directly compare identities for mutable objects.

With Haskell's immutability and sharing this notion of "identity" doesn't make much sense. If you could compare two terms by identity you could find out whether or not they are shared, but it's generally up to the implementation whether two terms that might be shared actually are shared, so such information shouldn't be able to affect the behaviour of the program (unless you like programs that change their behaviour under different compiler optimisation strategies).

So equality in Haskell is always value equality; it tells you whether two terms represent the same value (not necessarily whether they have equal structure; if you implement a set with an unordered list then two lists with different structure can represent the same set).
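To make the set example concrete, here is a minimal sketch. The type `ListSet` and its helper `normalise` are hypothetical names chosen for illustration, and the `Ord` constraint is only an assumption that makes the comparison easy to write:

```haskell
import Data.List (nub, sort)

-- Hypothetical set type backed by an unordered list that may
-- contain duplicates.
newtype ListSet a = ListSet [a]

-- Two sets are equal when they contain the same elements,
-- regardless of order or repetition in the underlying list.
instance Ord a => Eq (ListSet a) where
  ListSet xs == ListSet ys = normalise xs == normalise ys
    where
      -- Sort, then drop duplicates, to get a canonical form.
      normalise zs = nub (sort zs)
```

With this instance, `ListSet [1,2,3]` and `ListSet [3,2,1,1]` compare equal even though the underlying lists differ, which is exactly what a derived instance would get wrong.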

Almost all of the built-in types are members of Eq already; the big exception is function types. The only really sensible notion of value equality for functions is extensional equality (do they return the same output for every input?). It's tempting to say we'll use that and let the compiler access a representation of the function definition to compute this, but unfortunately determining whether two arbitrary algorithms (here encoded in Haskell syntax) always produce the same output is a known uncomputable problem; if the compiler could actually do that it could solve the Halting Problem, and we wouldn't have to put up with the bottom value being a member of every type.
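For contrast, extensional equality can be checked by brute force when the argument type has only finitely many values. The sketch below is an illustration, not something base provides: `extEq` is a hypothetical name, and it assumes both functions terminate on every input.

```haskell
-- Extensional equality by exhaustive testing. This only works because
-- the argument type is Bounded and Enum, so every input can be listed.
-- Assumes f and g are total; a non-terminating call makes this hang.
extEq :: (Bounded a, Enum a, Eq b) => (a -> b) -> (a -> b) -> Bool
extEq f g = all (\x -> f x == g x) [minBound .. maxBound]
```

For example, `extEq not (\b -> if b then False else True)` checks both `Bool` inputs and finds the functions agree. The same definition at type `Int -> Int` would have to try every `Int`, and there is no such enumeration at all for arguments like `Integer` or lists, which is where the uncomputability bites.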

And unfortunately the fact that functions can't be members of Eq means lots of other things can't be either; lists of integers can be compared for equality, but lists of functions can't, and the same goes for every other container-ish type when it's containing functions. This also goes for ADTs that you write, unless there is a sensible notion of equality you can define for that type that doesn't depend on the equality of the contained functions (maybe the function is just a convenience in the implementation, and which function it is doesn't affect the value you're representing with the ADT).
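A minimal sketch of both situations, using hypothetical types `Point` and `Labelled` that are made up for this illustration:

```haskell
-- Deriving works when every field already has an Eq instance.
data Point = Point Int Int
  deriving (Eq, Show)

-- Hypothetical type with a function field. Writing `deriving Eq` here
-- would be rejected, because there is no instance for Int -> String.
data Labelled = Labelled
  { payload :: Int
  , render  :: Int -> String  -- implementation convenience only
  }

-- Hand-written instance that compares only the part we consider to be
-- the value, and ignores the function field entirely.
instance Eq Labelled where
  a == b = payload a == payload b
```

Whether ignoring `render` is the right call depends entirely on what the type is meant to represent, which is why the compiler cannot make that decision for you.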

So, there are (1) types that are already members of Eq, (2) types that can't be members of Eq, (3) types that can be members of Eq in an obvious way, (4) types that can be a member of Eq but only in a non-obvious way, and (5) types that can be members of Eq in an obvious way, but the programmer would prefer an alternative way. I think the way Haskell handles these cases is actually the right way. (1) and (2) don't require anything from you, and (4) and (5) are always going to require an explicit instance declaration. The only case where the compiler could help you out a little more is (3), where it could potentially save you 12 characters of typing (4 if you're already deriving anything else).

I think that would be a pretty small win for the cost. The compiler would have to try to construct an instance of everything and presume that anything for which that fails isn't supposed to have an Eq instance. At the moment if you want to derive an Eq instance and accidentally write a type for which that doesn't work, the compiler tells you then and there that there's a problem. With the proposed "implicitly make everything Eq that you can" strategy, this error would show up as an unexplained "no Eq instance" error at the point that you go to use the assumed instance. It also means that if I'm thinking of the type as representing values for which the reasonable equality relation isn't simple structural equality (case (5) above; remember the set represented by an unordered list?), and I forget to write my own Eq instance, then the compiler might automatically generate a wrong Eq instance for me. I'd much rather be told "you haven't written an Eq instance yet" when I go to use it than have the program compile and run with a bug introduced by the compiler!

The first Haskell standard actually prescribed that if you omitted the deriving clause for a data type you would get as many classes as possible derived. This was later changed because it was somewhat unpredictable what classes would be derived. — augustss, Jun 09 '12 at 07:32
This is clearly the best answer so far. I recommend it for acceptance. — usr, Jun 09 '12 at 09:21

score 20 · Answer 2 · answered Jun 08 '12 at 21:55

You can't imagine a noncomparable type? Well, the classic example is functions. Consider functions [()]->Bool. Two such functions are equal when they return the same value for every possible input. But "unfortunately", there are infinitely many such lists: since Haskell is lazy, the list size isn't even bounded by memory. Of course you can compare the results for every list input with a length less than some fixed lMax, but where will you draw the line? It's impossible to ever be sure that the functions you compare won't, after 1000000000 equal returns, suddenly return different results for replicate 1000000001 (). So (==) :: ([()]->Bool) -> ([()]->Bool) -> Bool could never actually return True, only either False (if an input for which the functions differ is found) or ⟂ (if the functions are actually equal). But you can't evaluate ⟂.
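The bounded comparison described above can be sketched as follows. The name `agreeUpTo` is hypothetical, and the sketch only tries finite lists built with `replicate`:

```haskell
-- Compare two functions on every list of units with length below lMax.
-- A False result is definitive: a differing input was found.
-- A True result only means "no difference found below the bound".
agreeUpTo :: Int -> ([()] -> Bool) -> ([()] -> Bool) -> Bool
agreeUpTo lMax f g =
  all (\n -> f (replicate n ()) == g (replicate n ())) [0 .. lMax - 1]
```

Taking `f = const True` and `g xs = length xs < 5`, the check `agreeUpTo 5 f g` finds no difference, while `agreeUpTo 6 f g` does, because the functions first disagree on the list of length 5. No choice of bound turns this into a real `(==)`.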

leftaroundabout

But looking at the implementation of these functions, could the compiler say whether they're equal or not? If he can't, could `(==)` at least be sure to answer correctly for types whose kind is `*`? – L01man Jun 08 '12 at 22:09
Just because two functions have different implementations, that doesn't mean they're not equal. – dave4420 Jun 08 '12 at 22:16
e.g. Consider `heapSort, quickSort, bubbleSort :: [Int] -> [Int]`. They're all equal in the sense that for any given input they each produce the same output. But they are different implementations with different performance characteristics. How is a comparison function supposed to work out that they're equal? – dave4420 Jun 08 '12 at 22:27
Compare the two functions `f x = x + 1 + 1` and `f x = x + 2`. Different implementations, but they are obviously the same (assuming numbers that behave reasonably). Or what about `g x = f x * 1`? What about something even simpler: `f x = h (g x)` and `f = h . g`? There really is no satisfactory solution. – Tikhon Jelvis Jun 08 '12 at 22:48
recursive data is also rather hard to compare for equality unless you have extra knowledge that allows you to avoid infinite loops... – Thomas M. DuBuisson Jun 08 '12 at 23:08
I could imagine that a definition of equality based on whether they are the same implementation is still somewhat useful (even though it would mean that equivalent implementations are not equal) – user102008 Jun 08 '12 at 23:42
@user102008: Yes, it would be *useful*, but it is *not* the same thing as equality. So having a separate `equalImplementation` function for functions could be useful; using `==` for this would not. – Tikhon Jelvis Jun 09 '12 at 00:08
@L01man (Also, since nobody else said it yet: functions are classified by types of kind `*`.) – Daniel Wagner Jun 09 '12 at 00:12
@ThomasM.DuBuisson Actually, the compiler can infer `Eq` instances for recursive types just fine. If you use knot-tying or laziness to create a finite structure that represents an infinitely large object, then yes comparing it for equality with another infinitely large object is probably going to cause you problems. – Ben Jun 09 '12 at 00:37
@user102008: such a thing *might* be useful, but it violates referential transparency: the core principle in Haskell that you can replace a function call by the value it returns without changing the behaviour of your program. In the presence of such a function it would be much harder to make changes to your code while being sure nothing would break. (Side note: as long as you restrict yourself to *total* functions, I believe equality on `[()] -> Bool` is actually decidable. If you require `f (repeat ())` non-bottom, `f` can only look at finitely many elements of the list). – Ben Millwood Jun 09 '12 at 09:47
cf. http://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/ ; a bulletproof example is `Integer -> Bool`. – Ben Millwood Jun 09 '12 at 09:47
@DanielWagner: that's right I meant functions without parameters (variables?): `a` against `a -> b` for example. – L01man Jun 09 '12 at 09:48
@dave4420: Thanks for the great example. However, some people would maybe want to check the algorithm used and not the fact that it sorts the list, no? – L01man Jun 09 '12 at 09:48
@TikhonJelvis: When `f x = x + 1 + 1` is evaluated it produces `f x = x + 2`. The two functions that use composition are the same because `h . g` equals `h (g x)`. `f x = h (g x)` is not well coded because it should use `.`, so it's the programmer's fault, and this use of `(==)` would be useful in Hoogle for example to check if a function is already defined in the standard library. – L01man Jun 09 '12 at 09:50
@benmachine As discussed at your link, totality is not enough to make equality on `Integer -> Bool` decidable; you must also make the restriction that the functions are uniformly continuous. – Daniel Wagner Jun 09 '12 at 09:52
@DanielWagner: that's exactly what I meant by "bulletproof example" :) the question specifies `[()] -> Bool`, for which there *is* decidable equality (unless I've got that wrong) and hence theoretically could be an Eq instance. – Ben Millwood Jun 09 '12 at 10:02
@benmachine No, you've misunderstood the link you posted. Equality on the Haskell type `[()] -> Bool` is not decidable; however, equality on the restricted type of uniformly continuous functions of type `[()] -> Bool` is decidable (and even this easier problem having a solution is surprising and seems impossible). – Daniel Wagner Jun 09 '12 at 10:06
@L01man I can't think why that would be useful in a program, but if you had a function that compared functions in that way, it shouldn't be called `==`. – dave4420 Jun 09 '12 at 11:37
@dave4420: It depends what the purpose of `(==)` is: same input gives same output? Though, there're differences between these algorithms like speed or amount of processed data. Instead of making them equal, they could be instances of a `Sort` class, which is logical since they all have `Sort` in their names. In terms of OOP, instances are the same because they have the same parent and can receive a general processing, but they have their own characteristics, otherwise they would be an only data type. Thus, `(==)` is good for exact comparison and pattern matching with classes for similarity. – L01man Jun 09 '12 at 13:45
@DanielWagner: I didn't misunderstand it, I just read it a long time ago and kind of forgot the details >_> – Ben Millwood Jun 09 '12 at 13:54
A quick comment: equality where two functions have the same output for every input is called *extensional* equality; equality that also includes "higher-level" properties of a function, such as implementation, complexity, memory usage, etc. is known as *intensional* equality. As discussed, both are quite difficult to achieve: the former mostly because of infinite-domain functions, the latter because it is hard to determine if two functions have "the same" implementation. – dorchard Jun 09 '12 at 20:50
@BenMillwood Why would it violate referential transparency? – Veedrac Sep 17 '16 at 04:37
@Veedrac: Coming back to this four (!) years later, I'm actually not sure if "referential transparency" is the property I mean. But the ability to distinguish one implementation of a function from another would violate abstraction boundaries: you can no longer refactor some internal function, keeping the contract and type the same, and be sure client code won't change behaviour. – Ben Millwood Sep 17 '16 at 12:36
@BenMillwood Apologies for throwing you in the deep end! I appreciate the response. – Veedrac Sep 17 '16 at 16:01

amindfv · Answer 3 · 2012-06-10T13:18:55.000

You may not want to derive Eq - you might want to write your own instance.

For example, imagine data in a binary tree data structure:

data Tree a = Branch (Tree a) (Tree a)
            | Leaf a

You could have the same data in your Leafs, but balanced differently. Eg:

balanced = Branch (Branch (Leaf 1) 
                          (Leaf 2)) 
                  (Branch (Leaf 3) 
                          (Leaf 4))

unbalanced = Branch (Branch (Branch (Leaf 1) 
                                    (Leaf 2)) 
                            (Leaf 3)) 
                    (Leaf 4)

shuffled = Branch (Branch (Leaf 4) 
                          (Leaf 2)) 
                  (Branch (Leaf 3) 
                          (Leaf 1))

The fact that the data is stored in a tree may only be for efficiency of traversal, in which case you'd probably want to say that balanced == unbalanced. You might even want to say that balanced == shuffled.
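A sketch of what such an instance might look like. The helpers `leaves` and `sameElements` are hypothetical names, and comparing flattened lists is just one possible choice:

```haskell
import Data.List (sort)

data Tree a = Branch (Tree a) (Tree a)
            | Leaf a

-- All leaf values in left-to-right order.
leaves :: Tree a -> [a]
leaves (Leaf x)     = [x]
leaves (Branch l r) = leaves l ++ leaves r

-- Equality that ignores how the tree is balanced, but still cares
-- about the order of the leaves.
instance Eq a => Eq (Tree a) where
  t1 == t2 = leaves t1 == leaves t2

-- Looser comparison that also ignores leaf order.
sameElements :: Ord a => Tree a -> Tree a -> Bool
sameElements t1 t2 = sort (leaves t1) == sort (leaves t2)
```

With this instance `balanced == unbalanced` holds, while `balanced == shuffled` does not; `sameElements balanced shuffled` does. If the order-insensitive behaviour is what you want from `==`, the instance body would use the sorted comparison instead, at the cost of requiring `Ord a`.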

score 7 · Answer 4 · answered Jun 09 '12 at 04:41

I can't think of types that can't be compared.

let infiniteLoop = infiniteLoop

let iSolvedTheHaltingProblem f = f == infiniteLoop
-- Oops!

Jörg W Mittag

Man, people just love reducing to the halting problem :P In this case I think it's entirely reasonable to say `==` returns, well, an infinite loop. Just like if you did `[] == filter (not . sumOfTwoPrimes) [2, 4 ..] ` (note that this *does* typecheck) – Ben Millwood Jun 09 '12 at 09:55
When you say `let infiniteLoop = infiniteLoop`, do you mean, for example, `let infiniteLoop = [1..]`? – L01man Jun 09 '12 at 10:09
No. You can e.g. take the `head` and `tail` of `[1..]`, and in general compute it as far as you can. `infiniteLoop` will just hang if you try to extract any value from it. – hmp Jun 16 '12 at 17:10

score 4 · Answer 5 · answered Jun 08 '12 at 21:51

Because the way that values are compared may be custom. For example, certain "fields" might be excluded from comparison.

Or consider a type representing a case-insensitive string. Such a type would not want to compare the Chars it contains for identity.
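As a rough sketch of the case-insensitive string idea: the wrapper name `CaseInsensitive` is hypothetical, and lowercasing character by character is a simplification that is not full Unicode case folding.

```haskell
import Data.Char (toLower)

-- Hypothetical wrapper for strings that should compare without
-- regard to letter case.
newtype CaseInsensitive = CaseInsensitive String

-- Compare lowercased copies rather than the stored characters.
instance Eq CaseInsensitive where
  CaseInsensitive a == CaseInsensitive b = map toLower a == map toLower b
```

Here `CaseInsensitive "Hello"` and `CaseInsensitive "hELLO"` compare equal, which a derived instance on the underlying `String` would never allow.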

usr

Thanks. Then, why isn't there a default implementation of `(==)`, or another operator in addition that checks if the two values are exactly the same, like `(===)` or `is`. I now understand some types need a different implementation of `(==)`, but don't all types need at least the default implementation? is there a special design case where comparison is not wanted? – L01man Jun 08 '12 at 21:58
@usr: yup, apart from the theoretical answer that it's impossible to test equality for function types, this is the other, more practical answer to the question of "what about for non-function types?". Another example is search trees representing sets (or maps): two different search trees may represent the same set. – Luis Casillas Jun 08 '12 at 23:04
@L01man: the undecidability of equality for functions has the potential to infect the whole language once you add modularity and encapsulation. I can write a library that exports a two-parameter type `Foo` but not its constructors; the only way of using `Foo` is the constants and functions I export. You can't know if values of type `Foo a b` contain a hidden function, and I can change the implementation to use one if appropriate. Under your proposal, however, if `Foo` gets an implicit `Eq` instance, you can write code that will break under a later change to my type's implementation. – Luis Casillas Jun 08 '12 at 23:15
@L01man Having another operator that compares if two values are "exactly the same" would break abstraction. I might have an abstract data type (e.g. a set implemented by a list) where I want to consider several internal representations as being the same. If you were allowed to compare my internal representations you would be able to write code that could break when I change the internals of my data type. So it's important to have control over equality not only from a theoretical perspective, but also from a practical one. – augustss Jun 09 '12 at 05:19
@sacundim: You mean that in the first version of your library you would not provide an implementation of `(==)` and then you would provide one that excludes certain fields? – L01man Jun 09 '12 at 09:58
@augustss: You mean to use `(==)` with different types `a` and `b`? Like in, for instance, ` (x:xs) == (y Cons ys) = (x == y) and (xs == ys)`, where `[1, 2, 3] == 1 Cons 2 Cons 3 Cons List` returns `True`? Is this possible? I thought that `(==)` was defined like that: `(==) :: a -> a -> Bool`. – L01man Jun 09 '12 at 10:07
@L01man No. Here's what I mean. Say that I implement a `Set` type by `data Set a = Set [a]`. Then I define an `Eq` instance that says that two sets are equal if they have the same elements regardless of order. Now, if you could compare the sets in some other way that actually looks at the representation you'd be able to see that some sets that I want you to think are equal are actually represented differently. Perhaps you rely on that. If I then change the representation of `Set` to have a unique representation your code would break. Using just `==` you would be safe. – augustss Jun 09 '12 at 10:33
@L01man: Also, note that comparing object identity can break referential transparency. If we had a function `(===) :: a -> a -> Bool` which (like Python's `is` or Scheme's `eq?`) compared objects for (reference) identity, then consider the following: `id === id`. That should evaluate to `True`. But replacing `id` by its definition, we get `(\x -> x) === (\x -> x)`, which could evaluate to `False`. Or consider `map (+ 1) [1..5] === map (+ 1) [1..5]`---whether that evaluates to `True` or `False` depends on what optimizations are performed. So such a function can't exist in Haskell. – Antal Spector-Zabusky Jun 09 '12 at 16:42
@L01man: No, I mean that in the first version you'd define a datatype `Foo a b` so that no constructors take functions as arguments, and in the second version you'd change it so that at least one constructor took a function as an argument. In the second version of the library it's impossible to derive an `Eq` instance—but if the language created one automatically, clients of the first one may have been coded to rely on it. – Luis Casillas Jun 11 '12 at 17:06

Don Stewart · Answer 6 · 2012-06-09T03:08:41.710

How do you compare functions? Or existential types? Or MVars?

There are incomparable types.

Edit: MVar is in Eq!

instance Eq (MVar a) where
        (MVar mvar1#) == (MVar mvar2#) = sameMVar# mvar1# mvar2#

But it takes a magic primop to make it so.

edited Jun 09 '12 at 03:08

answered Jun 08 '12 at 23:33

[`MVar` is an instance of `Eq`.](http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent-MVar.html#g:1) – Luis Casillas Jun 09 '12 at 00:14
Indeed it is! But that's new (since Simon's IO rewrite) -- and it takes magic to make it happen. – Don Stewart Jun 09 '12 at 03:13

score 4 · Answer 7 · answered Jun 09 '12 at 09:30

Consider the following Python example:

>>> 2 == 2
True
>>> {} == {}
True
>>> set() == set()
True
>>> [1,2,3] == [1,2,3]
True
>>> (lambda x: x) == (lambda x: x)
False

False? o_O This of course makes sense if you realize that Python == compares pointer values, except when it doesn't.

>>> f = (lambda x: x)
>>> f == f
True

Haskell encourages == to always represent structural equality (and it always will if you use deriving Eq). Since nobody really knows a completely sound and tractable way to declare for certain whether or not two functions are structurally equivalent, there is no Eq instance for functions. By extension, any data structure that stores a function in it cannot be an instance of Eq.

Dan Burton

It would be more correct to say that any data structure that stores an *arbitrary* function cannot be an instance of Eq, or at least not in a natural way (lots of types have Eq instances that aren't strictly structural equality: an example of when this is a good idea is lazy bytestrings that are chunked differently comparing equal). Some function types are comparable. – Ben Millwood Jun 09 '12 at 09:58
@benmachine "Some function types are comparable" - which ones? `() -> Foo` would be one I suppose, just define `(==) = on (==) ($ ())`. Are there any more interesting examples of comparable function types? – Dan Burton Jun 09 '12 at 18:10
Well, `Bool -> Bool`, for one. Generally any finite enumerable type to any equatable type will do, although the domain will have to be small for the implementation to be practical :) however, more magic is possible - see http://math.andrej.com/2007/09/28/seemingly-impossible-functional-programs/ (even if I did misuse it in a different comment thread) – Ben Millwood Jun 09 '12 at 18:22
Even with Bool -> Bool you need to use care, since their behavior in the presence of bottoms can vary. :/ – Edward Kmett Jun 11 '12 at 16:03
I've always felt encouraged to use `==` to represent semantic equality rather than structural equality. – AndrewC Sep 16 '12 at 21:18
