
I would like to know whether there is a specific reason why C# doesn't have automatically generated Equals, GetHashCode, operator ==, and operator != geared towards value comparison in reference types.

Explanation: I do not see an easy way to quickly request a "compare actual objects" operation for the values/contents of reference types. Coming from a C++ background, I have the impression that this is something the compiler should do automatically at the user's simple request. The lack of that feature most likely indicates that it goes against the language's "design goal"/"vision"/"philosophy". So I would like to know why this functionality was deemed unimportant.

--original text--

As far as I can tell, Equals pretty much amounts to a few comparisons to null, an attempted cast, and a field-by-field comparison.

GetHashCode pretty much amounts to combining the hashes of all members using some operation (multiply with overflow, xor, anything).
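
For illustration, the hand-written pattern described above might look like the following sketch. The `Point` class, its two fields, and the constants 17 and 31 are hypothetical choices made for this example; they are not taken from the question or from any framework implementation.

```csharp
using System;

// Hypothetical reference type, used only to illustrate the pattern.
public sealed class Point : IEquatable<Point>
{
    private readonly int x;
    private readonly string label;

    public Point(int x, string label)
    {
        this.x = x;
        this.label = label;
    }

    public bool Equals(Point other)
    {
        // Null check, then field-by-field comparison.
        if (ReferenceEquals(other, null)) return false;
        return x == other.x && string.Equals(label, other.label);
    }

    public override bool Equals(object obj)
    {
        // "Attempted cast": 'as' yields null when obj is not a Point.
        return Equals(obj as Point);
    }

    public override int GetHashCode()
    {
        // Combine the member hashes; 'unchecked' lets the multiplication overflow.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + x.GetHashCode();
            hash = hash * 31 + (label == null ? 0 : label.GetHashCode());
            return hash;
        }
    }

    public static bool operator ==(Point a, Point b)
    {
        // Two nulls are equal; a null never equals a non-null instance.
        if (ReferenceEquals(a, null)) return ReferenceEquals(b, null);
        return a.Equals(b);
    }

    public static bool operator !=(Point a, Point b)
    {
        return !(a == b);
    }
}
```

With this in place, two distinct `Point` instances built from the same values compare equal through `Equals` and `==` and return the same hash code. The class is sealed so that the cast alone is enough; an inheritable type would also need to decide how derived types compare.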

As far as I can tell, this should either be automated (the methods should be generated by default) or there should be a simple way to request a default implementation. However, there is no such thing. Why?

As I understand it, it is either a massive technical oversight that has persisted for years, or some kind of language philosophy I'm not aware of.

So, what is the reason?

SigTerm
  • They are already defaulted. It's your choice to override the default equality for that class however you would like. – austin wernli Apr 13 '15 at 22:31
  • Comparing by value is expensive, and most times not necessary. So the good default is comparing by reference. – poke Apr 13 '15 at 22:35
  • I am confused by the question. Value types automatically get value equality, reference types automatically get reference equality. That's why they're called value types and reference types. What exactly are you asking for? Is your question why there is not an easy way for reference types to have value equality? I would invite you to propose an implementation that correctly fulfills the contract of `GetHashCode` for an arbitrary reference type; attempting to do so will show you why no one else has done it. – Eric Lippert Apr 13 '15 at 22:38
  • In particular, by "arbitrary reference type" I mean types that can have fields that contain arrays, that contain arbitrary generic types, and that can have arbitrarily complex reference topologies. Your solution must be able to handle a type T that has a field of type T that refers to itself, for instance. – Eric Lippert Apr 13 '15 at 22:43
  • @EricLippert: Coming from C++, the behavior exhibited by C# classes makes little sense. The class is pretty much a handle attached to a floating object. The default operator compares handles. However, there is no easy way to compare the objects being pointed at, and I hate to write that myself. Why? C++ does this easily. Your statement about a "correct GetHashCode" also does not make sense to me, since the way I see it, it pretty much amounts to "for all fields in the class, calculate the hash code, then combine using operation X". It won't be the best hash distribution, of course, but it will work. – SigTerm Apr 13 '15 at 22:46
  • @EricLippert It would be enlightening to understand how these complexities were overcome (or if they were just accepted) with anonymous types that have value-type equality semantics. I'm sure there are many more edge cases with reference types in general, but it would be an interesting case study. – D Stanley Apr 13 '15 at 22:49
  • @EricLippert: Were you implying that the problem is the object being part of a graph formed by object references, and that said graph forms loops? It can still be done and automatically generated. It won't be a quick hash, of course. It could be made quicker if the user could specify whether a member of a class requires "deep" traversal or just reference comparison for the purposes of calculating the hash. – SigTerm Apr 13 '15 at 22:50
  • @SigTerm Can you explain more how C++ "easily" lets you define equality generically between classes? – D Stanley Apr 13 '15 at 22:51
  • @DStanley: In C++, classes are value types. You get field-by-field comparison by default as operator==/!=. That may or may not be what you want, but it works when you need a quick struct. Different types are not comparable to each other by default. When you need it to be a "reference" (in the C# sense), you use one of the pointer types, depending on how you want to track shared ownership and object lifetime. However, you can (safely) attempt to cast anything into anything else using runtime type information, which basically walks through the inheritance hierarchy for you. – SigTerm Apr 13 '15 at 22:56
  • @DStanley: You're right, that would be an interesting comparison. Anonymous types have the distinct benefits of being (1) immutable, (2) free of cycles, and (3) typically small; none of this is true in the general case. – Eric Lippert Apr 13 '15 at 23:04
  • @SigTerm: OK, let's think about it. Suppose we are traversing a network of objects and keeping track of which ones have been traversed before to detect cycles. How precisely are we doing that? **Are we putting them in a hash set**? Because if we are, I submit to you that we have made a recursive step that does not simplify towards a smaller problem. – Eric Lippert Apr 13 '15 at 23:05
  • @EricLippert: Nope. All objects will have an internal handle or address designating the memory area where the object data is currently located. That handle will be unique. Track handles instead of tracking hashes. That's how the problem would be approached from C++. (In C++) it is absolutely trivial, though not exactly efficient. – SigTerm Apr 13 '15 at 23:09
  • @SigTerm: More generally, I would discourage you from asking "why not?" questions on stackoverflow. The "why not?" question presupposes that the world ought to be different, and that the people who failed to make it different must justify their actions. That's not how language design works. Features are not implemented by default and then removed for good reasons. If the feature you want isn't there, either it wasn't thought of at all, or there were other features that were higher on the priority list. Your proposed feature has never to my knowledge been even considered. – Eric Lippert Apr 13 '15 at 23:10
  • I note also that your proposed feature has much in common with the feature of "cloning" a reference type using value semantics. The `ICloneable` interface is deprecated in .NET because it was never at all clear what the semantics of a clone operation ought to be; in the past couple decades we've seen that we can simply live without a general mechanism for cloning a reference type. – Eric Lippert Apr 13 '15 at 23:14
  • @EricLippert: Look, if people won't ask "why not?" how are they going to learn? Let me explain the question: Languages are designed with some goal/vision/philosophy/whatever, then evolve based on their practical use. When I bump into an area where the language starts furiously resisting whatever I'm doing, it indicates that it was not designed with that purpose in mind AND there might be another way to do it which I'm not aware of. So, how am I going to learn about the other way without asking a "why not?" question? There was a reason why the feature wasn't added. I want to know the reason. Simple, no? – SigTerm Apr 13 '15 at 23:14
  • @SigTerm: Sure, I take your point, but think about it from the perspective of the stack overflow users who are attempting to answer your questions. The only people who can answer your questions definitively were in a particular room at a particular time back in 1999. I wasn't in that room then (though I've spent a lot of time in that room since.) That's why these are bad questions for SO; the only people who can *definitively* answer them aren't here; everyone else (me included) is making educated guesses or offering opinions. – Eric Lippert Apr 13 '15 at 23:17
  • @EricLippert: I've noticed the lack of information on `ICloneable`. The reason why I asked this question in the first place is that the language obviously does not want to let me compare actual *objects* of reference types easily. The reason why I used reference types is that they allow me to specify default constructors. "people who can definitively answer them aren't here" I can only sadly sigh at that, and that's about it. The idea was that the language design goal is documented somewhere, and said design goal explains the lack of those features. – SigTerm Apr 13 '15 at 23:21
  • I note also that you offer a false dichotomy, which makes your question actually sound more like a rant, even if that is not what was intended. (Why not? questions often sound like rants, even those that are, as you point out, genuinely looking for explanations.) There are options other than "massive oversight" or "philosophical objection" -- the option "reasonable feature idea that never made it anywhere close to the top of a list of potential features longer than your arm", for example, is a third option, and there are many more. – Eric Lippert Apr 13 '15 at 23:21
  • @EricLippert: "actually sound more like a rant" ... good point here. I'll try to fix the question and then check it again tomorrow. – SigTerm Apr 13 '15 at 23:23
  • Sure. To be clear, I think this is an *interesting* question that touches on many subtle points of language design, and I will probably write a blog post about it at some point. I just think it's not a particularly good question for this format. – Eric Lippert Apr 13 '15 at 23:24
  • Oh, and incidentally C# 6 will allow struct types to have default constructors. A bit of a weird feature, as now `default(S).x` and `(new S()).x` could have different values. – Eric Lippert Apr 13 '15 at 23:26
  • @EricLippert: I heard about C# 6, but the reason why I touched C# is Unity 3D (it started as a toy project and then grew into a few hundred kbytes of code). They're (IIRC) based on Mono and only guarantee availability of .NET Framework 2.0, meaning the features of C# 6 won't be available for a while. – SigTerm Apr 13 '15 at 23:35
  • @SigTerm In what version of C++ do you get equality (`==`) built-in? It's been a long time since I used C++ regularly, but I distinctly remember having to define `==` for classes since it's not defined by default. Are you thinking of the assignment operator (`=`)? – D Stanley Apr 14 '15 at 13:07
  • @DStanley: .... Please excuse me while I go punch a wall and swear profusely for 15 minutes or so. You're correct. Somehow I confused `==` with `=` and was **really** sure I get `==` by default. Sigh. – SigTerm Apr 14 '15 at 13:40

2 Answers


For value types, Equals and GetHashCode are implemented for you automatically (though the implementation uses reflection, so it's faster to write your own).

And for reference types, it's not clear whether you want to compare the contents or compare the references. I've used both. If I'm writing an immutable type, I probably want its Equals to compare its contents. For anything else, I probably want the default Equals implementation that only returns true if I compare an instance to itself (reference equality); comparing contents would be the wrong thing in this case.

So, for value types (which are defined by their contents), .NET gives you what you want (but not as performant as what you could write yourself). For reference types, you have to opt into content equality, since often that wouldn't be what you want.
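
A minimal sketch of the two defaults described above, assuming two hypothetical types `PointStruct` and `PointClass` that carry the same fields and override nothing:

```csharp
using System;

// Hypothetical types for illustration; neither overrides Equals or GetHashCode.
public struct PointStruct
{
    public int X;
    public int Y;
}

public class PointClass
{
    public int X;
    public int Y;
}

public static class DefaultEqualityDemo
{
    public static void Run()
    {
        PointStruct s1 = new PointStruct { X = 1, Y = 2 };
        PointStruct s2 = new PointStruct { X = 1, Y = 2 };
        PointClass c1 = new PointClass { X = 1, Y = 2 };
        PointClass c2 = new PointClass { X = 1, Y = 2 };

        // ValueType.Equals compares the contents of the two structs.
        Console.WriteLine(s1.Equals(s2)); // True

        // Object.Equals compares references for classes by default.
        Console.WriteLine(c1.Equals(c2)); // False
        Console.WriteLine(c1.Equals(c1)); // True
    }
}
```

Note that the struct only gets `Equals` and `GetHashCode` for free; writing `s1 == s2` does not compile unless the struct defines the operator itself.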

Joe White
  • "it's not clear whether you" I was speaking about automatic generation for comparison of contents, because in my opinion there should be an easy way for user to request that. So, are you suggesting that when I need those functions generated by default I should switch to value type? – SigTerm Apr 13 '15 at 23:00
  • And I'm explaining why automatic generation wouldn't make sense, except as an opt-in. As for a simple way to request it, I suppose the default ValueType implementations could have been made available as utility methods that you could use when you want, but keep in mind that they're slow. You're likely better off using something like [ReSharper](https://www.jetbrains.com/resharper/) or [PostSharp](https://www.postsharp.net/) ([example](http://stackoverflow.com/q/26005468/87399)) to codegen your implementations, so there's no perf hit at runtime. – Joe White Apr 13 '15 at 23:14

In order for a compiler/framework to usefully auto-generate equivalence-related methods, it would need to be able to distinguish two kinds of equivalence and multiple kinds of reference. For example, suppose Foo has a single field of type int[], and two instances of Foo hold references to different arrays holding the sequence {1,2,3}. Whether or not a comparison between references to those instances should report them equal would depend upon the purpose for which Foo holds the array reference and the purpose for which the references to Foo objects are held by the code requesting the comparison.

If neither array's contents will ever be altered, the two Foo instances should report each other as being permanently equivalent (and also presently equivalent). If the arrays can be modified, but only at the request of code holding references to the Foo instances, then the instances should report themselves as being presently equivalent, but not permanently equivalent [if code holds the only reference to a Foo instance and never shares it or calls any of its mutating methods, then it can know that the state of that instance will never change, even if the instance itself doesn't know that]. If references to the arrays are in the hands of outside code that might modify them, then the instances are not equivalent even though the arrays presently hold the same values.

Since the type system has no way of knowing what kind of comparison to do on int[] fields, there's no way it can generate a semantically meaningful equality override.
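
The `Foo` example can be sketched as follows. The member name `Items` and the two helper methods are hypothetical, introduced only to show that "same array" and "same contents right now" are different questions that can give different answers.

```csharp
using System;
using System.Linq;

// Hypothetical version of the Foo type from the example: one int[] field.
public sealed class Foo
{
    // 'readonly' fixes the reference, not the array's contents.
    public readonly int[] Items;

    public Foo(int[] items)
    {
        Items = items;
    }
}

public static class ArrayEqualityDemo
{
    // Identity comparison: do both instances hold the very same array?
    public static bool SameArrayIdentity(Foo a, Foo b)
    {
        return ReferenceEquals(a.Items, b.Items);
    }

    // Content comparison: do the arrays hold the same sequence at this moment?
    public static bool SameArrayContents(Foo a, Foo b)
    {
        return a.Items.SequenceEqual(b.Items);
    }
}
```

For two `Foo` instances built from separate `{1, 2, 3}` arrays, `SameArrayIdentity` returns false while `SameArrayContents` returns true. If any code holding a reference to one of the arrays later changes an element, `SameArrayContents` becomes false as well. An auto-generated `Equals` would have to pick one of these behaviors without knowing which one the type's author intended.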

supercat
  • "system has no way of knowing " Why not let programmer specify that with a keyword, though? Instead of writing the whole comparison. – SigTerm Apr 13 '15 at 23:03
  • @SigTerm: Now you are proposing a language feature. Language features have large costs, costs which must be justified against their benefits. Keep in mind as well that there are opportunity costs; effort spent designing, specifying, implementing, debugging, testing, documenting, shipping and maintaining your proposed feature is effort not spent on features that people actually would use. – Eric Lippert Apr 13 '15 at 23:08
  • @SigTerm: To make auto-generated comparisons work properly with nested encapsulation, code which is performing comparisons on inner objects at behest of outer objects' comparison methods would need to know the purpose for which the code requesting outer-object comparisons is holding the outer-object references. Engineering such a feature into a framework from the start would probably not be very expensive, and would enable huge cost savings down the road, but since .NET has no such feature there's no way for auto-generated comparison methods to know how they should behave. – supercat Apr 13 '15 at 23:10
  • @supercat: "would need to know the purpose" Good point. I missed this part of the problem. – SigTerm Apr 13 '15 at 23:39
  • @SigTerm: I hope the "next big thing" OO framework includes a dual-mode equivalence method, since any object can answer "Will you always be equivalent to the object identified by this reference, no matter what anyone does to you?" and "Would you be equivalent to the object identified by this reference if neither of you would ever be modified *directly*?" Another way to think about things would be to say that two objects have permanent equivalence if one could arbitrarily replace any combination of references to one with references to the other without affecting their semantics... – supercat Apr 14 '15 at 15:43
  • ...and present equivalence if globally *swapping all references* to one with references to the other would not alter semantics, even though swapping only some references would. The fact that all objects in .NET and Java expose their identity (a design weakness, IMHO) means those definitions of equality may not work in cases involving external weak identity-hash tables, but I like them conceptually. – supercat Apr 14 '15 at 15:45
  • @supercat: IMO, identifying permanent equivalence is a more difficult problem. There are pretty much only two ways to ensure it: 1. make the object and all its members (recursively) constant/unchangeable; 2. make the initial object "reference" constant and forbid passing any references (directly/indirectly) outside of the class. The idea of permanent equivalence you mentioned kinda reminds me of functional programming (and the "function should have no side effects" idea) and the "predict program's output without compiling/running it" problem. – SigTerm Apr 14 '15 at 19:05
  • @SigTerm: It doesn't seem hard to me. An object is permanently equivalent to another if it is immutable, all nested objects whose *identity* is encapsulated in its equatable state report themselves permanently equivalent to their counterparts in the other object, and all nested objects whose *state* is encapsulated in its equatable state report themselves as being presently equivalent to their counterparts. – supercat Apr 14 '15 at 19:10
  • @supercat: It is hard in the sense that if you make ONE mistake anywhere, you lose guarantee of permanent equivalence. It is very closely related to the idea of having no side effects in functional programming. Basically, as long as object is simple and does not communicate with anything, ensuring permanent equivalence is fairly trivial. However, as soon as object starts communicating with something external and storing "handles", it all immediately goes to hell, because the external "black box" is probably doing something behind the scenes that breaks guarantee of equivalence. – SigTerm Apr 14 '15 at 19:16
  • @supercat: If you're interested in such an idea, it may indicate that you need to look at hardcore functional languages (those that do not allow reassignment) or at least try some "soft" functional programming in languages like Lisp. I'm not a fan of them, but some people find them enlightening. Having permanent equivalence is closely related to the concept of having no side effects in your function. As long as there are no side effects, you can guarantee permanent equivalence. As soon as side effects are introduced, the possibility of permanent equivalence becomes less than 100% (continued) – SigTerm Apr 14 '15 at 19:19
  • @SigTerm: The pattern doesn't work well with the style of "popsicle" immutability that uses the same object reference before and after freezing (as opposed to having the `Freeze` method return a new reference) but the former pattern is fraught with peril for other reasons anyhow, so I don't think that's a problem. – supercat Apr 14 '15 at 19:20
  • @supercat: (continued) ..less than 100%, meaning that relying on it is unsafe. (Since Murphy's law says your program will hit the situation when objects are not permanently equal while your program thinks they are). That's how I see the problem anyway. I'm very used to coding defensively and I have a C++ background, so my opinion may or may not be correct. – SigTerm Apr 14 '15 at 19:21
  • @supercat: "popsicle" I'm not very interested in permanent equivalence at the moment and just stated a few things that I thought might be of use to you, since you appear to be interested in the topic. Aside from that, thanks for the answer and have fun. – SigTerm Apr 14 '15 at 19:24