I need to perform simple set operations in LINQ (for example, Union, Except, and Intersect) on objects of this class:

class Person {
    public int Id { get; set; }
    public string Name { get; set; }

    public Person() { }

    public Person(int id, string name) {
        Id = id;
        Name = name;
    }
}

Comparer implementation:

class PersonComparer : IEqualityComparer<Person> {
    public bool Equals(Person x, Person y) {
        return x.Id == y.Id;
    }

    public int GetHashCode(Person p) {
        return p.GetHashCode();
    }
}

Populating lists:

var list1 = new List<Person>();
list1.Add(new Person(1, "John"));
list1.Add(new Person(2, "Peter"));
list1.Add(new Person(3, "Mike"));

var list2 = new List<Person>();
list2.Add(new Person(2, "Peter"));
list2.Add(new Person(3, "Mike"));
list2.Add(new Person(4, "Fred"));

var comparer = new PersonComparer();

var list3 = list1.Intersect(list2, comparer).ToList(); // empty list
var list4 = list1.Except(list2, comparer).ToList();    // "John", "Peter", "Mike"

It seems that my comparer does not work. Why?

Sergey Kalinichenko
Ivan Stelmakh

1 Answer

The problem is your implementation of GetHashCode(Person p). As noted in MSDN:

Implementations are required to ensure that if the Equals method returns true for two objects x and y, then the value returned by the GetHashCode method for x must equal the value returned for y.

In your case, p.GetHashCode() falls back to the default Object.GetHashCode(), which may return a different value for each Person instance in memory, even when two instances have the same Id. Two objects that your Equals method treats as equal can therefore have different hash codes, which violates the requirement quoted above. Intersect and Except compare hash codes before they call Equals, so the matching entries are never recognized as equal. Instead, use something like this:

public int GetHashCode(Person p) {
    return p.Id;
}
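
For reference, here is a minimal, self-contained sketch of the corrected comparer applied to the lists from the question. The null checks and the SetOperationsDemo class with its Names helper are my own additions for illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Person {
    public int Id { get; set; }
    public string Name { get; set; }

    public Person(int id, string name) {
        Id = id;
        Name = name;
    }
}

class PersonComparer : IEqualityComparer<Person> {
    // Two people are considered equal when their Ids match.
    // The null handling is an assumption added for robustness.
    public bool Equals(Person x, Person y) {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    // Hash on the same field that Equals compares.
    public int GetHashCode(Person p) {
        return p == null ? 0 : p.Id;
    }
}

static class SetOperationsDemo {
    // Hypothetical helper: joins the names so the results are easy to print.
    public static string Names(IEnumerable<Person> people) {
        return string.Join(", ", people.Select(p => p.Name));
    }

    public static void Main() {
        var list1 = new List<Person> {
            new Person(1, "John"), new Person(2, "Peter"), new Person(3, "Mike")
        };
        var list2 = new List<Person> {
            new Person(2, "Peter"), new Person(3, "Mike"), new Person(4, "Fred")
        };
        var comparer = new PersonComparer();

        Console.WriteLine(Names(list1.Intersect(list2, comparer))); // Peter, Mike
        Console.WriteLine(Names(list1.Except(list2, comparer)));    // John
        Console.WriteLine(Names(list1.Union(list2, comparer)));     // John, Peter, Mike, Fred
    }
}
```

With the hash code derived from Id, equal objects have equal hash codes, so the hash-based lookups inside the set operations find the matches and then confirm them with Equals.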
p.s.w.g
  • @Ivan Keep in mind that getting GetHashCode/Equals right is often quite complex if Person is an entity that is saved to a database: before saving, Id is 0, and after saving it is nonzero, so the hash code has changed. Any HashSet or Dictionary that already holds the object is then broken, because these collections "cache" the hash code internally. – xanatos Mar 03 '15 at 13:44
  • If I understood you correctly, I can avoid implementing IEqualityComparer by overriding ToString(), GetHashCode(), and Equals() in my Person class, and the set-operation methods will then work correctly with the default overload (without the IEqualityComparer parameter)? – Ivan Stelmakh Mar 03 '15 at 14:19
  • @IvanStelmakh In theory, yes, but you should observe *xanatos*'s caution. If the `Id` can change over the lifetime of the object (and it can, because the setter is public), then the object will not behave nicely in any collection that relies on the hash code. I'd recommend not modifying the default behavior of classes unless you know what you're doing. The `IEqualityComparer` route is usually better because it preserves the expected behavior of C# classes in all other cases. – p.s.w.g Mar 03 '15 at 14:25
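
The caution raised in these comments can be reproduced with a short sketch. It assumes an Id-based comparer like the one in the answer. The StaleHashDemo class and the value 42 are hypothetical, chosen only to imitate an Id being assigned after the object is already in a HashSet.

```csharp
using System;
using System.Collections.Generic;

class Person {
    public int Id { get; set; }
    public string Name { get; set; }

    public Person(int id, string name) {
        Id = id;
        Name = name;
    }
}

class PersonComparer : IEqualityComparer<Person> {
    public bool Equals(Person x, Person y) {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Id == y.Id;
    }

    public int GetHashCode(Person p) {
        return p == null ? 0 : p.Id;
    }
}

static class StaleHashDemo {
    // Returns whether the set finds the person before and after its Id changes.
    public static bool[] Run() {
        var set = new HashSet<Person>(new PersonComparer());
        var person = new Person(0, "John"); // Id is 0 before "saving"
        set.Add(person);                    // stored under the hash code for Id 0

        bool before = set.Contains(person);
        person.Id = 42;                     // hypothetical Id assigned on save
        bool after = set.Contains(person);  // lookup now uses the hash code for Id 42

        return new[] { before, after };
    }

    public static void Main() {
        bool[] result = Run();
        Console.WriteLine(result[0]); // True
        Console.WriteLine(result[1]); // False
    }
}
```

The same instance is still in the set, but the set can no longer find it, which is why a mutable Id is risky as the basis for a hash code.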