HashSet<ReadOnlyCollection<int>> test1 = new HashSet<ReadOnlyCollection<int>> ();
for (int i = 0; i < 10; i++) {
    List<int> temp = new List<int> ();
    for (int j = 1; j < 2; j++) {
        temp.Add (i);
        temp.Add (j);
    }
    test1.Add (temp.AsReadOnly ());
}

Here test1 is {[0,1], [1,1], [2,1], [3,1], [4,1], [5,1], [6,1], [7,1], [8,1], [9,1]}

HashSet<ReadOnlyCollection<int>> test2 = new HashSet<ReadOnlyCollection<int>> ();
for (int i = 5; i < 10; i++) {
    List<int> temp = new List<int> ();
    for (int j = 1; j < 2; j++) {
        temp.Add (i);
        temp.Add (j);
    }
    test2.Add (temp.AsReadOnly ());
}

Here test2 is {[5,1], [6,1], [7,1], [8,1], [9,1]}

test1.ExceptWith(test2);

After doing this, I want test1 to be {[0,1], [1,1], [2,1], [3,1], [4,1]}, but it still contains all ten original elements.
How can I fix this problem? Or is there another way to do the same thing? Thank you!

lynkewsw
  • Short answer : you could define a custom EqualityComparer for your collections. Currently, they are compared by their instance (as objects), every new HashSet or List is different, even if they contain the same elements. – Pac0 Mar 16 '20 at 22:10
  • Do you understand WHY it is acting like that? – mjwills Mar 16 '20 at 22:11
  • @Pac0 Do you mean I should define a class to contain the values and override its Equals and GetHashCode methods? Thank you! – lynkewsw Mar 16 '20 at 22:15
  • @mjwills My guess is that when C# hashes the same ReadOnlyCollection, say [1,2], it returns different hash values for collections with the same values, but I don't know how to fix that. Thank you for your reply! – lynkewsw Mar 16 '20 at 22:17
  • If this is the data you actually want to work with, make it a Tuple instead. – gnud Mar 16 '20 at 22:19
  • Hey @gnud, that is a good idea, but if the length of the tuple is not constant, like I may have {(1,2,3), (4,5,6)} and {(1,2), (3,4)} as the return values, how should I declare the type of return when writing the function? Thank you! – lynkewsw Mar 16 '20 at 22:30

2 Answers

Objects in C# are usually compared by reference, not by value. This means that new object() != new object(). In the same way, new List<int>() { 1 } != new List<int>() { 1 }. Structs and primitives, on the other hand, are compared by value, not by reference.

Some objects override their equality method to compare values instead. For example strings: new string(new[] { 'a', 'b', 'c'}) == "abc", even if object.ReferenceEquals(new string(new[] { 'a', 'b', 'c'}), "abc") == false.

But collections, lists, arrays etc. do not. For good reason - when comparing two lists of ints, what do you want to compare? The exact elements, regardless of order? The exact elements in order? The sum of elements? There's not one answer that fits everything. And often you might actually want to check if you have the same object.
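
To make that ambiguity concrete, here is a minimal sketch of four different notions of "equal" applied to the same two lists. The class and method names (EqualityFlavors, SameReference, and so on) are made up for illustration; only the framework calls (ReferenceEquals, SequenceEqual, SetEquals, Sum) are real APIs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: each method is one possible meaning of "these lists are equal".
public static class EqualityFlavors
{
    // Same object in memory (what List<T>.Equals does by default).
    public static bool SameReference(List<int> a, List<int> b) => ReferenceEquals(a, b);

    // Same elements in the same order.
    public static bool SameOrder(List<int> a, List<int> b) => a.SequenceEqual(b);

    // Same distinct elements, ignoring order and duplicates.
    public static bool SameElements(List<int> a, List<int> b) => new HashSet<int>(a).SetEquals(b);

    // Same total, ignoring everything else.
    public static bool SameSum(List<int> a, List<int> b) => a.Sum() == b.Sum();

    public static void Main()
    {
        var a = new List<int> { 1, 2, 3 };
        var b = new List<int> { 3, 2, 1 };

        Console.WriteLine(SameReference(a, b)); // False
        Console.WriteLine(SameOrder(a, b));     // False
        Console.WriteLine(SameElements(a, b));  // True
        Console.WriteLine(SameSum(a, b));       // True
    }
}
```

Which of these is "right" depends entirely on what the collection represents, which is why the framework leaves the choice to you.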

When working with collections or LINQ, you can often specify a custom 'comparer' that handles comparisons the way you want. The collection methods then use this 'comparer' whenever they need to compare two elements.

A very simple comparer that works on a ReadOnlyCollection<T> might look like this:

using System.Collections.Generic;
using System.Linq;

class ROCollectionComparer<T> : IEqualityComparer<IReadOnlyCollection<T>>
{
    private readonly IEqualityComparer<T> elementComparer;

    public ROCollectionComparer() : this(EqualityComparer<T>.Default) {}
    public ROCollectionComparer(IEqualityComparer<T> elementComparer) {
        this.elementComparer = elementComparer;
    }

    public bool Equals(IReadOnlyCollection<T> x, IReadOnlyCollection<T> y)
    {
        if (x == null && y == null) return true;
        if (x == null || y == null) return false;
        if (object.ReferenceEquals(x, y)) return true;

        // Equal when both hold the same number of elements, pairwise equal and in the same order.
        return x.Count == y.Count &&
            x.SequenceEqual(y, elementComparer);
    }

    public int GetHashCode(IReadOnlyCollection<T> obj)
    {
        // Simplistic implementation: hashes only the count and the first element.
        // Collections that differ later on will collide, but it should be OK-ish when just looking for equality.
        return (obj.Count, obj.Count == 0 ? 0 : elementComparer.GetHashCode(obj.First())).GetHashCode();
    }
}

And then you can compare the behavior of the default equality check, and your custom one:

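// Default comparer: arrays are compared by reference, so nothing is removed.
var std = new HashSet<int[]>(new[] { new[] { 1, 2 }, new[] { 2, 2 } });
std.ExceptWith(new[] { new[] { 2, 2 } });
std.Dump(); // Dump() is LINQPad's output helper; std still holds both arrays

// Custom comparer: arrays are compared by content, so { 2, 2 } is removed.
var custom = new HashSet<int[]>(new[] { new[] { 1, 2 }, new[] { 2, 2 } }, new ROCollectionComparer<int>());
custom.ExceptWith(new[] { new[] { 2, 2 } });
custom.ExceptWith(new[] { new int[] { } }); // an empty array matches nothing in the set
custom.Dump(); // only { 1, 2 } remains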

You can test the whole thing in this fiddle.
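
Applied to the sets from the question, the same idea might look like the sketch below. It is not the code from the fiddle: SequenceComparer and ExceptWithDemo are hypothetical names, and the comparer hashes every element (rather than only the first) using System.HashCode, which assumes .NET Core 2.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical comparer: two collections are equal when they hold
// equal elements in the same order.
public sealed class SequenceComparer<T> : IEqualityComparer<IReadOnlyCollection<T>>
{
    public bool Equals(IReadOnlyCollection<T> x, IReadOnlyCollection<T> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Count == y.Count && x.SequenceEqual(y);
    }

    public int GetHashCode(IReadOnlyCollection<T> obj)
    {
        if (obj is null) return 0;
        // Combine the hash of every element, in order.
        var hash = new HashCode();
        foreach (T item in obj) hash.Add(item);
        return hash.ToHashCode();
    }
}

public static class ExceptWithDemo
{
    // Builds {[from,1], ..., [to-1,1]}, like the loops in the question,
    // but passes the content-based comparer to the HashSet constructor.
    public static HashSet<ReadOnlyCollection<int>> Build(int from, int to)
    {
        var set = new HashSet<ReadOnlyCollection<int>>(new SequenceComparer<int>());
        for (int i = from; i < to; i++)
            set.Add(new List<int> { i, 1 }.AsReadOnly());
        return set;
    }

    public static void Main()
    {
        var test1 = Build(0, 10);
        var test2 = Build(5, 10);

        test1.ExceptWith(test2);

        Console.WriteLine(test1.Count); // 5
        Console.WriteLine(string.Join(", ", test1.Select(c => "[" + string.Join(",", c) + "]")));
    }
}
```

The comparer can be passed where a comparer for ReadOnlyCollection<int> is expected because IEqualityComparer<T> is contravariant and ReadOnlyCollection<int> implements IReadOnlyCollection<int>. Note that ExceptWith uses the comparer of the set it is called on, so test1 is the set that must be constructed with it.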

gnud

Here you have the implementation of ExceptWith:

https://github.com/microsoft/referencesource/blob/3b1eaf5203992df69de44c783a3eda37d3d4cd10/System.Core/System/Collections/Generic/HashSet.cs#L532

What it actually does is:

 // remove every element in other from this
 foreach (T element in other) {
    Remove(element);
 }

And Remove implementation:

https://github.com/microsoft/referencesource/blob/3b1eaf5203992df69de44c783a3eda37d3d4cd10/System.Core/System/Collections/Generic/HashSet.cs#L287

 if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) {

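So if the hash codes differ, the comparer's Equals is never even called and Remove does nothing. With the default comparer, a ReadOnlyCollection's hash code comes from the object instance, not from its contents.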

A small test to prove that hashcode is not the same:

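    // Requires: using System.Linq; (for First())
    List<int> temp = new List<int> ();
    temp.Add (1);
    temp.Add (2);

    HashSet<ReadOnlyCollection<int>> test1 = new HashSet<ReadOnlyCollection<int>> ();
    HashSet<ReadOnlyCollection<int>> test2 = new HashSet<ReadOnlyCollection<int>> ();
    // Each AsReadOnly() call creates a new wrapper object around the same list.
    test1.Add (temp.AsReadOnly ());
    test2.Add (temp.AsReadOnly ());

    // Typically prints False: the two wrappers are different instances.
    Console.WriteLine (test1.First ().GetHashCode () == test2.First ().GetHashCode ());
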
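
The effect on Remove itself can be checked the same way. This is a minimal sketch with a hypothetical RemoveDemo class; it uses the default comparer, so a value-equal copy is not found while the stored instance is.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class RemoveDemo
{
    // Returns what Remove reports for a value-equal copy and for the stored instance.
    public static (bool RemovedCopy, bool RemovedSame) Run()
    {
        var source = new List<int> { 1, 2 };
        ReadOnlyCollection<int> stored = source.AsReadOnly();
        ReadOnlyCollection<int> copy = source.AsReadOnly(); // new wrapper, same contents

        // No comparer passed, so the set compares elements by reference.
        var set = new HashSet<ReadOnlyCollection<int>> { stored };

        bool removedCopy = set.Remove(copy);   // different instance: no match
        bool removedSame = set.Remove(stored); // same instance: match
        return (removedCopy, removedSame);
    }

    public static void Main()
    {
        var (removedCopy, removedSame) = Run();
        Console.WriteLine(removedCopy); // False
        Console.WriteLine(removedSame); // True
    }
}
```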
Carlos Garcia
  • Thank you for the links to the source code and the examples! Now I understand why it doesn't work! – lynkewsw Mar 16 '20 at 23:01