
I'm trying to store (name: string, value: long) pairs in a set.

public class NameValuePair
{
  public string name;
  public long value;
}

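// Declared static so that the second field initializer is allowed to reference the first.
public static NameValuePairComparer comparer = new NameValuePairComparer();
public static HashSet<NameValuePair> nameValueSet = new HashSet<NameValuePair>(comparer);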

Two pairs are equal if either they have equal name or equal value. This is implemented in NameValuePairComparer, which overrides the Equals method from EqualityComparer:

public class NameValuePairComparer : EqualityComparer<NameValuePair>
{
   public override bool Equals(NameValuePair x, NameValuePair y)
   {
      return (x.value == y.value) || (x.name == y.name);
   }

The problem is that GetHashCode(NameValuePair obj) must return the same value for any two objects for which Equals returns true. So for a given NameValuePair, GetHashCode() should return either value.GetHashCode() or name.GetHashCode(), but to choose between them we would have to know which field the two pairs have in common:

   public override int GetHashCode(NameValuePair obj)
   {
      /* ??? */
      /* // Using unknown reference to x
        if (obj.value == x.value) return obj.value.GetHashCode();
        else if (obj.name == x.name) return obj.name.GetHashCode();
        else return base.GetHashCode(obj);
      */
   }
}

But we can't know this, because GetHashCode only sees one object. That means I can't use a HashSet, or an EqualityComparer at all, to store these pairs.

Q: Is there a non-hash-based implementation of a set in C# (.NET 3.5)?

Q: What would be a better approach to storing unique NameValuePairs with a custom equality comparer?

Mariusz Ceier

1 Answer


Two pairs are equal if either they have equal name or equal value

You fundamentally can't implement IEqualityComparer<T> correctly with these criteria. From the documentation of Equals:

The Equals method is reflexive, symmetric, and transitive. That is, it returns true if used to compare an object with itself; true for two objects x and y if it is true for y and x; and true for two objects x and z if it is true for x and y and also true for y and z.

Now consider pairs:

x = { "A", 10 },
y = { "A", 20 },
z = { "B", 20 }

You're saying that x and y must be equal as they have the same name, and y and z must be equal as they have the same value. That means (by transitivity) that x and z should be equal. But x and z share neither a name nor a value, so your Equals returns false for them.
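The three pairs above can be checked directly against the comparer from the question. This is only a minimal sketch: the constant GetHashCode is an assumption added so the class compiles, and TransitivityDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public class NameValuePair
{
    public string name;
    public long value;
}

// The "equal name OR equal value" rule from the question.
public class NameValuePairComparer : EqualityComparer<NameValuePair>
{
    public override bool Equals(NameValuePair x, NameValuePair y)
    {
        return (x.value == y.value) || (x.name == y.name);
    }

    // Assumption: a constant hash, present only so the class compiles.
    public override int GetHashCode(NameValuePair obj)
    {
        return 0;
    }
}

public static class TransitivityDemo
{
    public static void Main()
    {
        var comparer = new NameValuePairComparer();
        var x = new NameValuePair { name = "A", value = 10 };
        var y = new NameValuePair { name = "A", value = 20 };
        var z = new NameValuePair { name = "B", value = 20 };

        Console.WriteLine(comparer.Equals(x, y)); // True: same name
        Console.WriteLine(comparer.Equals(y, z)); // True: same value
        Console.WriteLine(comparer.Equals(x, z)); // False: transitivity is broken
    }
}
```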

As you can't implement IEqualityComparer<T> correctly, you shouldn't expect anything which relies on that correctness to work.

I suspect you'll find that if you look at your requirements in more detail, they either really call for two collections (one by name, one by value) or they don't make sense in the light of transitivity.
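One way the "two collections" idea might look is sketched below, assuming the goal is to reject any pair whose name or value is already taken. UniquePairSet and TryAdd are hypothetical names, not framework types, and the sketch assumes name is never null. It works on .NET 3.5 because it only needs Dictionary.

```csharp
using System;
using System.Collections.Generic;

public class NameValuePair
{
    public string name;
    public long value;
}

// Hypothetical helper: one index per field instead of one set with an "OR" comparer.
public class UniquePairSet
{
    private readonly Dictionary<string, NameValuePair> byName =
        new Dictionary<string, NameValuePair>();
    private readonly Dictionary<long, NameValuePair> byValue =
        new Dictionary<long, NameValuePair>();

    public int Count
    {
        get { return byName.Count; }
    }

    // Adds the pair only if neither its name nor its value is already stored.
    public bool TryAdd(NameValuePair pair)
    {
        if (byName.ContainsKey(pair.name) || byValue.ContainsKey(pair.value))
        {
            return false;
        }
        byName.Add(pair.name, pair);
        byValue.Add(pair.value, pair);
        return true;
    }
}

public static class TwoIndexDemo
{
    public static void Main()
    {
        var set = new UniquePairSet();
        Console.WriteLine(set.TryAdd(new NameValuePair { name = "A", value = 10 })); // True
        Console.WriteLine(set.TryAdd(new NameValuePair { name = "A", value = 20 })); // False: name "A" is taken
        Console.WriteLine(set.TryAdd(new NameValuePair { name = "B", value = 20 })); // True: value 20 was never stored
        Console.WriteLine(set.Count); // 2
    }
}
```

Each lookup uses an ordinary, transitive equality (string equality for names, numeric equality for values), so no custom comparer is needed. The result still depends on the order in which pairs are offered.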

For example, imagine you had a set with the characteristics you've suggested, and you add the three elements above. If you add them in the order { x, y, z } you'd end up with a single entry. If you add them in the order { z, x, y } you'd end up with two. How is that a useful kind of set?

Jon Skeet
  • @Jon: +1. Seriously, how do you think at such lightning speed? :) – TalentTuner Mar 18 '13 at 10:43
  • Transitivity is not important here. In your example if there's x in collection, I can't allow adding 'y' to it. If there's 'y' in collection I can't allow adding 'z' nor 'x'. So I thought set with custom equality comparer would be the best solution. – Mariusz Ceier Mar 18 '13 at 10:45
  • If I add them in {x,y,z} or {z,x,y} order, I will end up with either {x,z} or {z,x}. – Mariusz Ceier Mar 18 '13 at 10:50
  • 1
    @MariuszCeier: Okay, so you're not doing replacement: so add them in { y, x, z } order and you only get a single entry. Again, how is that a good thing? Transitivity may not be important to *you*, but it's a requirement for the equality relation represented in `IEqualityComparer`, so you shouldn't implement that interface. Given that you have odd requirements, you shouldn't expect them to be fulfilled in the framework. – Jon Skeet Mar 18 '13 at 10:56
  • Ending with a single entry when adding them in {y,x,z} order is a good thing, because name and value in a pair are in a 1-1 relation. Consider an enum-like type: each entry in this type is a name-value pair. These entries must have a unique name and a unique value within the type, so when 'y' is already in the type, adding 'x' or 'z' should return an error (name conflict or value conflict). I don't have to use IEqualityComparer; I just need a set collection that would let me specify an equality comparer without defining a hash code, something like std::set from C++. – Mariusz Ceier Mar 18 '13 at 11:08
  • @MariuszCeier: No, you need a set collection that allows you to specify an equality comparer without transitivity. You could provide a GetHashCode implementation which always returned a constant value - that's always a valid implementation - but it wouldn't help you here. (How would you do this in C++ where the comparison operator needs to provide ordering, anyway?) – Jon Skeet Mar 18 '13 at 11:15
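
The constant-hash idea from the last comment can be tried out as a rough sketch. ConstantHashComparer, OrderDemo and CountAfterAdding are hypothetical names. Because the comparer breaks the transitivity contract, the behaviour shown is what the current HashSet implementation happens to do, not something the documentation promises.

```csharp
using System;
using System.Collections.Generic;

public class NameValuePair
{
    public string name;
    public long value;
}

// Same "name OR value" rule as in the question, paired with a constant hash code.
public class ConstantHashComparer : EqualityComparer<NameValuePair>
{
    public override bool Equals(NameValuePair x, NameValuePair y)
    {
        return (x.value == y.value) || (x.name == y.name);
    }

    // Every pair lands in the same bucket, so Add compares against all stored items.
    public override int GetHashCode(NameValuePair obj)
    {
        return 0;
    }
}

public static class OrderDemo
{
    // Adds the pairs in the given order and reports how many were kept.
    public static int CountAfterAdding(params NameValuePair[] pairs)
    {
        var set = new HashSet<NameValuePair>(new ConstantHashComparer());
        foreach (var pair in pairs)
        {
            set.Add(pair);
        }
        return set.Count;
    }

    public static void Main()
    {
        var x = new NameValuePair { name = "A", value = 10 };
        var y = new NameValuePair { name = "A", value = 20 };
        var z = new NameValuePair { name = "B", value = 20 };

        Console.WriteLine(CountAfterAdding(x, y, z)); // 2: y is rejected, x and z are kept
        Console.WriteLine(CountAfterAdding(y, x, z)); // 1: x and z both conflict with y
    }
}
```

The same three pairs produce sets of different sizes depending only on insertion order, and every Add degrades to a linear scan.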