6

I have data I receive from a web service via HTTPWebRequest. After I parse it using NewtonSoft.Deserialize into a custom type (a simple class with public string properties), I want to manipulate this data using LINQ - more specifically, I want to group the data.

My problem is that the grouping works fine if I group by a single string property

from x in myList
group x by x.myStr into grp
select grp;

Since I want to group by more columns, I am returning a custom type with

new MyType { a = ..., b = ... }

The group is however not working. I thought the reason must be the compiler does not know how to compare these objects - so if this type implements IEqualityComparer<MyType> it will solve it.

But no, it is still not grouping accordingly, and it creates several keys with the exact same string values.

The custom type by which I am grouping is something like

public class MyType
{
    public string a;
    public string b;
    public string c;
}

Any ideas of what am I missing?

Here's a concrete example of the scenario described above:

//The type that models the data returned from the web service
public class MyClass
{
    public string a { get; set; }

    public string b { get; set; }

    public string c { get; set; }

    public DateTime d { get; set; }

    public DateTime e { get; set; }
}

// the type by which I want to group my data
public class MyGroup : IEquatable<MyGroup>, IEqualityComparer<MyGroup>
{
    public string f1 { get; set; }

    public DateTime d1 { get; set; }

    public DateTime d2 { get; set; }

    public bool Equals(MyGroup other)
    {
        return string.Compare(this.f1, other.f1) == 0;
    }

    public bool Equals(MyGroup x, MyGroup y)
    {
        return string.Compare(x.f1, y.f1) == 0;
    }

    public int GetHashCode(MyGroup obj)
    {
        return obj.GetHashCode();
    }
}    
    List<MyClass> l = new List<MyClass>();
    l.Add(new MyClass { a = "aaa", b = "bbb", c = "ccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaa", b = "bbb", c = "ccc", d = DateTime.ParseExact("20090105", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aa", b = "bbbb", c = "cccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaa", b = "bbbbb", c = "ccc", d = DateTime.ParseExact("20121111", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaaa", b = "bbb", c = "ccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaa", b = "bbbbb", c = "ccc", d = DateTime.ParseExact("20121111", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140101", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaa", b = "bbbb", c = "cccccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaaaa", b = "bbb", c = "cccc", d = DateTime.ParseExact("20090105", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140301", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });
    l.Add(new MyClass { a = "aaa", b = "bbb", c = "cccc", d = DateTime.ParseExact("20081405", "yyyyddMM", Thread.CurrentThread.CurrentCulture), e = DateTime.ParseExact("20140201", "yyyyddMM", Thread.CurrentThread.CurrentCulture) });

    //The following does not really group
    //IEnumerable<IGrouping<MyGroup, MyClass>> r = from x in l
    IEnumerable<IGrouping<string, MyClass>> r = from x in l
                                                //group x by new MyGroup { f1 = x.a /*, d1 = x.d, d2 = x.e*/ } into grp
                                                orderby x.a
                                                group x by x.a into grp
                                                select grp;

    //foreach (IGrouping<MyGroup, MyClass> g in r)
    foreach (IGrouping<string, MyClass> g in r)
    {
        //Console.WriteLine(g.Key.f1);

        Console.WriteLine(g.Key);
    }
ekad
  • 14,436
  • 26
  • 44
  • 46
Veverke
  • 9,208
  • 4
  • 51
  • 95

3 Answers3

6

I thought the reason must be the compiler does not know how to compare these objects - so if this type implements IEqualityComparer<MyType> it will solve it.

Actually, to use a custom "equality" check in Linq functions you need to implement IEquatable<T>. IEquatable<T> is used to compare an instance of an object with another object of the same type - while IEqualityProvider<T> is meant to be implemented by an external class to compare two arbitrary Ts (and/or to have multiple methods of determining "equality").

Note that you should also implement Object.Equals and Object.GetHashCode - IEquatable<T> just allows you to compare in a type-safe manner.

Why the need for overriding Object's Equals and GetHashCode?

To ensure that any method (Object.Equals(object), the static Object.Equals(object, object, etc.) used to compare two objects is consistent. And any time you override Equals, you should also override GetHashCode to ensure that objects can be properly stored in a hash-based collection like a Dictionary or HashSet.

What does it mean IEquitable only compares in a type-safe manner?

When using IEquatable<T>, the object you're comparing to is guaranteed to be a T (or a subtype of T), whereas with Object.Equals, you don't know the type of the other object and must check it's type first.

For example:

// IEquatable<T>.Equals()
public bool Equals(MyGroup other)
{
    return string.Compare(this.f1, other.f1) == 0;
}

versus

// Object.Equals()
public bool Equals(object other)
{
    // need to check the type of the passed in object
    MyGroup grp = other as MyGroup;

    // other is not a MyGroup
    if(grp == null return false);        

    return string.Compare(this.f1, grp.f1) == 0;

    // you could also use
    //    return this.Equals(grp);
    // as a shortcut to reuse the same "equality" logic
}
D Stanley
  • 149,601
  • 11
  • 178
  • 240
  • Indeed, this is the solution, D. Stanley. Could you clarify why the need for overriding Object's Equals and GetHashCode? What does it mean IEquitable only compares in a type-safe manner? The fact that without the overrides, only with IEquitable's implementation it does not work - means I am working type-unsafe? Why? After all I am not working with anonymous types - on the contrary - I took the time to declare each and every type I am working with... – Veverke Jul 16 '14 at 10:01
  • @Veverke Each time a new entry in `myList` is reached, we group by a `new MyType { a = ..., ... }`. Linq will have to decide if this `new MyType` is the "same" as an earlier encountered `MyType`. For that it uses the `Equals` and `GetHashCode` methods to see if they are "the same". If you do not override the methods, a `new` instance of `MyType` is "not equal" to another instance, even if bot `a`, `b` and `c` look the same. – Jeppe Stig Nielsen Jul 16 '14 at 12:11
  • @Veverke I have added my answers to your follow-up questions. – D Stanley Jul 16 '14 at 12:51
  • @Jeppe Stig Nielsen ok, but if you say "if you do not override the methods, a new instance of MyType is not equal to another instance..." - what is the purpose then of implementing IEquitable? I would think it was for this very reason... – Veverke Jul 16 '14 at 17:35
  • `IEquatable.Equals` will be faster than `object.Equals` since it does not need to cast or box/unbox the inputs to determine equality. The output for IEquatable.Equals` and `object.Equals` should be consistent. – D Stanley Jul 16 '14 at 17:38
  • So, summing up the lesson learned - every time we implement interfaces that generate implementations of Equals and/or GetHashCode - we are required to override both Object's static methods? – Veverke Jul 16 '14 at 18:02
  • 2
    More accurately - if you implement `IEquatable` you must also override `object.Equals(object)`. If you override `object.Equals(object)` you must also override `object.GetHashCode()` (and vice-versa). Implementing `IEquatable` is optional but can improve performance. You can't override the static `Object.Equals(object, object)` method (but it calls `object.equals(object)` on the left instance). `IEqualityComparer` is designed to be implemented on a class ohter than `T` (think of it as an "alternate" method of defining equality). – D Stanley Jul 16 '14 at 18:07
  • @DStanley: I was building up on your answer till the point of "the static Object.Equals" - I don't see where/how Object is a static class. More importantly: based on what you write in the beginning - can I thus infer that it would suffice in this case to override Object.Equals (and in turn override GetHashCode as well) ? I mean, implementing IEquitable solely confers me performance enhacement? Or is it part of the requirement for grouping by custom types? – Veverke Jul 16 '14 at 18:46
  • @Veverke `object` is not a static class but `object.Equals(object, object)` is a static method and thus can't be overridden. `IEquatable` is not _required_ but is easier to code and avoids casting for reference types and boxing/unboxing for value types. – D Stanley Jul 16 '14 at 19:10
3

Any ideas of what am I missing?

Something like:

public class MyType : IEquatable<MyType>
{
  public string a;
  public string b;
  public string c;

  public bool Equals(MyType other)
  {
    if (other == null)
      return false;

    if (GetType() != other.GetType()) // can be omitted if you mark the CLASS as sealed
      return false;

    return a == other.a && b == other.b && c == other.c;
  }

  public override bool Equals(object obj)
  {
    return Equals(obj as MyType);
  }

  public override int GetHashCode()
  {
    int hash = 0;
    if (a != null)
      hash ^= a.GetHashCode();
    if (b != null)
      hash ^= b.GetHashCode();
    if (c != null)
      hash ^= c.GetHashCode();
    return hash;
  }
}

Addition: Note that MyType above is mutable, and the hash code changes if one of the fields a, b and c are re-assigned. That is problematic if the re-assignment happens while the instance is being held in a Dictionary<MyType, whatever>, HashSet<MyType> etc.


Alternatively, you could "group by" an anonymous type as suggested in DavidG's answer, or "group by" Tuple.Create(.. , .. , ..).

Jeppe Stig Nielsen
  • 60,409
  • 11
  • 110
  • 181
1

Do your grouping using an anonymous type. The individual values themselves appear to be strings (i.e. simple types that already have built in comparers) so you don't need a custom type for anything other than the return. this way there is no need to worry about IEqualityComparer.

from x in myList
group x by new { x.myStr, x.otherStr, x.AnotherStr } into grp
select new MyTpe 
{
    a = grp.Key.myStr, 
    b = grp.Key.otherStr, 
    c = grp.Key.AnotherStr
};
DavidG
  • 113,891
  • 12
  • 217
  • 223
  • For me it will be a considerable drawback not to have compile check/intelisense in further manipulations in the group results... so I'd rather define my grouping type. – Veverke Jul 16 '14 at 10:06
  • What makes you think none of that is available? – DavidG Jul 16 '14 at 10:08
  • oops, I misunderstood - I thought you were selecting from the anonymous type, not from MyType - meaning the resulting IEnumerable collection of IGrouping<> will have compile time check/intelisense. I thought you were returning an anonymous type. – Veverke Jul 16 '14 at 10:10
  • 1
    Ah I see. Though selecting the anonymous type still gives you compile checks and intellisense. They are still real types, think of them as types that you don't know the name of. – DavidG Jul 16 '14 at 10:18
  • I see. Thanks for the insights, David. – Veverke Jul 16 '14 at 17:31