2

I have came with solution to remove duplicates from generic list<T> in .NET 2.0 as follows:

List<CaseStudy> caseStudies = CaseStudyDAO.FindCaseStudiesByDate(DateTime.Now.Date, DateTime.Now.Date.AddDays(1));
caseStudies.RemoveAll(
        delegate(CaseStudy c)
        {
            return caseStudies.IndexOf(c) != caseStudies.FindIndex(
                delegate(CaseStudy f) { return c.Str == f.Str; });
        });

My questions are:

Is there more efficient way to do this? Only .NET 2.0 solution
What is the complexity of the above solution?

Thanks,
jan2k10

Mariusz Miesiak
  • 638
  • 1
  • 6
  • 19
  • duplicate: http://stackoverflow.com/questions/344519/select-distinct-from-a-list-of-ienumerablet-in-net-2-0 –  Apr 07 '10 at 17:52
  • the question was about specific solution – Mariusz Miesiak Apr 07 '10 at 18:02
  • sry, the answers on that question contained complexity and most efficient way - nearly all the things you asked for... you should maybe restate your question to only asking for the complexity of your solution –  Apr 07 '10 at 18:05

2 Answers2

12

The time complexity of RemoveAll is O(n). The time complexity of the indexing is O(n), so that's a grand total of O(n^2) time complexity. The space complexity is, I think, O(1).

Is there a more efficient way to do it? Yes. You can do it in O(n) time complexity provided you're willing to spend more space on it.

Eric Lippert
  • 647,829
  • 179
  • 1,238
  • 2,067
  • Plus for explaining complexity – Mariusz Miesiak Apr 07 '10 at 17:50
  • Yes, the space complexity of RemoveAll is O(1). As Eric suggests, if you are willing to use more space, you can build a `Dictionary` of the items in the list (use the item as both the key and value, or something similar). Or look for a pre-existing `Set` implementation that people have built. – Neil Apr 07 '10 at 18:01
5

Just to expand on Eric's comment about O(n) time if you're happy to use more space, I'd do something like this:

Dictionary<string, CaseStudy> lookup = new Dictionary<string, CaseStudy>();
foreach (CaseStudy cs in caseStudies)
{
    lookup[cs.Str] = cs;
}
caseStudies = new List<CaseStudy>(lookup.Values);

A couple of notes:

  • This changes the value of caseStudies to refer to a new list. If you wanted it to be within the same List<T>, you could use:

    caseStudies.Clear();
    caseStudies.AddRange(lookup.Values);
    
  • This keeps the last element in the list with each distinct Str value. That was just to make it as short as possible. If you want the first element, use:

    foreach (CaseStudy cs in caseStudies)
    {
        if (!lookup.ContainsKey(cs.Str))
        {
            lookup[cs.Str] = cs;
        }
    }
    
Jon Skeet
  • 1,421,763
  • 867
  • 9,128
  • 9,194