0

Say I have a list of Fruit objects. Fruit has a property Name, e.g. Fruit1.Name = "Apple", Fruit2.Name = "Orange", Fruit3.Name = "Apple", Fruit4.Name = "Melon", etc.

List<Fruit> Basket = {Fruit1, Fruit2, Fruit3, Fruit4, Fruit5, ..., Fruit100}

I want a list of unique fruits, where every fruit in the list has a unique name, and I want to optimize for time. I've seen some people do the following. Is this the best way?

public List<Fruit> GetUniqueFruits(List<Fruit> Basket)
{
    Dictionary<string, Fruit> tempUniqueFruits = new Dictionary<string, Fruit>();
    List<Fruit> uniqueFruits = new List<Fruit>();
    foreach(var fruit in Basket)
    {
        if (!tempUniqueFruits.ContainsKey(fruit.Name))
        {
            tempUniqueFruits.Add(fruit.Name, fruit);
            uniqueFruits.Add(fruit);
        }
    }
    return uniqueFruits;
}

I hear dictionary lookup is very fast, so I guess maybe that's why this is used, but I want to know if there is a better way.

Thanks Matt Burland, I fixed the typo. (I couldn't comment yet.)

Melissa Balle

4 Answers

2

You can use an IEqualityComparer to clarify the code.

public List<Fruit> GetUniqueFruits(List<Fruit> Basket) {
    var set = new HashSet<Fruit>(Basket, new FruitNameEqualityComparer());
    return set.ToList();
}

public class Fruit {
    public string Name { get; set; }
    public DateTime RipeTime { get; set; }
}

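class FruitNameEqualityComparer : IEqualityComparer<Fruit> {
    // Two fruits are considered equal when their names match.
    public bool Equals(Fruit a, Fruit b) {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return string.Equals(a.Name, b.Name);
    }

    // Must be consistent with Equals: equal names give equal hash codes.
    public int GetHashCode(Fruit f) {
        return f.Name == null ? 0 : f.Name.GetHashCode();
    }
}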

Dictionary<T, U> is best used when you are mapping from keys to values, but if you are only interested in maintaining a set of unique values without any mappings, a HashSet<T> is specifically designed for that purpose.

recursive
  • I agree, sets should generally be used for uniqueness, but an explanation of why `HashSet` instead of `Dictionary` might be nice – Sam Mar 27 '14 at 19:51
1

A dictionary enforces unique keys, not unique values. If you call Add with a key that already exists, it throws an ArgumentException. To get a value, you look it up by key; the dictionary finds it using a hash of the key, which is very fast (O(1) on average). To find an item in a list, you have to iterate over the list until you reach it, which can be slow because in the worst case you walk the whole list.
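
To make the difference concrete, here is a minimal sketch. The `LookupDemo` class and its method names are hypothetical and only for illustration; the calls themselves (`Dictionary.Add`, `ContainsKey`, `List.Contains`) are standard .NET.

```csharp
using System;
using System.Collections.Generic;

public static class LookupDemo
{
    // Returns true if adding the same key twice throws,
    // which is what Dictionary<TKey, TValue>.Add does for a duplicate key.
    public static bool AddingDuplicateKeyThrows()
    {
        var byName = new Dictionary<string, int>();
        byName.Add("Apple", 1);
        try
        {
            byName.Add("Apple", 2); // duplicate key
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }

    // Hash-based lookup: O(1) on average, regardless of the number of entries.
    public static bool InDictionary(Dictionary<string, int> byName, string name)
    {
        return byName.ContainsKey(name);
    }

    // Linear search: List<T>.Contains walks the list until it finds a match.
    public static bool InList(List<string> names, string name)
    {
        return names.Contains(name);
    }
}
```

If you would rather not handle the exception, checking `ContainsKey` before `Add`, as the code in the question does, avoids it.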

PCG
1

A shorter way would be:

return Basket.GroupBy(f => f.Name).Select(grp => grp.First()).ToList();

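With LINQ to Objects, GroupBy yields groups in order of first appearance and keeps source order within each group, so this keeps the first item in Basket with the given name. Other LINQ providers (for example, a database-backed one) might not.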
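
A minimal sketch for checking which item the one-liner keeps with an in-memory list. The `Id` property and the `GroupByDemo` class are hypothetical additions, there only to tell two fruits with the same name apart; `Fruit` is redeclared so the snippet compiles on its own.

```csharp
using System.Collections.Generic;
using System.Linq;

public class Fruit
{
    public string Name { get; set; }
    public int Id { get; set; } // hypothetical, only to distinguish duplicates
}

public static class GroupByDemo
{
    // Groups by name and takes the first fruit of each group.
    public static List<Fruit> GetUniqueFruits(IEnumerable<Fruit> basket)
    {
        return basket.GroupBy(f => f.Name).Select(grp => grp.First()).ToList();
    }
}
```

With a basket of Apple (Id 1), Orange (Id 2), Apple (Id 3), Melon (Id 4), this returns the fruits with Ids 1, 2 and 4, in that order.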

Lee
1

If the names are the unique part of the object (i.e. the key) and the order of the items isn't important, then a Dictionary<string, Fruit> is a perfectly valid way to store them. Another option would be a HashSet<Fruit>, but then you'd need to override Equals and GetHashCode in your Fruit class (or create an IEqualityComparer<Fruit>).

For your specific code, there are LINQ statements you can use (like Lee's) that are efficient. Also, you don't need to build a list of unique items at the same time as you build your dictionary (unless the order is important), because you can return tempUniqueFruits.Values.ToList().

Also, if you want to build the list of unique items (to preserve the order), then since you are not actually using the values in the dictionary, just the keys, you could use a HashSet<string> instead.
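
A minimal sketch of that last variant, assuming only that Fruit has a string Name as in the question. The `FruitFilter` class name is hypothetical, and `Fruit` is redeclared so the snippet compiles on its own.

```csharp
using System.Collections.Generic;

public class Fruit
{
    public string Name { get; set; }
}

public static class FruitFilter
{
    // Keeps the first fruit seen for each name, in the original order.
    public static List<Fruit> GetUniqueFruits(IEnumerable<Fruit> basket)
    {
        var seenNames = new HashSet<string>();
        var uniqueFruits = new List<Fruit>();
        foreach (var fruit in basket)
        {
            // HashSet<T>.Add returns false when the name is already present,
            // so the membership check and the insert are a single call.
            if (seenNames.Add(fruit.Name))
            {
                uniqueFruits.Add(fruit);
            }
        }
        return uniqueFruits;
    }
}
```

This is still one pass over the basket with a hash lookup per item, the same as the dictionary version, but it only stores the names rather than a name-to-fruit mapping that is never read.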

Matt Burland