
Is there a collection in C# that guarantees that it will only hold unique elements? I've read about HashSet, but that collection can still end up containing duplicates. Here is my code:

public class Bean
{
    public string Name { get; set; }

    public int Id { get; set; }

    public override bool Equals(object obj)
    {
        var bean = obj as Bean;

        if (bean == null)
        {
            return false;
        }

        return this.Name.Equals(bean.Name) && this.Id == bean.Id;
    }

    public override int GetHashCode()
    {
        return Name.GetHashCode() * this.Id.GetHashCode();
    }
}

You may complain about using non-readonly properties in my GetHashCode method, but this is one way of doing it (not the right one).

        HashSet<Bean> set = new HashSet<Bean>();

        Bean b1 = new Bean {Name = "n", Id = 1};
        Bean b2 = new Bean {Name = "n", Id = 2};
        set.Add(b1);
        set.Add(b2);
        b2.Id = 1;

        var elements = set.ToList();

        var elem1 = elements[0];
        var elem2 = elements[1];

        if (elem1.Equals(elem2))
        {
            Console.WriteLine("elements are equal");
        }

And in this case, my set contains duplicates.

So is there a collection in C# that guarantees that it does not contain duplicates?

Buda Gavril
  • You may use a dictionary – RAHUL S R Jul 10 '17 at 06:30
  • It behaves the same – Buda Gavril Jul 10 '17 at 06:33
  • Your keys are mutable (i.e. the `complain about using non-readonly properties in GetHashCode` remark). Because of this, you can't use any collection that would ensure that there are no duplicates in your list, since they only check the uniqueness criteria at insertion time (by calling `Equals()` and `GetHashCode()`) when adding new items, not when the underlying properties change. If you think about it logically, how would the collection internally know that you've changed the key and that it needs to rehash and potentially throw an exception on finding duplicates? – kha Jul 10 '17 at 06:35
  • This could be achieved with an observable collection where I override the Add method, and when a property of an object changes I would check for duplicates and throw an exception if there are any, but this is a nasty thing to do – Buda Gavril Jul 10 '17 at 06:39
  • You're doing a nasty thing to begin with (mutating the hashcode), don't expect a clean solution. This is considered bad practice for a good reason – Kevin Gosse Jul 10 '17 at 06:44
  • You asked if there was a built-in collection to do this and the answer is no. You are of course free to write your own collection that does it, but bear in mind that it's not the ListChange event that matters here, so it doesn't matter if it's an `ObservableCollection` or not. It's the individual properties. To achieve what you want, you will have to listen to the property changes using `INotifyPropertyChange` or similar mechanisms and rerun the hashing logic. It will be quite counter-intuitive to use though, and I recommend making sure your `key` properties don't change instead. – kha Jul 10 '17 at 06:44
  • Can you be more specific about your desired behavior / point of failure? Do you expect an exception or some special behavior on property change (`b2.Id = 1;`) or on subsequent element access (`set.ToList();`)? – grek40 Jul 10 '17 at 07:04
  • I'm not expecting a failure. I'm expecting that if I have the collection with the elements, I need to be sure that all elements are unique without going through the collection with LINQ and grouping them. – Buda Gavril Jul 10 '17 at 07:13
  • If you can guarantee a consistent definition of "duplicate", then `HashSet` works fine. If you can't, then _no_ implementation will work fine. Figure out which one you prefer. – Peter Duniho Jul 10 '17 at 07:14
  • I don't understand... you have a collection with 2 different elements, then you change the elements to become duplicates, but you still don't expect a failure... please be __very__ specific about what you expect to happen at each point __in detail__! – grek40 Jul 10 '17 at 07:15
  • Possible duplicate of [HashSets don't keep the elements unique if you mutate their identity](https://stackoverflow.com/questions/11410994/hashsets-dont-keep-the-elements-unique-if-you-mutate-their-identity) – mjwills Jul 10 '17 at 07:58

2 Answers


So is there a collection in C# that guarantees that it does not contain duplicates?

There is no existing collection class in .NET that does this. You could write your own, but there is no built-in one.

Some extra information regarding the issue you are experiencing

If you change a HashSet entry after adding it to the HashSet, then you need to regenerate the HashSet. My RegenerateHashSet extension method below can be used to do that.

The reason you need to regenerate is that duplicate detection only occurs at insertion time (or, in other words, it relies on you not changing an object after you insert it), which makes sense if you think about it. The HashSet has no way to detect that an object it contains has changed.
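
For example (a minimal sketch using the Bean class from your question), even a simple Contains lookup can fail to find an object once one of its key properties has changed:

HashSet<Bean> demo = new HashSet<Bean>();
Bean b = new Bean { Name = "n", Id = 1 };
demo.Add(b);

b.Id = 2; // the object's hash code changes, but the set still stores it under the old hash

// The lookup hashes the mutated state, so it may land in a different bucket
// and miss an object that is actually in the set.
Console.WriteLine(demo.Contains(b)); // may print False even though b was added

The full example below then shows the regeneration in action: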

using System;
using System.Collections.Generic;
using System.Linq;

namespace Test
{
    public static class HashSetExtensions
    {
        public static HashSet<T> RegenerateHashSet<T>(this HashSet<T> original)
        {
            // Rebuild the set from scratch so every element is re-hashed with its current state.
            return new HashSet<T>(original, original.Comparer);
        }
    }

    public class Bean
    {
        public string Name { get; set; }

        public int Id { get; set; }

        public override bool Equals(object obj)
        {
            var bean = obj as Bean;

            if (bean == null)
            {
                return false;
            }

            return Name.Equals(bean.Name) && Id == bean.Id;
        }

        public override int GetHashCode()
        {
            return Name.GetHashCode() * Id.GetHashCode();
        }
    }

    public class Program
    {
        static void Main(string[] args)
        {
            HashSet<Bean> set = new HashSet<Bean>();

            Bean b1 = new Bean { Name = "n", Id = 1 };
            Bean b2 = new Bean { Name = "n", Id = 2 };
            set.Add(b1);
            set.Add(b2);
            b2.Id = 1;

            var elements = set.ToList();

            var elem1 = elements[0];
            var elem2 = elements[1];

            if (elem1.Equals(elem2))
            {
                Console.WriteLine("elements are equal");
            }
            Console.WriteLine(set.Count); // 2 - the set still holds both entries
            set = set.RegenerateHashSet();
            Console.WriteLine(set.Count); // 1 - the duplicate disappears once the set is rebuilt
            Console.ReadLine();
        }
    }
}

Note that the above technique is not bullet-proof. If you add two objects (Object A and Object B) which are duplicates and then change Object B to be different from Object A, then the HashSet will still only have one entry in it (since Object B was never added). As such, what you probably want to do is store your complete list in a List instead, and then use new HashSet<T>(yourList) whenever you want unique entries. The below class may assist you if you decide to go down that route.

public class RecalculatingHashSet<T>
{
    private List<T> originalValues = new List<T>();

    public HashSet<T> GetUnique()
    {
        // Uniqueness is re-evaluated on every call, so later mutations are taken into account.
        return new HashSet<T>(originalValues);
    }

    public void Add(T item)
    {
        originalValues.Add(item);
    }
}
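
As a rough usage sketch (reusing the Bean class from above), note how GetUnique reflects changes made after the items were added:

RecalculatingHashSet<Bean> beans = new RecalculatingHashSet<Bean>();
Bean b1 = new Bean { Name = "n", Id = 1 };
Bean b2 = new Bean { Name = "n", Id = 2 };
beans.Add(b1);
beans.Add(b2);

b2.Id = 1; // b1 and b2 are now duplicates

Console.WriteLine(beans.GetUnique().Count); // 1 - uniqueness is recalculated on every call

The trade-off is that every call to GetUnique rebuilds the set, which costs an extra O(n) pass over the stored items.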
mjwills

If you don't write your own collection type that handles property-changed events to re-evaluate the items, you need to re-evaluate them on each access. This can be accomplished with LINQ deferred execution:

ICollection<Bean> items = new List<Bean>();
IEnumerable<Bean> reader = items.Distinct();

Rule: only use `items` to insert or remove elements; use `reader` for any read access.

Bean b1 = new Bean { Name = "n", Id = 1 };
Bean b2 = new Bean { Name = "n", Id = 2 };
items.Add(b1);
items.Add(b2);
b2.Id = 1;

var elements = reader.ToList();

var elem1 = elements[0];
var elem2 = elements[1]; // throws an exception because there is only one element in the result list
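
If you follow the rule and only read through reader, the duplicate simply never shows up (a small sketch continuing the snippet above):

foreach (Bean bean in reader)
{
    Console.WriteLine(bean.Name + " " + bean.Id); // prints a single line: "n 1"
}

The price is that Distinct() is deferred, so each read access re-evaluates the sequence and costs an extra pass over items.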
grek40