14

I have a Dictionary, where items are (for example):

  1. "A", 4
  2. "B", 44
  3. "bye", 56
  4. "C", 99
  5. "D", 46
  6. "6672", 0

And I have a List:

  1. "A"
  2. "C"
  3. "D"

I want to remove from my dictionary all the elements whose keys are not in my list, and at the end my dictionary will be:

  1. "A", 4
  2. "C", 99
  3. "D", 46

How can I do this?

Nick

4 Answers

26

It's simpler to construct a new Dictionary that contains only the elements whose keys are in the list:

List<string> keysToInclude = new List<string> {"A", "B", "C"};
var newDict = myDictionary
     .Where(kvp=>keysToInclude.Contains(kvp.Key))
     .ToDictionary(kvp=>kvp.Key, kvp=>kvp.Value);

If it's important to modify the existing dictionary (e.g. it's a readonly property of some class), remove the unwanted keys in place:

var keysToRemove = myDictionary.Keys.Except(keysToInclude).ToList();

foreach (var key in keysToRemove)
     myDictionary.Remove(key);

Note the ToList() call: it's important to materialize the list of keys to remove. If you run the code without materializing keysToRemove, you're likely to get an exception stating something like "The collection has changed".
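As a minimal sketch of the materialization point above, the in-place removal can be wrapped in a small helper. The names `DictionaryPruning` and `KeepOnly` are hypothetical and chosen only for illustration. On .NET Framework, calling Remove while a lazy Except query is still enumerating the dictionary's keys would typically throw an InvalidOperationException. Newer .NET runtimes are documented to tolerate Remove during enumeration, so the snapshot is best seen as the portable choice rather than a strict requirement everywhere.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryPruning
{
    // Hypothetical helper: removes every entry whose key is not in keysToInclude
    // and returns the number of entries removed.
    public static int KeepOnly<TKey, TValue>(
        Dictionary<TKey, TValue> dictionary,
        IEnumerable<TKey> keysToInclude)
    {
        // ToList() snapshots the keys to remove before the first Remove call,
        // so the dictionary is never modified while it is being enumerated.
        List<TKey> keysToRemove = dictionary.Keys.Except(keysToInclude).ToList();

        foreach (TKey key in keysToRemove)
        {
            dictionary.Remove(key);
        }

        return keysToRemove.Count;
    }
}
```

Calling `DictionaryPruning.KeepOnly(myDictionary, keysToInclude)` mutates `myDictionary` and leaves only the entries whose keys appear in the list.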

J0HN
  • what would be the pros and cons of both ways ? – guiomie Dec 06 '13 at 15:31
  • @guiomie The former way constructs a new dictionary, so the old one stays intact, but at the cost of an increased memory footprint. The latter modifies the existing dictionary in place. In a nutshell, if you're not operating on really huge dictionaries, there are no pros and cons as such - the program state is simply not equivalent between these two ways. – J0HN Dec 09 '13 at 10:32
  • @J0HN Both will increase the memory footprint, because the Except() method implementation instantiates an internal data structure (typically a hash set) to be able to perform the operation in O(n) time. – tigrou May 21 '21 at 23:19
8
// A HashSet gives fast lookups for large lists; for small lists, use myList directly in the query below instead.
var mySet = new HashSet<string>(myList);

// Create a new dictionary with just the keys in the set
myDictionary = myDictionary
               .Where(x => mySet.Contains(x.Key))
               .ToDictionary(x => x.Key, x => x.Value);
Joachim Isaksson
  • Don't you think that create a new dictionary could be inefficient? – Nick Nov 25 '12 at 16:45
  • Now this is a good question: what is more efficient, creating a new dictionary or taking a number of items out? I guess it depends on the numbers involved; it would be interesting to measure it, though. – Robert Nov 25 '12 at 16:56
  • Here is a speed test I've performed, your solution against mine. You can easily see that creating a dictionary is more efficient, since you don't iterate over mySet for every element in the dictionary: http://pastebin.com/iY1LHRM1 – SimpleVar Nov 25 '12 at 17:23
  • I'm copying the dictionary in both test cases, and just as you have the HashSet overhead, I have a second dictionary overhead in my solution. Putting the HashSet out of the loop so it only occurs once doesn't change much, surprisingly. The number of removes is less critical than the number of iterations. The complexity of your algorithm is O(n*m*T(n)), n being size of dictionary, m being the size of list, and T(n) the time needed by function Remove. Mine is O(m*T(n)), assuming TryGetValue is asymptotically the same as Remove. It's not just execution time, it's complexity. Yours is inefficient. – SimpleVar Nov 25 '12 at 17:52
  • @YoryeNathan Huh? *O(n*m*T(n)), n being size of dictionary, m being the size of list, and T(n) the time needed by function Remove.*? I don't call `Remove` and `m` is doubtful since I use a set. Not saying my code is faster, but the fastest is probably @J0HN's solution below anyway, `myDictionary.Keys.Except(myList).ToList().ForEach(x => myDictionary.Remove(x));`. – Joachim Isaksson Nov 25 '12 at 18:18
  • Sorry, I got confused with someone else's answer that uses Remove. Mine doesn't use Remove either. Testing against J0HN's solution, it looks like you're winning. You two should know that in general LINQ has an overhead and is slower. – SimpleVar Nov 25 '12 at 18:37
  • @YoryeNathan [Premature optimization is the root of all evil (or at least most of it) in programming.](http://en.wikiquote.org/wiki/Donald_Knuth) – J0HN Nov 26 '12 at 11:04
  • 3
    @J0HN Will people stop posting this link inappropriately? This isn't premature, and it is not "optimization". Optimization is where you take the code and tweak it a bit so it performs a tiny bit faster. This is about a whole different algorithm, and you can't even say it's premature, because all I have here is the OP's question, I'm not doing his project. – SimpleVar Nov 26 '12 at 11:52
  • @YoryeNathan I'm not going to debate the definitions, and I'm not going to debate *your* definitions. :) You compare two pieces of code: a one-liner (ok, three-liner) and a function with two loops, and state that the function is better, because it's *faster*. And, as you mentioned, OP does not even mention performance considerations. So, aren't you trying to trade off readability for performance, which isn't even mentioned in the question? Isn't it a (possibly) unwanted optimization? And please, leave out your didactic tone, I'm aware that in general LINQ has some overhead. :) – J0HN Nov 26 '12 at 12:06
  • @J0HN I wasn't aiming for a tone :O Your cons are my pros and my cons are your pros, I agree. It's a matter of choice. I would personally prefer efficiency, cos I'm an efficiency fanatic, and in this case it doesn't really harm readability that much, but I can totally see why one would choose the shorter and simpler version for maintainability and readability. – SimpleVar Nov 26 '12 at 14:57
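The measurement question raised in these comments can be explored with a rough timing sketch like the one below. The class `FilterTiming`, its methods, and the generated keys ("k0", "k1", ...) are all hypothetical and not taken from the linked pastebin. A single Stopwatch run like this is only indicative; JIT warm-up and garbage collection can easily dominate for small inputs, so treat any numbers it produces with caution.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class FilterTiming
{
    // Approach 1: build a new dictionary, checking membership against a HashSet.
    public static Dictionary<string, int> FilterToNew(
        Dictionary<string, int> source, List<string> keys)
    {
        var keySet = new HashSet<string>(keys);
        return source.Where(kvp => keySet.Contains(kvp.Key))
                     .ToDictionary(kvp => kvp.Key, kvp => kvp.Value);
    }

    // Approach 2: remove the unwanted keys from the existing dictionary in place.
    public static void RemoveInPlace(Dictionary<string, int> target, List<string> keys)
    {
        foreach (var key in target.Keys.Except(keys).ToList())
        {
            target.Remove(key);
        }
    }

    // Generates hypothetical test data and returns the elapsed milliseconds
    // for each approach, in the order (new dictionary, in-place removal).
    public static (long NewDictMs, long InPlaceMs) Measure(int dictionarySize, int listSize)
    {
        var source = Enumerable.Range(0, dictionarySize).ToDictionary(i => "k" + i, i => i);
        var keys = Enumerable.Range(0, listSize).Select(i => "k" + i).ToList();

        // Copy made outside the timed region so each approach starts from the same data.
        var copy = new Dictionary<string, int>(source);

        var stopwatch = Stopwatch.StartNew();
        var filtered = FilterToNew(source, keys);
        stopwatch.Stop();
        long newDictMs = stopwatch.ElapsedMilliseconds;

        stopwatch.Restart();
        RemoveInPlace(copy, keys);
        stopwatch.Stop();
        long inPlaceMs = stopwatch.ElapsedMilliseconds;

        // Sanity check: both approaches should keep the same number of entries.
        if (filtered.Count != copy.Count)
        {
            throw new InvalidOperationException("The two approaches disagree.");
        }

        return (newDictMs, inPlaceMs);
    }
}
```

For example, `FilterTiming.Measure(1000000, 1000)` compares the two approaches on a large dictionary with a short list; varying both sizes shows how the result depends on the numbers involved.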
4
dict.Keys.Except(list).ToList()
    .ForEach(key => dict.Remove(key));
L.B
0

Code:

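// Despite its name, this keeps only the entries whose keys appear in `keys`
// and removes all others from `target`.
public static void RemoveAll<TKey, TValue>(this Dictionary<TKey, TValue> target,
                                           List<TKey> keys)
{
    // Collect the entries to keep, with one lookup per listed key.
    var tmp = new Dictionary<TKey, TValue>();

    foreach (var key in keys)
    {
        TValue val;
        if (target.TryGetValue(key, out val))
        {
            // The indexer, unlike Add, does not throw if `keys` contains duplicates.
            tmp[key] = val;
        }
    }

    // Empty the original dictionary, then put the kept entries back.
    target.Clear();

    foreach (var kvp in tmp)
    {
        target.Add(kvp.Key, kvp.Value);
    }
}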

Example:

var d = new Dictionary<string, int>
            {
                {"A", 4},
                {"B", 44},
                {"bye", 56},
                {"C", 99},
                {"D", 46},
                {"6672", 0}
            };

var l = new List<string> {"A", "C", "D"};

d.RemoveAll(l);

foreach (var kvp in d)
{
    Console.WriteLine(kvp.Key + ": " + kvp.Value);
}

Output:

A: 4
C: 99
D: 46
SimpleVar