2

I'm implementing A* in C# (not for pathfinding) and I need a Dictionary to hold the open nodes, because I need fast insertion and fast lookup. I want to get the first open node from the Dictionary (it can be any random node). Using Dictionary.First() is very slow. If I use an iterator, MoveNext() still uses 15% of the total CPU time of my program. What is the fastest way to get any random element from a Dictionary?

Sam
Yekoor
  • Do you really need a dictionary or a set? – Joachim Isaksson Jun 21 '14 at 13:34
  • A set could work too. – Yekoor Jun 21 '14 at 13:38
  • You *could* base a custom class on the reference source for Dictionary, which can be found here: http://referencesource.microsoft.com/#mscorlib/system/collections/generic/dictionary.cs and just pluck any random element out of the private `entries` array. From reviewing the reference source, it looks like `MoveNext` (as Gabe stated) should indeed be the fastest way when using a plain Dictionary. – ChristopheD Jun 21 '14 at 14:32

3 Answers

5

I suggest you use a specialized data structure for this purpose, as the regular Dictionary was not made for this.

In Java, I would probably recommend LinkedHashMap, for which there are custom C# equivalents (not built in, sadly).

It is, however, rather easy to implement this yourself in a reasonable fashion. You could, for instance, use a regular dictionary whose values are tuples holding the actual data plus a pointer to the next element. Or you could keep a secondary stack that simply stores all keys in order of addition. These are just some ideas. I have never implemented or profiled this myself, but I'm sure you'll find a good way.
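To make the secondary-stack idea a bit more concrete, here is a minimal sketch, not something I have profiled. The class name `StackBackedOpenSet` and its members are made up for illustration. It assumes that leaving stale keys on the stack and skipping them later (lazy deletion) is acceptable for your workload.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a built-in type): a dictionary for lookups plus
// a stack of keys so that "give me any element" does not need an enumerator.
public sealed class StackBackedOpenSet<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _items = new Dictionary<TKey, TValue>();
    private readonly Stack<TKey> _keys = new Stack<TKey>();

    public int Count => _items.Count;

    public bool Contains(TKey key) => _items.ContainsKey(key);

    public void Add(TKey key, TValue value)
    {
        // Push the key only when it is new, so updating a live key
        // does not add a second stack entry for it.
        if (!_items.ContainsKey(key))
            _keys.Push(key);
        _items[key] = value;
    }

    // Removes from the dictionary only; the key stays on the stack as a
    // stale entry and is skipped later by TryTakeAny.
    public bool Remove(TKey key) => _items.Remove(key);

    // Pops keys until one is found that is still in the dictionary,
    // then removes and returns that entry.
    public bool TryTakeAny(out TKey key, out TValue value)
    {
        while (_keys.Count > 0)
        {
            key = _keys.Pop();
            if (_items.TryGetValue(key, out value))
            {
                _items.Remove(key);
                return true;
            }
        }
        key = default(TKey);
        value = default(TValue);
        return false;
    }
}
```

The trade-off is that the stack can accumulate stale keys if you remove a lot of elements by key without ever calling `TryTakeAny`, so memory use is worth watching in that case.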

Oh, and if you didn't already, you might also want to check the hash code distribution, to make sure there is no problem there.

Community
mafu
2

Finding the first element (or the element at a given index) in a dictionary is actually O(n), because it has to iterate over the buckets until a non-empty one is found, so MoveNext will actually be the fastest way.
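For reference, a small sketch of using the enumerator directly. The extension method name `TryGetAny` is hypothetical. A `foreach` over the concrete `Dictionary<TKey, TValue>` type uses its struct enumerator, whereas `First()` goes through the `IEnumerable<T>` interface, so this should avoid some of the LINQ overhead. It is still the same O(n) scan in the worst case.

```csharp
using System.Collections.Generic;

public static class DictionaryExtensions
{
    // Hypothetical helper: returns the first entry the enumerator yields,
    // or false if the dictionary is empty.
    public static bool TryGetAny<TKey, TValue>(
        this Dictionary<TKey, TValue> dictionary,
        out KeyValuePair<TKey, TValue> entry)
    {
        // The loop body runs at most once: we return on the first entry.
        foreach (KeyValuePair<TKey, TValue> pair in dictionary)
        {
            entry = pair;
            return true;
        }

        entry = default(KeyValuePair<TKey, TValue>);
        return false;
    }
}
```

Usage would look like `if (open.TryGetAny(out var entry)) { ... }`, where `open` is your dictionary of open nodes.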

If this were a problem, I would consider using something like a stack, where pop is an O(1) operation.

Gabe
0

Try

Enumerable.ToList(dictionary.Values)[new Random().Next(dictionary.Count)]

This should have pretty good performance, but watch out for memory usage if your dictionary is huge. Obviously, take care not to create the Random object every time, and you might be able to cache the return value of Enumerable.ToList if its members don't change too frequently.
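If the contents change too often for a cached copy to stay valid, one alternative is to keep the list yourself and update it in place. This is a different technique from `ToList`: a list plus a dictionary from item to list index, with swap-with-last removal. The sketch below assumes a set of items is enough (no separate values), and `IndexedSet` is a made-up name, not a framework class.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: O(1) add, remove, lookup and random pick (on average).
public sealed class IndexedSet<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly Dictionary<T, int> _indexOf = new Dictionary<T, int>();

    public int Count => _items.Count;

    public bool Contains(T item) => _indexOf.ContainsKey(item);

    public bool Add(T item)
    {
        if (_indexOf.ContainsKey(item)) return false;
        _indexOf[item] = _items.Count; // index the item will get in the list
        _items.Add(item);
        return true;
    }

    public bool Remove(T item)
    {
        int index;
        if (!_indexOf.TryGetValue(item, out index)) return false;

        // Move the last item into the removed slot so the list has no gaps,
        // then drop the last slot. This also works when item is the last one.
        int lastIndex = _items.Count - 1;
        T last = _items[lastIndex];
        _items[index] = last;
        _indexOf[last] = index;
        _items.RemoveAt(lastIndex);
        _indexOf.Remove(item);
        return true;
    }

    // Picks a random element; the caller supplies a shared Random instance
    // so that one is not created on every call.
    public T GetAny(Random random)
    {
        if (_items.Count == 0)
            throw new InvalidOperationException("The set is empty.");
        return _items[random.Next(_items.Count)];
    }
}
```

If you just need any element rather than a uniformly random one, returning the last list item would do and skips the `Random` call entirely.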

Jurgen Camilleri
  • I rewrote it to Enumerable.ToList(dictionary.Values)[0], because I just need any element. But ToList() now takes 38% of the CPU time. So looks like enumerator is faster. – Yekoor Jun 21 '14 at 13:53
  • To make this perform well, the returned list definitely needs to be cached, as you pointed out, because it is certainly more effort to create a whole list copy instead of just using the enumerator. If I am not mistaken, A* algorithms change the content of said list all the time, so caching is not possible directly. It should be possible to manually update the list instead, but that doesn't seem to yield much benefit over a more specialized approach. – mafu Jun 22 '14 at 04:07