I had a situation where I built a dictionary key by concatenating a numeric ID and a string value, which together form a unique value.

Does it make any difference to how Dictionary works, to hash table density, or to performance whether the number or the string comes first, given that the key itself is a string?

For example:

Dictionary<string, bool> dict = new Dictionary<string, bool>();
dict.Add(integerValue + "-" + stringValue, true);

OR

Dictionary<string, bool> dict = new Dictionary<string, bool>();
dict.Add(stringValue + "-" + integerValue, true);
Kelso Sharp
  • If you want to know if a change makes your program faster, make the change, **run it** and then you'll know. – Eric Lippert Aug 26 '16 at 19:46
  • Yeah, I tried that, and I could only find a negligible difference. I am looking for a deeper understanding of any potential difference, because I am interested in the internals of C#, not in how to improve it. – Kelso Sharp Aug 26 '16 at 19:48
  • I would IMAGINE it doesn't make any difference, since it would still result in basically the same hash calculations, so no real performance improvement. But I have no real clue, hence leaving it as a comment only. – John Bustos Aug 26 '16 at 19:48
  • @EricLippert I'd like to direct you to this [great article about this exact subject](https://ericlippert.com/2012/12/17/performance-rant/) – Jonesopolis Aug 26 '16 at 19:49
  • If you can't observe a difference then what does it matter? The point of a performance improvement is to produce an *observable difference in performance*. "We made an imperceptible improvement" is not going to justify your fee. – Eric Lippert Aug 26 '16 at 19:49
  • @EricLippert Maybe it's an effectively imperceptible fee? – 15ee8f99-57ff-4f92-890c-b56153 Aug 26 '16 at 19:52
  • It uses string's `GetHashCode()` for the hash, and I assume it makes no difference, but reading this code makes my head spin, so I guess it's possible (though unlikely): http://stackoverflow.com/questions/15174477/how-is-gethashcode-of-c-sharp-string-implemented – stephen.vakil Aug 26 '16 at 19:52
  • Dude, sometimes people want to know more than what they see. Sure, in my current situation I don't see a difference, but I am only able to test on a few thousand records. All seems well, but then it goes to production, works on millions of records, hammers the system and brings it down. Then what? Sometimes you need to see if there is more to a story than what is in front of you. P.S. People come here for understanding; don't slam them for it. – Kelso Sharp Aug 26 '16 at 19:54
  • `I am only able to test on a few thousand records` Why are you not able to test with a dictionary of millions of records? It's a dictionary of simple types, so why can't you build one up with a loop? – Jonesopolis Aug 26 '16 at 19:56
  • You are far more likely to see a performance issue from performing millions of string concatenations to get your keys set up than to see issues with the hash function. – stephen.vakil Aug 26 '16 at 19:56
  • That link is helpful, Stephen, thank you. – Kelso Sharp Aug 26 '16 at 19:58
  • You seek to understand whether a small change will produce a catastrophic effect on a large complex system, and your technique is to ask strangers on the internet who have no knowledge of your system about one small part of that system. Instead, *test the system under carefully controlled circumstances* if you want to know how it behaves in the context of a small change. There's no "armchair" performance analysis. Use science: make a change under controlled circumstances and *measure the results*. – Eric Lippert Aug 26 '16 at 20:00
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/121971/discussion-between-kelso-sharp-and-eric-lippert). – Kelso Sharp Aug 26 '16 at 20:46

1 Answer

Trying to build strings a certain way for dictionary keys in order to improve performance is 1000% premature optimization, and it will not help.

A C# Dictionary calls GetHashCode on each key to decide where the entry goes in its internal structure. The results of GetHashCode are platform specific and encoding specific, can change between .NET versions, and should NEVER be assumed to have a particular distribution that you can micro-optimize against.

https://msdn.microsoft.com/en-us/library/system.string.gethashcode(v=vs.110).aspx
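
If you are curious anyway, here is a minimal sketch that counts how many distinct hash codes each key order produces. The class name, method names, and the generated sample data ("item0" through "item99") are made up for illustration and are not from the question. The numbers it prints apply only to the current process and runtime. On modern .NET, string hash codes are randomized per process by default, which is one more reason not to design keys around them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class KeyOrderHashDemo
{
    // Builds keys in "id-name" or "name-id" order from hypothetical sample data.
    public static List<string> BuildKeys(int count, bool numberFirst)
    {
        var keys = new List<string>(count);
        for (int i = 0; i < count; i++)
        {
            string name = "item" + (i % 100);
            keys.Add(numberFirst ? i + "-" + name : name + "-" + i);
        }
        return keys;
    }

    // Counts how many distinct hash codes the keys produce in this process.
    public static int DistinctHashCount(IEnumerable<string> keys)
    {
        return keys.Select(k => k.GetHashCode()).Distinct().Count();
    }
}
```

Calling `KeyOrderHashDemo.DistinctHashCount(KeyOrderHashDemo.BuildKeys(100000, true))` and the same with `false` lets you compare the two orders on your own machine. Treat whatever you see as an observation about one runtime, not a guarantee.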

Optimize by measuring, not by taking stabs in the dark at the internals of established algorithms. And above all else, optimize when you notice something is slow, not before.
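
A rough way to measure this case is to fill a dictionary with as many keys as you expect in production and time it with Stopwatch. This is only a sketch: `KeyOrderTiming`, `TimeFillAndLookup`, and the generated names are hypothetical, and the timing includes the string concatenation as well as the dictionary work. For numbers you intend to rely on, run it several times and consider a dedicated benchmarking library.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

public static class KeyOrderTiming
{
    // Times building 'count' unique keys, adding them, and looking them up again.
    public static TimeSpan TimeFillAndLookup(int count, bool numberFirst)
    {
        var dict = new Dictionary<string, bool>(count);
        var sw = Stopwatch.StartNew();

        for (int i = 0; i < count; i++)
        {
            dict.Add(MakeKey(i, numberFirst), true);
        }

        int hits = 0;
        for (int i = 0; i < count; i++)
        {
            if (dict.ContainsKey(MakeKey(i, numberFirst))) hits++;
        }

        sw.Stop();
        if (hits != count)
        {
            throw new InvalidOperationException("Some keys were not found.");
        }
        return sw.Elapsed;
    }

    // Hypothetical key format matching the two variants in the question.
    private static string MakeKey(int id, bool numberFirst)
    {
        string name = "item" + (id % 100);
        return numberFirst ? id + "-" + name : name + "-" + id;
    }
}
```

For example, compare `KeyOrderTiming.TimeFillAndLookup(5000000, true)` with `KeyOrderTiming.TimeFillAndLookup(5000000, false)`, and discard the first run of each so JIT warm-up does not skew the comparison.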

Tim