
When comparing strings in C#, different CLR versions give different results on Windows 7 SP1 x64. Here is the sample code:

List<string> myList = new List<string>();
myList.AddRange(new[] { "!-", "-!", "&-l", "&l-", "-(", "(-", "-*", "*-", ".-", "-.", "/'", "-/" });
myList.Sort();
Console.WriteLine(Environment.Version);
myList.ForEach(Console.WriteLine);
Console.WriteLine();
Console.WriteLine(string.Compare("!-", "-!"));
Console.WriteLine("!-".CompareTo("-!"));

Here is the sample output:


If I set Target Framework to 4.0:

4.0.30319.18444
!-
-!
&l-
&-l
(-
-(
*-
-*
.-
-.
/'
-/

-1
-1

If I set Target Framework to 2.0:

2.0.50727.5485
-!
!-
&-l
&l-
-(
(-
-*
*-
-.
.-
-/
/'

1
1

Am I missing anything?

user1014639
  • In .NET 4.0 the supported Unicode version changed from 5.0 to 5.1, so perhaps that is the difference. Sadly, there is very little documentation on this – xanatos Feb 25 '15 at 11:22
  • `List<T>.Sort()` uses the default comparer, and the default comparer for `string` uses the current culture. This, in turn, depends on the collation tables supplied with .NET, and those are subject to change. If you want consistent results, use an ordinal-based comparison (`List<T>.Sort(StringComparer.Ordinal)`). – Jeroen Mostert Feb 25 '15 at 11:29
  • @JeroenMostert The problem is present even with the InvariantCulture, which should be "stable" – xanatos Feb 25 '15 at 11:30
  • No. The invariant culture is still a culture, and still depends on collation tables. The only thing guaranteed to be stable (as it does not depend on any collation at all) is ordinal-based comparison. – Jeroen Mostert Feb 25 '15 at 11:31
  • This has already been [discussed here](http://stackoverflow.com/questions/23087995/string-comparison-and-sorting-when-strings-contain-hyphens), and the solution is to use the [Ordinal comparer](http://stackoverflow.com/a/19371082/2530848) – Sriram Sakthivel Feb 25 '15 at 11:31
  • @JeroenMostert: Or `myList.Sort(string.CompareOrdinal);` – leppie Feb 25 '15 at 11:32
  • @SriramSakthivel Using the Ordinal comparer is good only in limited cases, because with the Ordinal comparer `è > f` – xanatos Feb 25 '15 at 11:44

1 Answer


Make sure you are sorting with `myList.Sort(StringComparer.Ordinal)`.

Unless Unicode starts reassigning the codes of its characters, an ordinal sort should give a constant order. An ordinal comparison is based on the numeric code values assigned to the characters, not on any culture's collation tables.
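As a minimal sketch of that suggestion, the list from the question can be sorted ordinally like this. The class name `OrdinalSortDemo` and the helper `SortOrdinal` are hypothetical names introduced only for illustration; `List<T>.Sort(IComparer<T>)` and `StringComparer.Ordinal` are the framework members the answer refers to.

```csharp
using System;
using System.Collections.Generic;

public static class OrdinalSortDemo
{
    // Hypothetical helper: returns a copy of the input sorted by
    // ordinal (code value) comparison, independent of any culture.
    public static List<string> SortOrdinal(IEnumerable<string> items)
    {
        List<string> sorted = new List<string>(items);
        sorted.Sort(StringComparer.Ordinal);
        return sorted;
    }

    public static void Main()
    {
        string[] input = { "!-", "-!", "&-l", "&l-", "-(", "(-", "-*", "*-", ".-", "-.", "/'", "-/" };

        Console.WriteLine(Environment.Version);

        // Ordinal order for this input:
        // !-  &-l  &l-  (-  *-  -!  -(  -*  -.  -/  .-  /'
        foreach (string s in SortOrdinal(input))
        {
            Console.WriteLine(s);
        }
    }
}
```

Because the comparison looks only at code values, the order should not depend on whether the project targets framework 2.0 or 4.0. As noted in the comments, the trade-off is that ordinal order is not linguistically meaningful (for example, `è` sorts after `f`).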

Take your first example, which compares these two strings:

-!
!-

The hyphen is U+002D and the exclamation mark is U+0021. Those codes haven't changed since at least the ASCII tables, so an ordinal comparison always places "!-" before "-!". I would check your sorting parameters to make sure you compare only ordinally and not with a neutral or specific culture.
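The code values above can be checked directly. The following is a small sketch, with `CodePointDemo` and `OrdinalSign` as hypothetical names; it uses `string.CompareOrdinal`, which the comments also mention, and reduces the result to its sign because only the sign is meaningful.

```csharp
using System;

public static class CodePointDemo
{
    // Hypothetical helper: reduces the ordinal comparison result to
    // -1, 0, or 1, since only the sign of CompareOrdinal matters.
    public static int OrdinalSign(string a, string b)
    {
        return Math.Sign(string.CompareOrdinal(a, b));
    }

    public static void Main()
    {
        // Print the code values of the two characters involved.
        Console.WriteLine("U+{0:X4}", (int)'!'); // U+0021
        Console.WriteLine("U+{0:X4}", (int)'-'); // U+002D

        // 0x21 < 0x2D, so "!-" sorts before "-!" in ordinal order.
        Console.WriteLine(OrdinalSign("!-", "-!")); // -1
        Console.WriteLine(OrdinalSign("-!", "!-")); // 1
    }
}
```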

Maxime Rouiller