20

I have a question related to string comparison vs. character comparison.

The characters > and 0 (zero) have the decimal values 62 and 48, respectively.

When I compare the two characters in the following code, I get True (which is correct):

Console.WriteLine('>' > '0');

When I compare two one-character strings in the following code, I get -1, which indicates that ">" is less than "0" (the default culture is English):

Console.WriteLine(string.Compare(">", "0"));

Whereas comparing "3" and "1" (code values 51 and 49) in the following code returns 1 (as expected):

Console.WriteLine(string.Compare("3", "1"));

Also, the string.Compare(string str1, string str2) documentation says:

The comparison uses the current culture to obtain culture-specific information such as casing rules and the alphabetic order of individual characters

Would you be able to explain (or point to some documentation on) how string comparison is implemented, e.g. how the alphabetic order of individual characters is determined?

Alexandar

4 Answers

25

When you compare the characters '>' and '0', you are comparing their ordinal values.

To get the same behaviour from a string comparison, supply the ordinal string comparison type:

  Console.WriteLine(string.Compare(">", "0", StringComparison.Ordinal));
  Console.WriteLine(string.Compare(">", "0", StringComparison.InvariantCulture));
  Console.WriteLine(string.Compare(">", "0", StringComparison.CurrentCulture));

The current culture is used by default, and its sort order is intended to sort strings 'alphabetically' (for some definition of alphabetically) rather than in strictly lexical (ordinal) order.
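As a rough sketch of the difference between the two kinds of comparison: the class name `CompareDemo` and the helper `Sign` are made-up names for illustration, and the result is passed through `Math.Sign` because `string.Compare` only promises the sign of its result, not a particular magnitude. The culture-sensitive line carries no expected output, since it may depend on the culture and the .NET version in use.

```csharp
using System;

public static class CompareDemo
{
    // Hypothetical helper: reduce the comparison result to -1, 0 or 1.
    public static int Sign(string a, string b, StringComparison kind) =>
        Math.Sign(string.Compare(a, b, kind));

    public static void Main()
    {
        // char comparison looks at the numeric code units: 62 > 48.
        Console.WriteLine('>' > '0');                                  // True

        // Ordinal string comparison agrees with the char comparison.
        Console.WriteLine(Sign(">", "0", StringComparison.Ordinal));   // 1

        // Culture-sensitive comparison uses linguistic sort weights instead;
        // the question reports a negative result here for an English culture.
        Console.WriteLine(Sign(">", "0", StringComparison.InvariantCulture));
    }
}
```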

Pete Kirkham
25

The sort order of strings depends on the culture you use.

StringComparer.CurrentCulture sorts the printable ASCII characters, taken as 1-character strings, in this order on my machine:

' -   ! " # $ % & (  ) * , . / : ; ? @ [
\ ] ^ _ ` { | } ~ +  < = > 0 1 2 3 4 5 6
7 8 9 a A b B c C d  D e E f F g G h H i
I j J k K l L m M n  N o O p P q Q r R s
S t T u U v V w W x  X y Y z Z

StringComparer.Ordinal sorts the same strings as follows:

  ! " # $ % & ' ( )  * + , - . / 0 1 2 3
4 5 6 7 8 9 : ; < =  > ? @ A B C D E F G
H I J K L M N O P Q  R S T U V W X Y Z [
\ ] ^ _ ` a b c d e  f g h i j k l m n o
p q r s t u v w x y  z { | } ~
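Listings like the two above can be reproduced with something along these lines. This is only a sketch: `SortDemo`, `PrintableAscii` and `Sorted` are hypothetical names, and the culture-sensitive output will vary with the machine's current culture and .NET version, so only the ordinal order is predictable.

```csharp
using System;
using System.Linq;

public static class SortDemo
{
    // The 95 printable ASCII characters (space through '~') as 1-character strings.
    public static string[] PrintableAscii() =>
        Enumerable.Range(32, 95).Select(i => ((char)i).ToString()).ToArray();

    // Sort the strings with the given comparer and join them with spaces.
    public static string Sorted(StringComparer comparer) =>
        string.Join(" ", PrintableAscii().OrderBy(s => s, comparer));

    public static void Main()
    {
        // Linguistic order for whatever culture the current thread uses.
        Console.WriteLine(Sorted(StringComparer.CurrentCulture));

        // Code-unit order; for ASCII this is simply the ASCII table order.
        Console.WriteLine(Sorted(StringComparer.Ordinal));
    }
}
```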
dtb
  • 8
    You fail to inform us what your current culture is, and that is a shame. I can say that it is not `"fy-NL"` (West Frisian (Netherlands)), because then the letter `y` would be next to the `i`. Also, it can't be `"et-EE"` (Estonian (Estonia)), for then the `z` would be next to the `s`. – Jeppe Stig Nielsen Oct 02 '13 at 08:46
4

It sounds like what you want is a comparison that does not use culture-specific rules. Have you tried StringComparison.Ordinal?

Console.WriteLine( string.Compare( ">", "0", StringComparison.Ordinal ) ); // returns a positive number
Oren Melzer
1

It returns -1 because it is comparing str2 to str1, not the other way around, e.g. "is 48 equal to 62?" No, it's less than 62, so it returns -1. It's semantically a little confusing when you read the parameter order.

DiskJunky
  • [MSDN](http://msdn.microsoft.com/en-us/library/84787k22.aspx) says "String.Compare(strA, strB) - Less than zero - strA is less than strB." For example, `string.Compare("A", "B")` returns `-1` - `"A"` is less than `"B"`. Why is `">"` less than `"0"`? – dtb Feb 19 '13 at 21:15
  • I've updated my question: when you compare "3" to "1" you get value `1` where "3" has code 51 and "1" has code 49 (as expected). So that does not match with your explanation. – Alexandar Feb 19 '13 at 21:18
  • 1
    @Alexandar good point. I think PeteKirkham answered it better than I did – DiskJunky Feb 19 '13 at 21:26