
I've found that test results differ between my machine and the build server. I've narrowed it down to a single line: a string comparison. The two strings differ only in the case of the first character.

The test below passes on my local machine and fails on the build machine.

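using System.Globalization;
using Microsoft.VisualStudio.TestTools.UnitTesting;

[TestClass]
public class Tests
{
    [TestMethod]
    public void Strings()
    {
        // Case-sensitive comparison: the strings differ only in the case of the first letter, so the result must be non-zero.
        Assert.IsFalse(0 == string.Compare("Term’s", "term’s", false, CultureInfo.InvariantCulture));
    }
}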

I've also tried to change it to string.Equals:

string.Equals("Term’s", "term’s", StringComparison.InvariantCulture);

string.Equals returns true on the build server and returns false on my local machine.

Ordinal comparison gives the same results on both machines:

string.Compare("Term’s", "term’s", StringComparison.Ordinal);

As I understand it, InvariantCulture is supposed to return the same results everywhere. How can a case-sensitive, culture-invariant string comparison depend on the machine? What settings should I check to identify the problem?

Update: platform and string

The string is important. These results can be observed for strings with "exotic" punctuation such as RIGHT SINGLE QUOTATION MARK or RIGHT DOUBLE QUOTATION MARK.

It seems the behavior reproduces on Windows 8 machines. You can see it even on https://dotnetfiddle.net/ if you type the following:

using System;
using System.Globalization;

public class Program
{
    public static void Main()
    {
        Console.WriteLine(0 == string.Compare("Terms", "terms", false, CultureInfo.InvariantCulture));
        Console.WriteLine(0 == string.Compare("Term’s", "term’s", false, CultureInfo.InvariantCulture));
        Console.WriteLine(0 == string.Compare("Term“s", "term“s", false, CultureInfo.InvariantCulture));
        Console.WriteLine(0 == string.Compare("Term”s", "term”s", false, CultureInfo.InvariantCulture));

        //outputs
        //False
        //True
        //True
        //True
    }
}

Environment.OSVersion (server's): Microsoft Windows NT 6.2.9200.0
Environment.Is64BitOperatingSystem (server's): True
Environment.Version (server's): 4.0.30319.18449

Environment.OSVersion (local): Microsoft Windows NT 6.1.7601 Service Pack 1
Environment.Is64BitOperatingSystem (local): True
Environment.Version (local): 4.0.30319.18444

Update: related MSDN forums link

It may be a known bug in Windows 8, which is fixed in Windows 8.1.

http://social.msdn.microsoft.com/Forums/vstudio/en-US/4a1ab6b7-6dcc-46bf-8650-e0d9ebbf1735/stringcompare-not-always-casesensitive-on-windows-8?forum=netfxbcl

filhit
  • What platform (hardware, OS, CLI) do your PC and build server run on? `InvariantCulture` is supposed to be case-sensitive, so it sounds like a platform bug. – CodeCaster Sep 08 '14 at 15:16
  • I don't suppose your build machine has some strange processing of the programs, like a C# preprocessor that mangles things, or some kind of obfuscation that mangles things, or some kind of aspect-oriented process injection (PostSharp comes to mind) that mangles things? – RenniePet Sep 08 '14 at 15:51
  • @adriano-repetti I double-checked just now: I copied the test from the code with no changes, and the test failed when I checked it in. As for encoding, I can't rule out a change somewhere along Local->Source Control->Build server, but (int)str[i] returns the same number for each character on my local machine and on the server. So at least it compiles to the same thing. – filhit Sep 08 '14 at 16:05
  • @CodeCaster I've added the output of `Environment.OSVersion` and `Environment.Version`. Unfortunately, I have no access to the server other than checking in the code and observing the test results. I will raise a ticket to get the hardware details. What exactly should I ask about the hardware? – filhit Sep 08 '14 at 16:43
  • @RenniePet No, we don't use such processing. – filhit Sep 08 '14 at 16:57
  • string.Equals("Term’s", "term’s", StringComparison.OrdinalIgnoreCase); – Chad Grant Sep 08 '14 at 17:19
  • Just curious - when you request current culture on the two systems, what does it say? – RenniePet Sep 08 '14 at 17:53
  • @RenniePet en-US for both. – filhit Sep 08 '14 at 18:16
  • "Murphy’s law" == "murphy’s law" ? :-) – RenniePet Sep 08 '14 at 18:54
  • Out of curiosity, do you see a difference (between those two machines) if you use instead `StringComparer.InvariantCulture.Compare("Term’s", "term’s")`? – Jeppe Stig Nielsen Sep 08 '14 at 21:10
  • I tried both Win 7 and Win 8.1 and I can't reproduce it. Is it limited to Win 8? D**n, it's even a single UTF-16 code unit (and it's been there since, hmmm, around Unicode 3?). Does it do the same for String.Equals()? – Adriano Repetti Sep 08 '14 at 21:58
  • @AdrianoRepetti I wasn't able to reproduce it on Windows 7 or Windows 8.1. I was able to reproduce it on the build server (I don't know its exact OS version yet) and on a desktop Windows 8 machine. The equivalent string.Equals call behaves the same way. I've also found that it reproduces on the https://dotnetfiddle.net/ snippets site. You can try it yourself. – filhit Sep 08 '14 at 22:13
  • @AdrianoRepetti It's not limited (on dotnetfiddle) to that specific character, either. For example, "WHAT THE HELL???! ë" and "What the hell???! ë" also compare equal. –  Sep 08 '14 at 22:15
  • @JeppeStigNielsen I will try it on the exact machines tomorrow. But it gives 0 on https://dotnetfiddle.net/ site, which gave me the same results as my build server for the lines I specified in my question. – filhit Sep 08 '14 at 22:16
  • @hvd yes I'm checking that. It produces wrong results with any character > 127. Well they just have to rename it as InvariantUsAsciiCulture and it works as expected. If it's as Eric said and everything is delegated to OS (and then this is an OS related bug) I would see if it's documented. **A lot of code may be broken out there!!!** – Adriano Repetti Sep 08 '14 at 22:21
  • @AdrianoRepetti, hvd It may be a known Windows 8 bug, which is fixed in Windows 8.1, if I understand [this MSDN forums link](http://social.msdn.microsoft.com/Forums/vstudio/en-US/4a1ab6b7-6dcc-46bf-8650-e0d9ebbf1735/stringcompare-not-always-casesensitive-on-windows-8?forum=netfxbcl) correctly. – filhit Sep 08 '14 at 22:27
  • @filhit there isn't an _official_ confirmation there but I think you're right. Wow...it's a SERIOUS thing. – Adriano Repetti Sep 09 '14 at 07:41

1 Answer


Unfortunately, InvariantCulture is still a linguistic comparison, and as such it can vary between versions of the OS (and does vary, especially when new characters are added to Unicode). Versions of .NET prior to 4.0 carried their own payload of collation data and thus would not vary, but since 4.0 they pick up the data from the OS and can potentially vary. Ordinal is the only comparison that will not change, and it is what you really need to use if you want stability.
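As a minimal sketch of the ordinal approach described above: the class and method names below (`StableComparison`, `SameText`, `SameTextIgnoringCase`) are hypothetical and exist only for illustration. The only framework call used is `string.Equals` with `StringComparison.Ordinal` or `StringComparison.OrdinalIgnoreCase`. The strings use the `\u2019` escape for RIGHT SINGLE QUOTATION MARK so that source-file encoding cannot affect the result.

```csharp
using System;

// Hypothetical helper for illustration; not part of the .NET Framework.
public static class StableComparison
{
    // Case-sensitive equality on UTF-16 code units; does not consult OS collation data.
    public static bool SameText(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // Case-insensitive variant that is still ordinal rather than linguistic.
    public static bool SameTextIgnoringCase(string a, string b)
    {
        return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
    }

    public static void Main()
    {
        // \u2019 is RIGHT SINGLE QUOTATION MARK, the character from the question.
        Console.WriteLine(SameText("Term\u2019s", "term\u2019s"));             // False
        Console.WriteLine(SameTextIgnoringCase("Term\u2019s", "term\u2019s")); // True
    }
}
```

Ordinal comparison does not apply linguistic rules, so it is suited to equality checks rather than to producing a sort order meant for display to users.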

That said, you should not be seeing differences in behavior for the code that you supplied. The differences you observe are due to a bug in Windows 8 that has been fixed in Windows 8.1.

Eric MSFT
  • Do you mean I should use Ordinal comparison even though I compare the strings as English words and not identifiers? – filhit Sep 08 '14 at 17:41
  • Er... What's the point of an `InvariantCulture` that varies between the various OS versions? To quote from [CultureInfo.InvariantCulture](http://msdn.microsoft.com/en-us/library/system.globalization.cultureinfo.invariantculture%28v=vs.110%29.aspx): "Unlike culture-sensitive data, which is subject to change by user customization or by updates to the .NET Framework or the operating system, invariant culture data is stable over time and across installed cultures and cannot be customized by users." –  Sep 08 '14 at 18:47
  • @hvd that is true of the rest of the locale data (things that affect formatting). Unfortunately it is not true for collation (sorting). – Eric MSFT Sep 08 '14 at 19:18
  • So much for invariant then :( – AVee Sep 08 '14 at 21:01
  • Please explain in more detail! In **this case** the only non-ASCII character is U+2019 "RIGHT SINGLE QUOTATION MARK", and it has been there for a long, long time. I understand the security implications (as described on MSDN), but in a _normal_ environment (without hackers trying to break our code) how does this apply? **It can't be that broken** (and in case of bugs I assume they're fixed or documented), **otherwise the whole point of an invariant culture is absolutely meaningless**. – Adriano Repetti Sep 08 '14 at 21:46