I blindly copied one of the Stack Overflow answers, but it didn't work as I expected. I needed to remove a UTF-8 byte order mark from a string, but inputText.StartsWith(byteOrderMark) returns true for any string:

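using System;
using System.Text;

internal class Program
{
    static void Main(string[] args)
    {
        var inputText = "hello";
        // The UTF-8 preamble (EF BB BF) decodes to the single character U+FEFF.
        string byteOrderMark = Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());
        // Culture-sensitive overload: this is the call that unexpectedly succeeds.
        if (inputText.StartsWith(byteOrderMark))
            inputText = inputText.Remove(0, byteOrderMark.Length);
        Console.WriteLine(inputText); // ello
        Console.WriteLine(inputText[0] == byteOrderMark[0]); // false
    }
}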

Checking it character by character works without any problem. Why does StartsWith return true even when the string doesn't start with the UTF-8 preamble?

username
  • Probably the same as [this](https://stackoverflow.com/questions/47209888/why-does-u1ffffoo-startswith-return-true) and [this](https://stackoverflow.com/questions/29831410/why-does-string-startswith-u2d2d-always-return-true). – ProgrammingLlama Jun 20 '23 at 01:50
  • What answer was that? Probably we should fix it. – dbc Jun 20 '23 at 01:57
  • @ProgrammingLlama - I added an answer, then saw your [first link](https://stackoverflow.com/questions/47209888/why-does-u1ffffoo-startswith-return-true). Should I delete this and close as a duplicate? – dbc Jun 20 '23 at 01:58
  • @dbc I kind of like your answer better than the one on [this question](https://stackoverflow.com/questions/47209888/why-does-u1ffffoo-startswith-return-true), though I'm not sure if it's worth you moving it over there. – ProgrammingLlama Jun 20 '23 at 02:04
  • @dbc I'm thinking it's possible it's [this answer](https://stackoverflow.com/a/29176183/3181933) OP saw so I'll fix it. – ProgrammingLlama Jun 20 '23 at 02:05
  • @ProgrammingLlama - no, that's something different. That question contains its own implementation for `public static bool StartsWith(this byte[] thisArray, byte[] otherArray)`. – dbc Jun 20 '23 at 02:07
  • @dbc Doh! I am an idiot. Thank you for stopping me making too much of a fool of myself. :) – ProgrammingLlama Jun 20 '23 at 02:08
  • @ProgrammingLlama - maybe it's [this one](https://stackoverflow.com/a/15616035/3744182)? Compare with [this one](https://stackoverflow.com/a/1319226/3744182) by TrueWill which is actually right, the wrong answer comes first in the "trending" sort. – dbc Jun 20 '23 at 02:11
  • @dbc I think you're right, so I've updated it. :) – ProgrammingLlama Jun 20 '23 at 02:14
  • @ProgrammingLlama this was definitely one, your googling skills are truly superior. – username Jun 20 '23 at 02:15
  • @ProgrammingLlama Also, should I now delete this question? – username Jun 20 '23 at 02:25
  • @username - it's not a bad duplicate, since the title is very clear. – dbc Jun 20 '23 at 02:26
  • @username Duplicates such as this can serve as signposts for other people searching for a solution to the same or similar issues, and I think your question should show up in other people's searches in future, so it isn't a bad idea to leave it. – ProgrammingLlama Jun 20 '23 at 02:27

1 Answer

You need to use StringComparison.Ordinal:

if (inputText.StartsWith(byteOrderMark, StringComparison.Ordinal))
    inputText = inputText.Remove(0, byteOrderMark.Length);

As explained in the docs for String.StartsWith(String):

This method performs a word (case-sensitive and culture-sensitive) comparison using the current culture.

Since the BOM (U+FEFF, a zero-width character) is non-printing, the culture-sensitive comparison apparently ignores it in your culture, so your call is equivalent to inputText.StartsWith(string.Empty), which is always true.
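To see the two comparison modes side by side, here is a minimal sketch. The class and method names (BomComparisonDemo, Utf8Bom, StartsWithBomOrdinal, StartsWithBomCultureSensitive) are made up for illustration. The culture-sensitive result depends on the runtime's globalization settings, so it is shown as reported in the question rather than as a guaranteed value.

```csharp
using System;
using System.Text;

internal static class BomComparisonDemo
{
    // Decodes the UTF-8 preamble (EF BB BF) into the one-character string "\uFEFF".
    public static string Utf8Bom() =>
        Encoding.UTF8.GetString(Encoding.UTF8.GetPreamble());

    // Ordinal comparison looks at the raw UTF-16 code units, so no character is ignored.
    public static bool StartsWithBomOrdinal(string text) =>
        text.StartsWith(Utf8Bom(), StringComparison.Ordinal);

    // Same semantics as the parameterless-comparison overload used in the question,
    // but with the intent spelled out. The result may vary with globalization settings.
    public static bool StartsWithBomCultureSensitive(string text) =>
        text.StartsWith(Utf8Bom(), StringComparison.CurrentCulture);

    private static void Main()
    {
        Console.WriteLine(Utf8Bom().Length);                        // 1
        Console.WriteLine(StartsWithBomOrdinal("hello"));           // False
        Console.WriteLine(StartsWithBomOrdinal("\uFEFFhello"));     // True
        Console.WriteLine(StartsWithBomCultureSensitive("hello"));  // reported as True in the question
    }
}
```

The ordinal results do not depend on culture, which is why the ordinal overload is the safer choice for detecting a marker character such as the BOM.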

The docs further note:

As explained in Best Practices for Using Strings, we recommend that you avoid calling string comparison methods that substitute default values and instead call methods that require parameters to be explicitly specified. To determine whether a string begins with a particular substring by using the string comparison rules of the current culture, signal your intention explicitly by calling the StartsWith(String, StringComparison) method overload with a value of CurrentCulture for its comparisonType parameter.

Seems like you may have gotten bitten by this.
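If you need this in more than one place, it could be wrapped in a small helper. This is only a sketch: BomHelper and StripUtf8Bom are hypothetical names, and it assumes the BOM has already been decoded into the string as the single character U+FEFF.

```csharp
using System;

internal static class BomHelper
{
    // The UTF-8 byte order mark, once decoded, is the single character U+FEFF.
    private const string Bom = "\uFEFF";

    // Returns the text without a leading BOM; text without a BOM is returned unchanged.
    public static string StripUtf8Bom(string text)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));

        // Ordinal comparison: match the exact code unit, ignoring culture rules.
        return text.StartsWith(Bom, StringComparison.Ordinal)
            ? text.Substring(Bom.Length)
            : text;
    }
}
```

With this, BomHelper.StripUtf8Bom("hello") leaves the string as "hello", while BomHelper.StripUtf8Bom("\uFEFFhello") drops only the leading mark.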

dbc