36

What is the fastest built-in comparison method for strings in C#? I don't care about the typographical or semantic meaning of the ordering: the aim is to use the comparer in sorted lists so that large collections can be searched quickly. I think there are only two methods: Compare and CompareOrdinal. Which is faster?

Additionally, is there a faster method for such string comparisons?

Willem Van Onsem

8 Answers

60

I'm assuming you want a less-than/equal/greater-than comparison rather than just equality; equality is a slightly different topic, although the principles are basically the same. If you're actually only searching for presence in something like a SortedList, I'd consider using a Dictionary<string, XXX> instead - do you really need all that sorting?

The fastest options are String.CompareOrdinal, or an overload of String.Compare that lets you specify the comparison type, passing an ordinal (case-sensitive) comparison, e.g. String.Compare(x, y, StringComparison.Ordinal).
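For the sorted-collection scenario in the question, here is a minimal sketch of plugging an ordinal comparer into a SortedList and a binary search. The variable names and sample data are made up for illustration; StringComparer.Ordinal is the framework's ready-made ordinal comparer.

```csharp
using System;
using System.Collections.Generic;

// A SortedList ordered by ordinal comparison instead of the default
// culture-sensitive string comparer.
var sortedByOrdinal = new SortedList<string, int>(StringComparer.Ordinal)
{
    ["banana"] = 2,
    ["Cherry"] = 3,
    ["apple"] = 1,
};

// Ordinal order places uppercase ASCII letters before lowercase ones.
Console.WriteLine(string.Join(", ", sortedByOrdinal.Keys)); // Cherry, apple, banana

// A binary search must use the same comparer that was used for sorting.
var words = new List<string> { "pear", "Fig", "kiwi" };
words.Sort(StringComparer.Ordinal);
int index = words.BinarySearch("kiwi", StringComparer.Ordinal);
Console.WriteLine(index); // 1
```

The main thing to keep consistent is the comparer: sorting with one comparison and searching with another can make the binary search miss entries that are present.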

Basically an ordinal comparison just needs to walk the two strings, character by character, until it finds a difference. If it doesn't find any differences, and the lengths are the same, the result is 0. If it doesn't find any differences but the lengths aren't the same, the longer string is deemed "larger". If it does find a difference, it can immediately work out which is deemed "larger" based on which character is "larger" in ordinal terms.

To put it another way: it's like doing the obvious comparison between two char[] values.
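As a rough illustration of that walk, here is a simplified sketch. OrdinalCompareSketch is a hypothetical helper written for this answer, not the framework's implementation, and it assumes both arguments are non-null.

```csharp
using System;

// Hypothetical helper: a simplified model of an ordinal comparison.
// Assumes neither argument is null.
static int OrdinalCompareSketch(string a, string b)
{
    int shared = Math.Min(a.Length, b.Length);
    for (int i = 0; i < shared; i++)
    {
        if (a[i] != b[i])
        {
            // The first differing UTF-16 code unit decides the result.
            return a[i] < b[i] ? -1 : 1;
        }
    }

    // No difference in the shared prefix: the longer string is "larger".
    return a.Length.CompareTo(b.Length);
}

Console.WriteLine(OrdinalCompareSketch("abc", "abd")); // -1
Console.WriteLine(OrdinalCompareSketch("abc", "ab"));  // 1
Console.WriteLine(OrdinalCompareSketch("abc", "abc")); // 0
```

The sign of the result should agree with String.CompareOrdinal for non-null inputs; the magnitude may differ, since only the sign is meaningful.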

Culture-sensitive comparisons have to perform all kinds of tortuous feats, depending on the precise culture you use. For an example of this, see this question. It's pretty clear that having more complex rules to follow can make this slower.
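A small example of the two kinds of comparison disagreeing, assuming default globalization settings (with invariant globalization mode enabled, culture comparisons may fall back to ordinal behaviour):

```csharp
using System;

// Ordinal: compares UTF-16 code units, so 'a' (U+0061) is greater than 'B' (U+0042).
int ordinal = string.CompareOrdinal("a", "B");

// Culture-sensitive (invariant culture here): "a" sorts before "B" alphabetically.
int invariant = string.Compare("a", "B", StringComparison.InvariantCulture);

Console.WriteLine(ordinal > 0);   // True
Console.WriteLine(invariant < 0); // True, assuming default globalization settings
```

So switching a sorted collection from a culture-sensitive comparer to an ordinal one changes the order, not just the speed. That is fine for lookups, but matters if the order is ever shown to users.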

Jon Skeet
  • Is there a difference between `String.Compare(x, y, StringComparison.Ordinal)` and `String.CompareOrdinal(x, y)`? Or does one of those just call the other? – Nick Jul 17 '13 at 14:37
  • @Nick: I'd expect one to just call the other, but I don't know which way round. – Jon Skeet Jul 17 '13 at 14:38
11

I just noticed a 50% performance increase in my own code by comparing string lengths first and, only if they are equal, calling string.Compare. So in a loop I have:

VB:

If strA.Length = strB.Length Then
   If String.Compare(strA, strB, True) = 0 Then
      ' they are equal
   End If
End If

C#:

if (strA.Length == strB.Length)
{
   if (string.Compare(strA, strB, true) == 0)
   {
       // they are equal
   }
}

This could be dependent on your own strings, but it seems to have worked well for me.
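A null-safe variant of the same idea might look like the sketch below. EqualsIgnoreCaseFast is a hypothetical name. It deliberately swaps in an ordinal, case-insensitive comparison instead of the culture-sensitive `string.Compare(strA, strB, true)` above, because culture-sensitive rules can treat strings of different lengths as equal (for example when one contains ignorable characters), which would make the length shortcut give a different answer.

```csharp
using System;

// Hypothetical helper: length check first, then an ordinal case-insensitive
// comparison. Not equivalent to the culture-sensitive call in the answer above.
static bool EqualsIgnoreCaseFast(string a, string b)
{
    if (object.ReferenceEquals(a, b)) return true;   // same instance, or both null
    if (a is null || b is null) return false;        // exactly one is null
    if (a.Length != b.Length) return false;          // cheap rejection
    return string.Equals(a, b, StringComparison.OrdinalIgnoreCase);
}

Console.WriteLine(EqualsIgnoreCaseFast("Hello", "hELLO")); // True
Console.WriteLine(EqualsIgnoreCaseFast(null, "hello"));    // False
Console.WriteLine(EqualsIgnoreCaseFast("ab", "abc"));      // False
```

Whether the explicit length check still buys anything is worth measuring, since the framework's ordinal equality may already perform a similar check internally.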

Max
  • Since C# uses short-circuit boolean logic, you don't need nested `if` statements -- `bool isEqual = strA.Length == strB.Length && string.Compare(strA, strB, true) == 0;`. VB.NET also supports short-circuit boolean logic using the `AndAlso` and `OrElse` operators -- `Dim isEquals = strA.Length = strB.Length AndAlso String.Compare(strA, strB, true) = 0`. – Zev Spitz Nov 09 '17 at 05:57
  • The problem with this version is that the strings can be null, in which case the code above would throw an exception because of the strA.Length access. – Aaginor Jun 12 '18 at 11:23
4

I designed a unit test to measure string comparison speed using some of the methods mentioned in this post. This test was run using .NET 4.

In short, there isn't much difference, and I had to go to 100,000,000 iterations to see a significant one. Since it seems the characters are compared in turn until a difference is found, how similar the strings are inevitably plays a part.

These results actually seem to suggest that using str1.Equals(str2) is the fastest way to compare strings.

These are the results of the test, with the test class included:

######## SET 1 compared strings are the same: 0
#### Basic == compare: 413
#### Equals compare: 355
#### Equals(compare2, StringComparison.Ordinal) compare: 387
#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: 426
#### String.CompareOrdinal(compare1, compare2) compare: 412

######## SET 2 compared strings are NOT the same: 0
#### Basic == compare: 710
#### Equals compare: 733
#### Equals(compare2, StringComparison.Ordinal) compare: 840
#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: 987
#### String.CompareOrdinal(compare1, compare2) compare: 776

using System;
using System.Diagnostics;
using NUnit.Framework;

namespace Fwr.UnitTests
{
    [TestFixture]
    public class StringTests
    {
        [Test]
        public void Test_fast_string_compare()
        {
            int iterations = 100000000;
            bool result = false;
            var stopWatch = new Stopwatch();

            Debug.WriteLine("######## SET 1 compared strings are the same: " + stopWatch.ElapsedMilliseconds);

            string compare1 = "xxxxxxxxxxxxxxxxxx";
            string compare2 = "xxxxxxxxxxxxxxxxxx";

            // Test 1

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1 == compare2;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Basic == compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 2

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 3

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2, StringComparison.Ordinal);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals(compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 4

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.Compare(compare1, compare2, StringComparison.Ordinal) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 5

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.CompareOrdinal(compare1, compare2) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.CompareOrdinal(compare1, compare2) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            Debug.WriteLine("######## SET 2 compared strings are NOT the same: " + stopWatch.ElapsedMilliseconds);

            compare1 = "ueoqwwnsdlkskjsowy";
            compare2 = "sakjdjsjahsdhsjdak";

            // Test 1

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1 == compare2;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Basic == compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 2

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 3

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = compare1.Equals(compare2, StringComparison.Ordinal);
            }

            stopWatch.Stop();

            Debug.WriteLine("#### Equals(compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 4

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.Compare(compare1, compare2, StringComparison.Ordinal) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.Compare(compare1, compare2, StringComparison.Ordinal) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();

            // Test 5

            stopWatch.Start();

            for (int i = 0; i < iterations; i++)
            {
                result = String.CompareOrdinal(compare1, compare2) != 0;
            }

            stopWatch.Stop();

            Debug.WriteLine("#### String.CompareOrdinal(compare1, compare2) compare: " + stopWatch.ElapsedMilliseconds);

            stopWatch.Reset();
        }
    }
}
gbro3n
  • This is interesting, as Ordinal comparison is used by default. Looking at the framework source, the extra time seems to be attributed to argument validation. So, for performance's sake, you shouldn't specify StringComparison.Ordinal explicitly, just use the default overload. – arni May 28 '16 at 08:51
4

Fastest is interned strings with reference equality test, but you only get equality testing and it's at the heavy expense of memory - so expensive that it's almost never the recommended course.

Past that, a case-sensitive ordinal test will be the fastest, and this method is absolutely recommended for non-culture-specific strings. Case-sensitive is faster if it works for your use case.

When you specify either StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, the string comparison will be non-linguistic. That is, the features that are specific to the natural language are ignored when making comparison decisions. This means the decisions are based on simple byte comparisons and ignore casing or equivalence tables that are parameterized by culture. As a result, by explicitly setting the parameter to either the StringComparison.Ordinal or StringComparison.OrdinalIgnoreCase, your code often gains speed, increases correctness, and becomes more reliable.

Source
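To illustrate the interning point at the start of this answer, here is a minimal sketch. The strings are built at run time so that they start out as separate objects.

```csharp
using System;

// Two distinct string objects with identical contents.
string first = new string('x', 18);
string second = new string('x', 18);
Console.WriteLine(object.ReferenceEquals(first, second)); // False

// Interning maps equal contents to one shared instance, so a reference
// check is enough to test equality afterwards.
string internedFirst = string.Intern(first);
string internedSecond = string.Intern(second);
Console.WriteLine(object.ReferenceEquals(internedFirst, internedSecond)); // True
```

The catch mentioned above still applies: interned strings stay in the intern pool, so this trades memory for speed and only answers "equal or not", never "less than or greater than".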

Sam Harwell
  • An example of a linguistic comparison would be encyclopædia == encyclopaedia, which are equal in some cultures. Source: https://msdn.microsoft.com/en-us/library/system.stringcomparison.aspx – arni May 28 '16 at 08:30
3

This is quite an old question, but since I found it others might as well.

In researching this topic a bit further, I came upon an interesting blog post that compares all methods for string comparison. It is probably not highly scientific, but it still gives a good ballpark figure.

Thanks to this article I started using string.CompareOrdinal in a scenario where I had to find out whether one string was in a list of 170,000 other strings, and do this 1,600 times in a row. string.CompareOrdinal made it almost 50% faster compared to string.Equals.

buddybubble
  • Thanks for sharing that link. Even though probably not practical, I enjoyed the "thinking outside the box" solutions they tried such as loading strings as Hashset Keys and then testing to see if the key already exists. `if (hs.Contains(stringsWeWantToSeeIfMatches)) {}` –  Apr 01 '18 at 06:39
0

I checked both string.Compare and string.CompareOrdinal using a Stopwatch:

    // Case 1: CompareOrdinal
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int x = string.CompareOrdinal("Jaswant Agarwal", "Jaswant Agarwal");
    sw.Stop();
    lblTimeGap.Text = sw.Elapsed.ToString();

    // Case 2: Compare (run from a separate button handler)
    Stopwatch sw = new Stopwatch();
    sw.Start();
    int x = string.Compare("Jaswant Agarwal", "Jaswant Agarwal");
    sw.Stop();
    lblTimeGap.Text = sw.Elapsed.ToString();

In case 1, the average elapsed time was 00:00:00.0000030. In case 2, the average elapsed time was 00:00:00.0000086.

I tried different combinations of equal and unequal strings and found that CompareOrdinal was faster than Compare every time.

That is my own observation. You can try it yourself: put two buttons on a form and paste this code into their respective event handlers.

Jaswant Agarwal
  • This isn't a valid comparison, because your sample size is too small, and you haven't ignored the time it takes to JIT the code. A better comparison would be to 1) build as release, 2) do the comparison 10,000 times, 3) compare the average of the two methods. This way the noise associated with JIT effects and background stuff your computer is doing will be minimized. – rianjs Dec 31 '14 at 00:05
  • Totally agree with @rianjs, how can you do it once and say one is faster than the other when you are dealing with millionths of a second difference?! – bytedev Nov 03 '16 at 15:44
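A rough sketch of the measurement approach suggested in these comments: warm up first, repeat many times, then average. AverageNanoseconds is a hypothetical helper, and the iteration counts are arbitrary. The inputs are built at run time because identical string literals typically share one instance, which can let a comparison return after a reference check.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: average time per call, in nanoseconds, after a warm-up pass.
static double AverageNanoseconds(Func<string, string, int> compare, string x, string y, int iterations)
{
    int sink = 0;

    // Warm-up so that JIT compilation is not part of the measurement.
    for (int i = 0; i < 1000; i++) sink += compare(x, y);

    var stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; i++) sink += compare(x, y);
    stopwatch.Stop();

    GC.KeepAlive(sink); // use the accumulated result so the calls are not dead code
    return stopwatch.Elapsed.TotalMilliseconds * 1_000_000.0 / iterations;
}

// Equal-length strings that differ only in the last character.
string left = new string('x', 17) + "a";
string right = new string('x', 17) + "b";

Console.WriteLine("CompareOrdinal: " + AverageNanoseconds(string.CompareOrdinal, left, right, 1_000_000) + " ns");
Console.WriteLine("Compare:        " + AverageNanoseconds(string.Compare, left, right, 1_000_000) + " ns");
```

Run it from a release build. The delegate call adds the same overhead to both measurements, so the numbers are only useful relative to each other; a dedicated benchmarking library such as BenchmarkDotNet would give more trustworthy figures.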
0

This might be useful to someone: changing one line of my code brought the unit test time for my method down from 140ms to 1ms!

Original

Unit test: 140ms

public bool StringsMatch(string string1, string string2)
{
    if (string1 == null && string2 == null) return true;
    return string1.Equals(string2, StringComparison.Ordinal);
}

New

Unit test: 1ms

public bool StringsMatch(string string1, string string2)
{
    if (string1 == null && string2 == null) return true;
    return string.CompareOrdinal(string1, string2) == 0;
}

Unit Test (NUnit)

[Test]
public void StringsMatch_OnlyString1NullOrEmpty_ReturnFalse()
{
    Authentication auth = new Authentication();
    Assert.IsFalse(auth.StringsMatch(null, "foo"));
    Assert.IsFalse(auth.StringsMatch("", "foo"));
}

Interestingly StringsMatch_OnlyString1NullOrEmpty_ReturnFalse() was the only unit test that took 140ms for the StringsMatch method. StringsMatch_AllParamsNullOrEmpty_ReturnTrue() was always 1ms and StringsMatch_OnlyString2NullOrEmpty_ReturnFalse() always <1ms.

tekiegirl
  • Running a test once is not a proper way to measure performance. The stated 140ms was probably the added time to handle the thrown NullReferenceException. – arni May 28 '16 at 08:07
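If the goal is simply a null-safe ordinal equality check, one possible sketch uses the static String.Equals overload, which accepts null for either argument, so no instance method is ever called on a null reference:

```csharp
using System;

// Sketch of a null-safe version of the method from the answer above.
static bool StringsMatch(string string1, string string2)
{
    // The static overload handles nulls: two nulls are equal, one null is not.
    return string.Equals(string1, string2, StringComparison.Ordinal);
}

Console.WriteLine(StringsMatch(null, "foo")); // False
Console.WriteLine(StringsMatch(null, null));  // True
Console.WriteLine(StringsMatch("", "foo"));   // False
```

This keeps the behaviour the unit test expects without relying on an exception being thrown and caught.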
0

I think there are a few ways most C# developers go about comparing strings, with the following being the most common:

  • Compare - as you mentioned
  • CompareOrdinal - as you mentioned
  • ==
  • String.Equals
  • writing a custom algorithm to compare char by char

If you want to go to extremes, you can use other objects/methods that aren't so obvious:

  • SequenceEqual example:

    c1 = str1.ToCharArray(); c2 = str2.ToCharArray(); if (c1.SequenceEqual(c2))

  • IndexOf example: if (stringsWeAreComparingAgainst.IndexOf(stringsWeWantToSeeIfMatches, 0 , stringsWeWantToSeeIfMatches.Length) == 0)

  • Or you can implement Dictionary and HashSets, using the strings as "keys" and testing to see if they exist already with the string you want to compare against. For instance: if (hs.Contains(stringsWeWantToSeeIfMatches))
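A minimal sketch of the HashSet option from the list above, with an explicit ordinal comparer. The set contents and the variable name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Membership test via hashing; the comparer decides what counts as "equal".
var knownStrings = new HashSet<string>(StringComparer.Ordinal) { "alpha", "beta", "gamma" };

Console.WriteLine(knownStrings.Contains("beta")); // True
Console.WriteLine(knownStrings.Contains("Beta")); // False: ordinal comparison is case-sensitive
```

This only answers "is it present?". It gives no ordering, so it suits lookups but not range queries or sorted output.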

So feel free to slice and dice to find your own ways of doing things. Remember, though, that someone is going to have to maintain the code and probably won't want to spend time trying to figure out why you're using whatever method you've decided to use.

As always, optimize at your own risk. :-)