6

I have an extension method to remove certain characters from a string (a phone number) which is performing much slower than I think it should vs chained Replace calls. The weird bit, is that in a loop it overtakes the Replace thing if the loop runs for around 3000 iterations, and after that it's faster. Lower than that and chaining Replace is faster. It's like there's a fixed overhead to my code which Replace doesn't have. What could this be!?

Quick look. When only testing 10 numbers, mine takes about 0.3ms, while Replace takes only 0.01ms. A massive difference! But when running 5 million, mine takes around 1700ms while Replace takes about 2500ms.

Phone numbers will only have 0-9, +, -, (, )

Here's the relevant code: Building test cases, I'm playing with testNums.

        int testNums = 5_000_000;
        Console.WriteLine("Building " + testNums + " tests");
        Random rand = new Random();
        string[] tests = new string[testNums];
        char[] letters =
        {
            '0','1','2','3','4','5','6','7','8','9',
            '+','-','(',')'
        };
        for(int t = 0; t < tests.Length; t++)
        {
            int length = rand.Next(5, 20);
            char[] word = new char[length];
            for(int c = 0; c < word.Length; c++)
            {
                word[c] = letters[rand.Next(letters.Length)];
            }
            tests[t] = new string(word);
        }

        Console.WriteLine("Tests built");
        string[] stripped = new string[tests.Length];

Using my extension method:

        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < stripped.Length; i++)
        {
            stripped[i] = tests[i].CleanNumberString();
        }
        stopwatch.Stop();
        Console.WriteLine("Clean: " + stopwatch.Elapsed.TotalMilliseconds + "ms");

Using chained Replace:

        stripped = new string[tests.Length];
        stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < stripped.Length; i++)
        {
            stripped[i] = tests[i].Replace(" ", string.Empty)
                        .Replace("-", string.Empty)
                        .Replace("(", string.Empty)
                        .Replace(")", string.Empty)
                        .Replace("+", string.Empty);
        }
        stopwatch.Stop();
        Console.WriteLine("Replace: " + stopwatch.Elapsed.TotalMilliseconds + "ms");

Extension method in question:

    public static string CleanNumberString(this string s)
    {
        Span<char> letters = stackalloc char[s.Length];
        int count = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] >= '0' && s[i] <= '9')
                letters[count++] = s[i];
        }
        return new string(letters.Slice(0, count));
    }

What I've tried:

  • I've run them around the other way. Makes a tiny difference, but not enough.
  • Make it a normal static method, which was significantly slower than extension. As a ref parameter was slightly slower, and in parameter was about the same as extension method.
  • Aggressive Inlining. Doesn't make any real difference. I'm in release mode, so I suspect the compiler inlines it anyway. Either way, not much change.

I have also looked at memory allocations, and that's as I expect. My one allocates on the managed heap only one string per iteration (the new string at the end) which Replace allocates a new object for each Replace. So the memory used by the Replace one is much, higher. But it's still faster!

Is it calling native C code and doing something crafty there? Is the higher memory usage triggering the GC and slowing it down (still doesn't explane the insanely fast time on only one or two iterations)

Any ideas?

(Yes, I know not to bother optimising things like this, it's just bugging me because I don't know why it's doing this)

craig
  • 421
  • 4
  • 13
  • Just to add to the weirdness. I found that Release config (with code optimisations) is SLOWER than Debug by about 3.5 times for low numbers of iterations, but faster at high numbers. – craig Feb 25 '19 at 22:01
  • Note that working on ten numbers is vastly too small to make meaningful timings, unless you repeat in a large loop. – NetMage Feb 25 '19 at 22:03
  • Clearly has to do with `Span letters = stackalloc char[s.Length];` as reproducing with a simple `char[]` gives faster times for `Replace` on all my tests. – Mat Feb 25 '19 at 22:11
  • @NetMage yeah, that's true. But it seems consistent when being run multiple times with very low numbers. And given that most times, methods like this will not be called in a loop, it's important to check. The measurements may not be accurate, but if they're an order of magnitude slower, it's probably a rough guide that it's slower, but you're right that it's very hard to know exactly how much slower. – craig Feb 25 '19 at 22:12
  • @Mat what version of .net are you using? Span can be slower on .Net Framework I think, but on Core it should be faster. I just tried both, and averaged them as below and found very little difference between Span and [], with Span being fractionally faster most times. – craig Feb 25 '19 at 22:27
  • Running in LINQPad with .Net 4.7.2, setup to always do 5,000,000 loops by diving by number of test strings, I get your Clean is about 5X faster at 10 strings, 6.2X faster at 100 strings, 4.4X faster at 1000 strings, and still requires multiple runs to stabilize (caching?). WIth optimization off, Clean is about 1.7X faster at 1000 strings. – NetMage Feb 25 '19 at 22:29
  • Scratch that, I also managed to reproduce without Core; on a debug launch I also get that your `clean` method (with char[]) starts being faster at ~3k too. You might want to check out [this source code](https://github.com/fixdpt/shared-source-cli-2.0/blob/master/clr/src/vm/comstring.cpp#L1572) to see how `Replace` works in depth though. – Mat Feb 25 '19 at 22:35
  • Running it for tiny (10) strings and number of tests (1) but putting a call to `CleanNumberString` outside the timing loop stabilizes the result. It looks to me like the JITter is compiling the method on the first call. – NetMage Feb 25 '19 at 23:29
  • Interestingly, replacing `stackalloc` with `new` seems fractionally faster. – NetMage Feb 25 '19 at 23:41
  • Hmm. on regular .Net, using `char[]` instead of `Span` is much faster. – NetMage Feb 25 '19 at 23:48

3 Answers3

2

I ran the clean method a couple more. interestingly, it is a lot faster than the Replace. Only the first time run was slower. Sorry that I couldn't explain why it's slower the first time but I ran more of the method then the result was expected.

Building 100 tests Tests built Replace: 0.0528ms Clean: 0.4526ms Clean: 0.0413ms Clean: 0.0294ms Replace: 0.0679ms Replace: 0.0523ms

used dotnet core 2.1

daehee kim
  • 21
  • 3
  • Yes, I was just doing this and found the same. Running multiple tests alternating, and ignoring the first of each in the average shows that replace is much slower. But that first loop on clean is still very slow. I wonder what it's doing! – craig Feb 25 '19 at 22:25
2

After doing some benchmarks, I think can safely assert that your initial statement is wrong for the exact reason you mentionned in your deleted answer: the loading time of the method is the only thing that misguided you.

Here's the full benchmark on a simplified version of the problem:

static void Main(string[] args)
{
    // Build string of n consecutive "ab"
    int n = 1000;
    Console.WriteLine("N: " + n);
    char[] c = new char[n];

    for (int i = 0; i < n; i+=2)
        c[i] = 'a';
    for (int i = 1; i < n; i += 2)
        c[i] = 'b';

    string s = new string(c);

    Stopwatch stopwatch;

    // Make sure everything is loaded
    s.CleanNumberString();
    s.Replace("a", "");
    s.UnsafeRemove();

    // Tests to remove all 'a' from the string

    // Unsafe remove
    stopwatch = Stopwatch.StartNew();

    string a1 = s.UnsafeRemove();

    stopwatch.Stop();
    Console.WriteLine("Unsafe remove:\t" + stopwatch.Elapsed.TotalMilliseconds + "ms");

    // Extension method
    stopwatch = Stopwatch.StartNew();

    string a2 = s.CleanNumberString();

    stopwatch.Stop();
    Console.WriteLine("Clean method:\t" + stopwatch.Elapsed.TotalMilliseconds + "ms");

    // String replace
    stopwatch = Stopwatch.StartNew();

    string a3 = s.Replace("a", "");

    stopwatch.Stop();
    Console.WriteLine("String.Replace:\t" + stopwatch.Elapsed.TotalMilliseconds + "ms");

    // Make sure the returned strings are identical
    Console.WriteLine(a1.Equals(a2) && a2.Equals(a3));

    Console.ReadKey();

}

public static string CleanNumberString(this string s)
{
    char[] letters = new char[s.Length];
    int count = 0;
    for (int i = 0; i < s.Length; i++)
        if (s[i] == 'b')
            letters[count++] = 'b';
    return new string(letters.SubArray(0, count));
}

public static T[] SubArray<T>(this T[] data, int index, int length)
{
    T[] result = new T[length];
    Array.Copy(data, index, result, 0, length);
    return result;
}

// Taken from https://stackoverflow.com/a/2183442/6923568
public static unsafe string UnsafeRemove(this string s)
{
    int len = s.Length;
    char* newChars = stackalloc char[len];
    char* currentChar = newChars;

    for (int i = 0; i < len; ++i)
    {
        char c = s[i];
        switch (c)
        {
            case 'a':
                continue;
            default:
                *currentChar++ = c;
                break;
        }
    }
    return new string(newChars, 0, (int)(currentChar - newChars));
}

When ran with different values of n, it is clear that your extension method (or at least my somewhat equivalent version of it) has a logic that makes it faster than String.Replace(). In fact, it is more performant on either small or big strings:

N: 100
Unsafe remove: 0,0024ms
Clean method: 0,0015ms
String.Replace: 0,0021ms
True

N: 100000
Unsafe remove: 0,3889ms
Clean method: 0,5308ms
String.Replace: 1,3993ms
True

I highly suspect optimizations for the replacement of strings (not to be compared to removal) in String.Replace() to be the culprit here. I also added a method from this answer to have another comparison on removal of characters. That one's times behave similarly to your method but gets faster on higher values (80k+ on my tests) of n.

With all that being said, since your question is based on an assumption that we found was false, if you need more explanation on why the opposite is true (i.e. "Why is String.Replace() slower than my method"), plenty of in-depth benchmarks about string manipulation already do so.

Mat
  • 1,440
  • 1
  • 18
  • 38
  • 1
    Thanks heaps. This is true. Interesting use of unsafe here. I think Span is nicer and doesn't require unsafe, but cool all the same. I deleted my answer because I found it was wrong. Calling a method only JIT'ed the called method. I found it's nicer to use RuntimeHelpers.PrepareMethod to force the JIT. And then my benchmarks show the same as yours – craig Feb 26 '19 at 00:56
  • Yeah I haven't fixed the benchmark to allow values of `n` bigger than 100k but really it just boils down to preference if that little of time gain isn't necessary. I'd keep using yours personally haha. Btw, feel free to add and accept your own answer again if you believe your explanation would be more thorough. Otherwise, I think I've adressed most of the points here. – Mat Feb 26 '19 at 01:00
1

So I've found with help from daehee Kim and Mat below that it's only the first iteration, but it's for the whole first loop. Every loop after there is ok.

I use the following line to force the JIT to do its thing and initialise this method: RuntimeHelpers.PrepareMethod(typeof(CleanExtension).GetMethod("CleanNumberString", BindingFlags.Public | BindingFlags.Static).MethodHandle);

I find the JIT usually takes about 2-3ms to do its thing here (including Reflection time of about 0.1ms). Note that you should probably not be doing this because you're now getting the Reflection cost as well, and the JIT will be called right after this anyway, but it's probably a good idea for benchmarks to fairly compare.

The more you know!

My benchmark for a loop of 5000 iterations, repeated 5000 times with random strings and averaged is:

Clean: 0.41078ms

Replace: 1.4974ms

craig
  • 421
  • 4
  • 13