string s = "abcabcabcabcabc";

var foundIndexes = new List<int>();

This question came out of the discussion here. I was simply wondering:

How can this:

for (int i = s.IndexOf('a'); i > -1; i = s.IndexOf('a', i + 1))
    foundIndexes.Add(i);

be better than this:

for (int i = 0; i < s.Length; i++)
    if (s[i] == 'a') foundIndexes.Add(i);

EDIT: Where exactly does the performance gain come from?

loxxy
  • Well, how about profiling it using `Stopwatch`? – Leri Oct 16 '12 at 14:41
  • Isn't that the whole point of the second parameter of `IndexOf`: to tell the search to skip the initial characters? – paul Oct 16 '12 at 14:42
  • Theoretically, the code from the second variant might do range checking for each index, whereas the first variant can omit it. In practice, the C# compiler is able to detect that `i` is always "in range" and skip the checks. – Vlad Oct 16 '12 at 14:42
  • The `IndexOf` function would only iterate over each element one time since each subsequent `IndexOf` call begins the search at the index after the last found element. – Dylan Meador Oct 16 '12 at 14:42
  • Well I misphrased earlier... What I meant to ask was where the performance gain was coming from? – loxxy Oct 16 '12 at 14:45
  • @loxxy: did you try the code with full optimization? – Vlad Oct 16 '12 at 14:50
  • You need to provide the code you used to benchmark this; odds are your benchmarking method was biased towards one result in some way. – Servy Oct 16 '12 at 15:13

2 Answers


I did not observe that using IndexOf was any faster than direct looping. Honestly, I don't see how it could be, because each character needs to be checked in both cases. My initial results were:

Found by loop, 974 ms 
Found by IndexOf 1144 ms

Edit: After running it several more times, I noticed that you must run a Release build (i.e. with optimizations enabled) to see my result above. Without optimizations, the for loop is indeed slower.

The benchmark code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

namespace Test
{
    public class Program
    {
        public static void Main(string[] args)
        {
            const string target = "abbbdbsdbsbbdbsabdbsabaababababafhdfhffadfd";

            // Jit methods
            TimeMethod(FoundIndexesLoop, target, 1);
            TimeMethod(FoundIndexesIndexOf, target, 1);            

            Console.WriteLine("Found by loop, {0} ms", TimeMethod(FoundIndexesLoop, target, 2000000));
            Console.WriteLine("Found by IndexOf {0} ms", TimeMethod(FoundIndexesIndexOf, target, 2000000));           
        }

        private static long TimeMethod(Func<string, List<int>> method, string input, int reps)
        {
            var stopwatch = Stopwatch.StartNew();
            List<int> result = null;
            for(int i = 0; i < reps; i++)
            {
                result = method(input);
            }
            stopwatch.Stop();
            TextWriter.Null.Write(result);
            return stopwatch.ElapsedMilliseconds;
        }

        private static List<int> FoundIndexesIndexOf(string s)
        {
            List<int> indexes = new List<int>();

            for (int i = s.IndexOf('a'); i > -1; i = s.IndexOf('a', i + 1))
            {
                // The loop ends when i == -1 (no further 'a' found)
                indexes.Add(i);
            }

            return indexes;
        }

        private static List<int> FoundIndexesLoop(string s)
        {
            var indexes = new List<int>();
            for (int i = 0; i < s.Length; i++)
            {
                if (s[i] == 'a')
                    indexes.Add(i);
            }

            return indexes;
        }
    }
}
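
Timings only mean something if both methods do the same work. Here is a minimal sketch of a sanity check, assuming the two loops from the question. The class and method names (`EquivalenceCheck`, `ByIndexOf`, `ByLoop`, `SameResult`) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the benchmark above.
public static class EquivalenceCheck
{
    // Collects every index of 'a' by repeatedly calling IndexOf.
    public static List<int> ByIndexOf(string s)
    {
        var indexes = new List<int>();
        for (int i = s.IndexOf('a'); i > -1; i = s.IndexOf('a', i + 1))
        {
            indexes.Add(i);
        }
        return indexes;
    }

    // Collects every index of 'a' by checking each character directly.
    public static List<int> ByLoop(string s)
    {
        var indexes = new List<int>();
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] == 'a')
            {
                indexes.Add(i);
            }
        }
        return indexes;
    }

    // True when both variants report the same indexes in the same order.
    public static bool SameResult(string s)
    {
        return ByIndexOf(s).SequenceEqual(ByLoop(s));
    }
}
```

Calling `EquivalenceCheck.SameResult(target)` once before the timed runs would catch an off-by-one in either variant without affecting the measurements.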
Mike Zboray

IndexOf(char value, int startIndex) is marked with the following attribute: [TargetedPatchingOptOut("Performance critical to inline across NGen image boundaries")]. As the message says, this lets calls to the method be inlined even across NGen image boundaries.

The implementation of this method is also most likely optimized in other ways, probably with unsafe code or with more "native" techniques, for example the native FindNLSString Win32 function.
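
To illustrate the kind of low-level optimization meant here, a hand-unrolled search is one possibility. This is only a sketch: `IndexOfUnrolled` is a made-up helper, it uses the safe indexer rather than pointers, and it is not claimed to be what the framework actually does.

```csharp
using System;

// Hypothetical illustration of loop unrolling, not the framework's code.
public static class UnrolledSearch
{
    // Returns the index of the first occurrence of value at or after
    // startIndex, or -1 if there is none.
    public static int IndexOfUnrolled(string s, char value, int startIndex)
    {
        int i = startIndex;
        int lastBlockStart = s.Length - 4;

        // Check four characters per iteration to reduce loop overhead.
        for (; i <= lastBlockStart; i += 4)
        {
            if (s[i] == value) return i;
            if (s[i + 1] == value) return i + 1;
            if (s[i + 2] == value) return i + 2;
            if (s[i + 3] == value) return i + 3;
        }

        // Handle the remaining zero to three characters.
        for (; i < s.Length; i++)
        {
            if (s[i] == value) return i;
        }

        return -1;
    }
}
```

Whether something like this beats the plain loop depends on the runtime and the JIT, so it would need to be measured in a Release build like the other variants.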

Fernando Espinosa