2

Quick question:

Which one is faster?

foreach (Object obj in Collection)
{
     if(obj.Mandatory){ ... }
}

or

foreach (Object obj in Collection.FindAll(o => o.Mandatory))
{
...
}

And if you know a faster approach, I'd be pleased to hear it.

Thank you

Eduardo Mello

6 Answers

16

If your Collection is a List<T> then FindAll is implemented by creating a new List<T> and copying all items that match the predicate. This is obviously slower than just enumerating the collection and deciding for each item if the predicate holds.

If you're using .NET 3.5 you can use LINQ, which will not create a copy and is similar to your first example:

foreach (object obj in someCollection.Where(o => o.Mandatory))
{
    ...
}

Note this isn't necessarily the fastest solution. It's just easy to see that a method that allocates memory and enumerates a collection is slower than a method that only enumerates a collection. If performance is critical: measure it.
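If you do want to measure it, a `System.Diagnostics.Stopwatch` micro-benchmark is a minimal sketch of how; the `Item` type and `Mandatory` flag below are stand-ins for the question's `Object`/`Collection`, and timings vary by machine, so no expected numbers are shown:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class Item { public bool Mandatory; }

class Benchmark
{
    static void Main()
    {
        var collection = new List<Item>();
        for (int i = 0; i < 1000000; i++)
        {
            collection.Add(new Item { Mandatory = i % 2 == 0 });
        }

        // Time the plain foreach-with-if version.
        var sw = Stopwatch.StartNew();
        int count1 = 0;
        foreach (var obj in collection)
        {
            if (obj.Mandatory) count1++;
        }
        Console.WriteLine("foreach + if: {0} ms", sw.ElapsedMilliseconds);

        // Time the FindAll version (allocates a second list first).
        sw = Stopwatch.StartNew();
        int count2 = 0;
        foreach (var obj in collection.FindAll(o => o.Mandatory))
        {
            count2++;
        }
        Console.WriteLine("FindAll:      {0} ms", sw.ElapsedMilliseconds);
    }
}
```

Both loops visit the same 500,000 matching items; only the allocation differs.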

dtb
  • I was about to post exactly this. Great answer. – Meta-Knight Nov 12 '09 at 17:21
  • Thanks. I felt that FindAll was slower, but that should be something other than testing each item. – Eduardo Mello Nov 12 '09 at 17:26
  • Performance-wise, the foreach with an if-clause will be fastest. Using .Where will be slightly slower, and using .FindAll will be *much* slower. – Neil Nov 12 '09 at 17:36
  • Using .Where will iterate all items in the collection where Mandatory == true twice. – Chris Shouts Nov 12 '09 at 17:57
  • @paper: No. Where only creates a thin wrapper around the enumerator returned by the collection that skips all items not matching the predicate. All items are enumerated exactly once. – dtb Nov 12 '09 at 18:02
7

The following test code prints the elapsed system ticks (1 tick = 100 nanoseconds) for iterating through 10 million objects. FindAll is the slowest and the for loop is the fastest, as expected.

But the overhead of the iteration is measured in nanoseconds per item even in the worst case. If you're doing anything significant in the loop (e.g. something which takes a microsecond per item), then the speed difference of the iteration is completely insignificant.

So for the love of Turing don't forbid foreach in your coding guidelines now. It doesn't make any practical difference, and the LINQ statements sure are easier to read.

   using System;
   using System.Collections.Generic;
   using System.Linq;

   public class Test
   {
      public bool Bool { get; set; }
   }

   class Program
   {
      static void Main(string[] args)
      {
         // fill test list
         var list = new List<Test>();
         for (int i = 0; i < 1e7; i++)
         {
            list.Add(new Test() { Bool = (i % 2 == 0) });
         }

         // warm-up
         int counter = 0;
         DateTime start = DateTime.Now;
         for (int i = 0; i < list.Count; i++)
         {
            if (list[i].Bool)
            {
               counter++;
            }
         }

         // List.FindAll
         counter = 0;
         start = DateTime.Now;
         foreach (var test in list.FindAll(x => x.Bool))
         {
            counter++;
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 7969158

         // IEnumerable.Where
         counter = 0;
         start = DateTime.Now;
         foreach (var test in list.Where(x => x.Bool))
         {
            counter++;
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 5156514

         // for loop
         counter = 0;
         start = DateTime.Now;
         for (int i = 0; i < list.Count; i++)
         {
            if (list[i].Bool)
            {
               counter++;
            }
         }
         Console.WriteLine(DateTime.Now.Ticks - start.Ticks); // prints 2968902
      }
   }
Wim Coenen
  • This is misleading. Sometimes `IEnumerable` is terribly slow compared to raw arrays. But I agree, don't avoid `IEnumerable`, it is very elegant. Always profile and when it is slow, and you are using `IEnumerable`, switch to an array or `List`. For example, I just now fixed a `yield` loop iterating a `ListView` `SelectedIndices` array that pegged the CPU and froze the UI for almost 2 seconds before showing the context menu, with several hundred items selected. I changed it to use `.CopyTo` an array followed by an old fashioned for loop with `int` index. It's instant now. – doug65536 Sep 04 '17 at 10:59
6

The first will be somewhat faster.

In the second case, you're using List<T>.FindAll to create a temporary list that matches your criteria. This copies the list, then iterates over it.

However, you could accomplish the same thing, with the same speed as your first option, by doing:

foreach (Object obj in Collection.Where(o => o.Mandatory))
{
}

This is because Enumerable.Where uses streaming to return an IEnumerable<T>, which is generated as you iterate. No copy is made.
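The streaming behavior is easy to picture: conceptually, Where is just an iterator that tests each element as the caller pulls it. A simplified sketch (not the real BCL source):

```csharp
using System;
using System.Collections.Generic;

static class WhereDemo
{
    // Simplified stand-in for Enumerable.Where: yields matching items
    // one at a time; no intermediate list is ever allocated.
    public static IEnumerable<T> WhereSketch<T>(
        this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
                yield return item; // produced lazily, on demand
        }
    }

    static void Main()
    {
        var numbers = new List<int> { 1, -2, 3, -4, 5 };

        foreach (int n in numbers.WhereSketch(x => x > 0))
        {
            Console.Write(n + " "); // prints 1 3 5
        }
    }
}
```

Each element is visited exactly once, and no temporary collection is built.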

Reed Copsey
  • I have to agree with this. Where() is an extension method invoked against the list, so no new collection is created. Additionally, the loop operates using a yield return for each individual value. – Anthony Mason Aug 06 '14 at 20:48
3

The fastest you can get, short of parallelizing the enumeration over multiple threads (taking into account the number of processors, etc.), is:

for (int i = 0; i < Collection.Count; i++)
{
    var item = Collection[i];
    if (item.Mandatory) { ... }
}

I would recommend, though, that you always use LINQ instead of writing for or foreach loops, because in the future it may become intelligent enough to distribute the work over processors and take hardware-specific details into account (see PLINQ), and it could eventually be faster than loops you write yourself: declarative vs. imperative programming.
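The declarative/imperative distinction can be seen side by side; a small sketch counting items above a threshold both ways (the data here is made up for illustration):

```csharp
using System;
using System.Linq;

class DeclarativeVsImperative
{
    static void Main()
    {
        int[] data = { 3, 7, 2, 9, 4 };

        // Imperative: spell out *how* to compute the result.
        int imperativeCount = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] > 3) imperativeCount++;
        }

        // Declarative: state *what* result you want; the runtime
        // (or PLINQ, via data.AsParallel()) decides how to compute it.
        int declarativeCount = data.Count(x => x > 3);

        Console.WriteLine(imperativeCount == declarativeCount); // True
    }
}
```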

Darin Dimitrov
  • I'm not convinced it would be faster. You'd have to profile it to be sure. Anyway the difference will be very small. – Meta-Knight Nov 12 '09 at 17:24
  • Unless something has changed very recently, for is a little faster than foreach, because you are not creating a new enumerator object and calling its methods as you go. The difference is small enough not to matter unless your code is in an extremely critical place. – Neil Nov 12 '09 at 17:39
  • This method is not "the fastest you could ever get" - the assignment to the local variable (var item = Collection[i]) is unneeded - it's quicker to just call "if (Collection[i].Mandatory) { ... }" ... – David_001 Nov 19 '09 at 14:26
  • This is also extremely slow for certain types of collections, such as a linked-list. This is because `Collection.Count` is called on every iteration of the loop. – Ray Burns Feb 01 '10 at 20:31
1

There is nothing magic about FindAll: it simply builds and returns a new list. For example:

    List<string> myStrings = new List<string>();
    foreach (string str in myStrings.FindAll(o => o.Length > 0))
    {

    }

Compiles to:

List<string> list = new List<string>();
if (CS$<>9__CachedAnonymousMethodDelegate1 == null)
{
    CS$<>9__CachedAnonymousMethodDelegate1 = new Predicate<string>(MyClass.<RunSnippet>b__0);
}
using (List<string>.Enumerator enumerator = list.FindAll(CS$<>9__CachedAnonymousMethodDelegate1).GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        string current = enumerator.Current;
    }
}

public List<T> FindAll(Predicate<T> match)
{
    if (match == null)
    {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.match);
    }
    List<T> list = new List<T>();
    for (int i = 0; i < this._size; i++)
    {
        if (match(this._items[i]))
        {
            list.Add(this._items[i]);
        }
    }
    return list;
}

private static bool <RunSnippet>b__0(string o)
{
    return (o.Length > 0);
}
Philip Wallace
0

If performance is in question, this probably is not the bottleneck. However, have you considered using the Task Parallel Library or PLINQ? See below:

Parallel.ForEach(Collection, obj =>
{
    if (obj.Mandatory)
    {
        DoWork();
    }
});

http://msdn.microsoft.com/en-us/library/dd460688(v=vs.110).aspx
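For the exact filter in the question, a PLINQ version keeps the loop body sequential-looking while parallelizing the filtering; a minimal sketch (the `Item`, `Mandatory`, and `DoWork` names are stand-ins for the question's types):

```csharp
using System;
using System.Linq;

class Item { public bool Mandatory; }

class PlinqDemo
{
    static void DoWork(Item item) { /* per-item work goes here */ }

    static void Main()
    {
        var collection = Enumerable.Range(0, 1000)
            .Select(i => new Item { Mandatory = i % 2 == 0 })
            .ToList();

        // AsParallel() partitions the collection across cores;
        // Where() then filters each partition in parallel.
        int processed = 0;
        foreach (var obj in collection.AsParallel().Where(o => o.Mandatory))
        {
            DoWork(obj);
            processed++;
        }

        Console.WriteLine(processed); // 500
    }
}
```

Note that PLINQ yields results in no guaranteed order unless you ask for one.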

Also, although perhaps slightly unrelated, it seems as though performance piques your curiosity. If you are dealing with very large sets of data, a binary search may be useful: it costs O(log n) per lookup instead of O(n) for a linear scan. In my case I have two separate lists of data with millions of records each, and this saved a great deal of time per execution. The only downsides are that it is only worthwhile for very large collections, and that the list must be sorted beforehand. You will also notice that this makes use of the ConcurrentDictionary class, which adds significant overhead, but it is thread-safe and was required by the number of threads I am managing asynchronously.

private ConcurrentDictionary<string, string> items;
private List<string> HashedListSource { get; set; }
private List<string> HashedListTarget { get; set; }

// BinarySearch requires the target list to be sorted first.
this.HashedListTarget.Sort();

private void SetDifferences()
{
    for (int i = 0; i < this.HashedListSource.Count; i++)
    {
        // BinarySearch returns a negative value when the item is not found.
        if (this.HashedListTarget.BinarySearch(this.HashedListSource[i]) < 0)
        {
            this.Mismatch.Add(items.ElementAt(i).Key);
        }
    }
}

Example displaying the benefits of using binary search (image omitted); it was originally posted in a great article found here: http://letsalgorithm.blogspot.com/2012/02/intersecting-two-sorted-integer-arrays.html

Hope this helps!

Anthony Mason