2

These 2 ways of working both work, but I'm wondering if there's a difference in performance:

Dim collection As ItemCollection = CType(CellCollection.Where(Function(i) i.IsPending = True), ItemCollection)
For Each item As Item In collection
    'Do something here
Next

and

For Each item As Item In CellCollection.Where(Function(i) i.IsPending = True)
    'Do something here
Next

I thought the second one was better as you'd have a variable less and looks cleaner, but on second thought, I'm not quite sure what happens when you put a linq query in the iteration.

Does it have to reevaluate every time a loop is made? And which one is the cleanest/most performant to use?

Thanks in advance.

Terry
  • 5,132
  • 4
  • 33
  • 66
  • 5
    If you're interested in performance - test it! – Jackson Pope Mar 16 '12 at 14:22
  • Why add '= True' or is this necessary in VB? – Michel Keijzers Mar 16 '12 at 14:25
  • I think the difference is that the first version won't work, because `Where()` won't return your `ItemCollection`, but `IEnumerable(Of T)`. (Assuming this is normal `Enumerable.Where()` and there is no overloaded cast operator on `ItemCollection`.) – svick Mar 16 '12 at 14:50

3 Answers3

2

I've created a simple test console app.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

namespace LinqPerformance
{
    class Program
    {
        static void Main(string[] args)
        {
            var data = Enumerable.Range(1, 100000000);

            for (int x = 0; x < 10; x++)
            {
                ExecuteMethods(data);
            }
        }

        private static void ExecuteMethods(IEnumerable<int> data)
        {
            Method1("linq collection", () =>
            {
                var collection = data.Where(d => d % 2 == 0);
                double count = 0;

                foreach (var c in collection)
                {
                    count += c;
                }
            });

            Method1("list collection", () =>
            {
                var collection = data.Where(d => d % 2 == 0).ToList();
                double count = 0;
                foreach (var c in collection)
                {
                    count += c;
                }
            });

            Method1("iterable collection", () =>
            {
                double count = 0;
                foreach (var c in data.Where(d => d % 2 == 0))
                {
                    count += c;
                }
            });
        }

        private static void Method1(string name, Action body)
        {
            Stopwatch s = new Stopwatch();

            s.Start();
            body();
            s.Stop();

            Console.WriteLine(name + ": " + s.Elapsed);
        }
    }
}

After running this I can see that the ToList() is the slowest. The other two approaches appear to be the same.

I suppose this is because the foreach is expanded to a

var enumerator = collection.GetEnumerator();

while(enumerator.MoveNext() )
{
    var c = enumerator.Current;
    count += c;
}
Wouter de Kort
  • 39,090
  • 12
  • 84
  • 103
  • It is the slowest, because you iterate 1 time over data and then again over the "filtered" list. The 2 other methods are the same, except a holding variable for IEnumerable. – Malmi Mar 16 '12 at 14:54
1

Performance is the same, whether you assign the Linq query to a variable, or call it directly in the For Each. In both case the iterator will be created once and the For Each loop will go through each item in the list once.

In the first code sample, the CType is not necessary though (actually I don't think it would work). You can simply do:

Dim collection = CellCollection.Where(Function(i) i.IsPending = True)
For Each item As Item In collection
    'Do something here
Next

But as I mentioned, assigning to a variable is not necessary. Having the Where clause on the For Each line will yield the same performance, and the code will be shorter and more readable.

Meta-Knight
  • 17,626
  • 1
  • 48
  • 58
0

The performance for both is the same. foreach will create the IEnumerable(Of T) then enumerate through it.

However, if you're concerned about performance, try:

Dim collection As IEnumerable(Of Item) _
    = CellCollection.Where(Function(i) i.IsPending)

For Each item As Item In collection 
    'Do something here
Next

It's possible that casting the IEnumerable(Of Item) to ItemCollection would cause it to enumerate (like ToArray or ToList). This will cause the collection to enumerate twice. Keeping it as IEnumerable ensures that the i.IsPending check happens during the enumeration of the For Each and not the CType().

The fastest solution would be to forgo LINQ altogether (LINQ statements, although readable, add some overhead).

For Each item As Item In CellCollection
    If Not item.IsPending Then
         Continue For
    End If
    'Do something here
Next
David
  • 1,143
  • 7
  • 9