A common tip is to prefer IEnumerable.Any() over Count(), because Any() does not necessarily need to traverse the entire enumerable.

I just wrote a little segment of code that checks whether an enumerable contains a single item or several.

    if (reportInfo.OrebodyAndPits.SelectMany(ob => ob.Pits).Count() > 1)
    {
        ws.Cells[row, col++].Value = "Pits";
    }
    else
    {
        ws.Cells[row, col++].Value = "Pit";
    }

That made me wonder: will the comparison be compiled into a form that is smart enough to stop and return a result as soon as it has enumerated past the first item?

If not, is there a way to write a LINQ extension method that would do that?

(Please note, I'm not terribly interested in the performance impact of this piece of code. I'm mainly curious.)

Coxy
No, `.Count()` returns the number of items, e.g. `123456789`, and only then is the `> 1` condition checked. Use `Skip(1).Any()` for the smart behaviour. In some cases (this one *excluded*) .NET sees that the `IEnumerable<T>` is in fact an *array* `T[]` or a *list* `List<T>` and calls `Length` or `Count` instead of traversing, but that's all we can expect – Dmitry Bychenko Nov 28 '16 at 08:04

2 Answers

No, it will not. Your code will count all the items in the sequence, because the compiler does not optimize LINQ calls: what you write is what you get.

An equivalent, more efficient way of checking whether a sequence contains more than one item is:

    reportInfo.OrebodyAndPits.SelectMany(ob => ob.Pits).Skip(1).Any();

This will check, after skipping the first item, whether there are any items left.
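To make the difference observable, here is a minimal sketch that counts how many items each approach pulls from a lazy sequence. The names `EnumerationDemo`, `Numbers` and `Pulled` are made up for this illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationDemo
{
    // Number of items the lazy source has produced since the last reset.
    public static int Pulled;

    // Hypothetical lazy source: yields 1..count and records every item it produces.
    public static IEnumerable<int> Numbers(int count)
    {
        for (int i = 1; i <= count; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Count() has to walk the whole sequence before the "> 1" comparison runs.
    public static int PulledByCount()
    {
        Pulled = 0;
        _ = Numbers(1000).Count() > 1;
        return Pulled;
    }

    // Skip(1).Any() stops as soon as a second item has been seen.
    public static int PulledBySkipAny()
    {
        Pulled = 0;
        _ = Numbers(1000).Skip(1).Any();
        return Pulled;
    }
}
```

With a 1000-item source, `PulledByCount()` should report 1000 items produced, while `PulledBySkipAny()` should report only 2.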

Wazner

If you want to know how something works, why not look at the source code?

Here's the Any() method: https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/AnyAll.cs#L20

Here is the Count() method: https://github.com/dotnet/corefx/blob/master/src/System.Linq/src/System/Linq/Count.cs#L12

The compiler cannot make an optimisation like the one you describe. It asks for the count, gets a number, and then compares that number with the value in your conditional statement.

Count() itself does, however, attempt one optimisation. As you can see from its source, it checks whether the IEnumerable is actually a collection that already exposes a Count property (for example, one that implements ICollection<T>) and uses that, because it is faster than counting all the elements again. If no such property is available, it has to move through the entire sequence and count each element individually.
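One way to see that shortcut in action is a collection that reports a size but throws if anyone enumerates it. This is only a sketch: `CountOnlyCollection` and `CountShortcutDemo` are hypothetical types invented for the demonstration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical collection: it knows its size but refuses to be enumerated.
public sealed class CountOnlyCollection : ICollection<int>
{
    private readonly int _count;
    public CountOnlyCollection(int count) { _count = count; }

    public int Count => _count;
    public bool IsReadOnly => true;

    public IEnumerator<int> GetEnumerator() =>
        throw new InvalidOperationException("The collection was enumerated.");
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();

    // The remaining ICollection<int> members are not needed for this demonstration.
    public void Add(int item) => throw new NotSupportedException();
    public void Clear() => throw new NotSupportedException();
    public bool Contains(int item) => throw new NotSupportedException();
    public void CopyTo(int[] array, int arrayIndex) => throw new NotSupportedException();
    public bool Remove(int item) => throw new NotSupportedException();
}

public static class CountShortcutDemo
{
    // Succeeds: Enumerable.Count sees ICollection<int> and reads the Count property.
    public static int CountDirect()
    {
        IEnumerable<int> source = new CountOnlyCollection(5);
        return Enumerable.Count(source);
    }

    // Once the collection is wrapped in another operator the result is no longer
    // a collection, so counting it has to enumerate the source.
    public static bool CountAfterSelectEnumerates()
    {
        IEnumerable<int> source = new CountOnlyCollection(5);
        try
        {
            Enumerable.Count(source.Select(x => x));
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

The second method mirrors the situation in the question: the result of `SelectMany` is a lazy sequence rather than a collection, so `Count()` cannot take the shortcut there.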

If you want to write a LINQ method (which is just an extension method on IEnumerable<T>) that determines whether an IEnumerable contains at least two items, that is easy enough. Something like this:

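    public static bool AtLeastTwo<TSource>(this IEnumerable<TSource> source)
    {
        if (source == null)
        {
            // Error.ArgumentNull is internal to the framework; user code throws the public exception type.
            throw new ArgumentNullException(nameof(source));
        }

        using (IEnumerator<TSource> e = source.GetEnumerator())
        {
            // The first MoveNext() steps onto the first element; the second succeeds
            // only if a second element exists. Enumeration stops there.
            return e.MoveNext() && e.MoveNext();
        }
    }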
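
The same idea extends to any threshold. The sketch below is a hypothetical generalization (`HasMoreThan` is not a LINQ method, and `Naturals` is an invented helper) that stops enumerating as soon as the answer is known, so it terminates even on an endless sequence where `Count() > 1` would never return.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical helper, not part of LINQ: true if source has more than `threshold` items.
    public static bool HasMoreThan<T>(this IEnumerable<T> source, int threshold)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (threshold < 0) return true; // every sequence has more than a negative number of items

        int seen = 0;
        foreach (T item in source)
        {
            // Stop as soon as one item beyond the threshold has been seen.
            if (++seen > threshold) return true;
        }
        return false;
    }

    // Hypothetical endless sequence used to show the early exit.
    public static IEnumerable<int> Naturals()
    {
        for (int i = 0; ; i++) yield return i;
    }
}
```

In the question's code this would read `pits.HasMoreThan(1) ? "Pits" : "Pit"`, where `pits` stands for the result of the `SelectMany` call.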
Colin Mackay