I've been playing with Lists and Enumerables and I think I understand the basics:

  • Enumerable: The elements are evaluated each time they are consumed.
  • List: The elements are evaluated on definition and are not reevaluated at any point.

I've done some tests:

Starting with the Enumerable example:

var myList = new List<int>() { 1, 2, 3, 4, 5, 6 };
var myEnumerable = myList.Where(p =>
    {
        Console.Write($"{p} ");
        return p > 2;
    }
);

Console.WriteLine("");
Console.WriteLine("Starting");
myEnumerable.First();
Console.WriteLine("");
myEnumerable.Skip(1).First();

The output is:

Starting
1 2 3 
1 2 3 4 

If we add .ToList() after the .Where(...) then the output is:

1 2 3 4 5 6 
Starting

I was also able to get a bit of both worlds with this class:

class SingleEvaluationEnum<T>
{
    private IEnumerable<T> Enumerable;

    public SingleEvaluationEnum(IEnumerable<T> enumerable)
        => Enumerable = enumerable;

    public IEnumerable<T> Get()
    {
        if (!(Enumerable is List<T>))
            Enumerable = Enumerable.ToList().AsEnumerable();

        return Enumerable;
    }
}

With this class, the output is:

Starting
1 2 3 4 5 6 

This way the evaluation is deferred until the first consumption and is not re-evaluated in the next ones. But the whole list is evaluated.

 

My question is: Is there a way to get this output?

Starting
1 2 3
4

In other words: I want myEnumerable.First() to evaluate only the necessary elements, and no more. And I want myEnumerable.Skip(1).First() to reuse the already evaluated elements.

EDIT: Clarification: I want any "query" over the Enumerable to apply to all the elements in the list. That's why (AFAIK) an Enumerator doesn't work.

Thanks!

raul.vila

2 Answers

Basically, it sounds like you're looking for an Enumerator, which you can get by calling GetEnumerator on an IEnumerable. An Enumerator keeps track of its position.

var myList = new List<int>() { 1, 2, 3, 4, 5, 6 };
var myEnumerator = myList.Where(p =>
    {
        Console.Write($"{p} ");
        return p > 2;
    }
).GetEnumerator();

Console.WriteLine("Starting");
myEnumerator.MoveNext();
Console.WriteLine("");
myEnumerator.MoveNext();

This will get you the output:

Starting
1 2 3
4

Edit to respond to your comment: First of all, this sounds like an extremely bad idea. An IEnumerable represents something that can be enumerated, which is why you can chain all those LINQ queries on top of it. Every call to First, however, "visualizes" that enumeration: it calls GetEnumerator to get an Enumerator, walks over it until it has its answer, and then disposes it. You are asking for every such visualization to change the IEnumerable it is visualizing, which is not good practice.

However, since you said this is for learning, I'll give you code that ends up with an IEnumerable that produces your desired output. I would not recommend you ever use this in real code; it is not a good or solid way of doing things.

First we create a custom Enumerator that doesn't dispose, but just keeps enumerating some internal enumerator:

using System;
using System.Collections;
using System.Collections.Generic;

public class CustomEnumerator<T> : IEnumerator<T>
{
    private readonly IEnumerator<T> _source;

    public CustomEnumerator(IEnumerator<T> source)
    {
        _source = source;
    }

    public T Current => _source.Current;

    object IEnumerator.Current => _source.Current;

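    public void Dispose()
    {
        // Intentionally empty: LINQ operators dispose the enumerator they are given,
        // and the shared inner enumerator must survive that.
    }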

    public bool MoveNext()
    {
        return _source.MoveNext();
    }

    public void Reset()
    {
        throw new NotImplementedException();
    }
}

Then we create a custom IEnumerable class that, instead of creating a new Enumerator every time GetEnumerator() is called, secretly keeps using the same enumerator:

public class CustomEnumerable<T> : IEnumerable<T>
{
    public CustomEnumerable(IEnumerable<T> source)
    {
        _internalEnumerator = new CustomEnumerator<T>(source.GetEnumerator());
    }

    private IEnumerator<T> _internalEnumerator;
    public IEnumerator<T> GetEnumerator()
    {
        return _internalEnumerator;
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return _internalEnumerator;
    }
}

And finally we create an IEnumerable extension method to convert an IEnumerable into our CustomEnumerable:

public static class IEnumerableExtensions
{
    public static IEnumerable<T> ToTrackingEnumerable<T>(this IEnumerable<T> source) => new CustomEnumerable<T>(source);
}

Now we can do this:

var myList = new List<int>() { 1, 2, 3, 4, 5, 6 };

var myEnumerable = myList.Where(p =>
{
    Console.Write($"{p} ");
    return p > 2;
}).ToTrackingEnumerable();

Console.WriteLine("Starting");
var first = myEnumerable.First();
Console.WriteLine("");
var second = myEnumerable.Where(p => p % 2 == 1).First();
Console.WriteLine("");

I changed the last part to show that we can still use LINQ on it. The output is now:

Starting
1 2 3
4 5
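
To see why this is risky outside of a learning exercise, here is a minimal sketch of the pitfall. `SharedEnumerable<T>` and `SharedEnumerableDemo` are hypothetical names, and the class is a compact stand-in for the CustomEnumerable/CustomEnumerator pair above, not the same code: it folds both roles into one type, but it keeps the behavior that matters, namely that every GetEnumerator() call returns the same never-disposed enumerator.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical compact stand-in: one object plays both the enumerable
// and the (shared, never-disposed) enumerator.
public sealed class SharedEnumerable<T> : IEnumerable<T>, IEnumerator<T>
{
    private readonly IEnumerator<T> _inner;

    public SharedEnumerable(IEnumerable<T> source) => _inner = source.GetEnumerator();

    // Every caller gets the same enumerator, so position is shared.
    public IEnumerator<T> GetEnumerator() => this;
    IEnumerator IEnumerable.GetEnumerator() => this;

    public T Current => _inner.Current;
    object IEnumerator.Current => _inner.Current;
    public bool MoveNext() => _inner.MoveNext();
    public void Dispose() { } // ignored so the shared position survives
    public void Reset() => throw new NotSupportedException();
}

public static class SharedEnumerableDemo
{
    public static (int First, int FirstAgain, int Remaining) Run()
    {
        var shared = new SharedEnumerable<int>(new List<int> { 1, 2, 3, 4, 5, 6 });
        int first = shared.First();      // consumes the element 1
        int firstAgain = shared.First(); // consumes the element 2, not 1 again
        int remaining = shared.Count();  // counts only what is left
        return (first, firstAgain, remaining);
    }
}
```

Calling `SharedEnumerableDemo.Run()` should show that the same query gives a different answer each time it is asked, because every operator silently consumes elements. That also means Skip(1).First() from the question would skip an element that was never seen, rather than reuse the one already evaluated.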
  • I think that's not the same. Maybe I didn't explain it correctly; the idea is that all "queries" apply to the whole list. AFAIK, with an enumerator I cannot use Skip() and other LINQ methods, right? – raul.vila Nov 12 '19 at 12:26
  • @raul.vila I edited this to add a hacky `IEnumerable` that does what you want. Please only use this for learning purposes :-) –  Nov 12 '19 at 13:15
  • Thanks a lot, this helps to understand it better. Anyway, I'm going to mark my question as duplicated because I found this: https://stackoverflow.com/questions/12427097/is-there-an-ienumerable-implementation-that-only-iterates-over-its-source-e-g – raul.vila Nov 12 '19 at 13:26

LINQ is fundamentally a functional approach to working with collections. One of the assumptions is that there are no side-effects to evaluating the functions. You're violating that assumption by calling Console.Write in the function.

There's no magic involved, just functions. IEnumerable has just one method - GetEnumerator. That's all that is needed for LINQ, and that's all that LINQ really does. For example, a naïve implementation of Where would look like this:

public static IEnumerable<T> Where<T>(this IEnumerable<T> @this, Func<T, bool> filter)
{
  foreach (var item in @this)
  {
    if (filter(item)) yield return item;
  }
}

A Skip might look like this:

public static IEnumerable<T> Skip<T>(this IEnumerable<T> @this, int skip)
{
  foreach (var item in @this)
  {
    if (skip-- > 0) continue;

    yield return item;
  }
}

That's all there is to it. It doesn't have any information about what IEnumerable is or represents. In fact, that's the whole point - you're abstracting those details away. There are a few optimizations in those methods, but they don't do anything smart. In the end, the difference between the List and IEnumerable in your example isn't anything fundamental - it's that myEnumerable.Skip(1) has side-effects (because myEnumerable itself has side-effects) while myList.Skip(1) doesn't. But both do the exact same thing - evaluate the enumerable, item by item. There's no other method than GetEnumerator on an enumerable, and IEnumerator only has Current and MoveNext (of those that matter for us).
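
In the same naive style, a First could be sketched directly on top of GetEnumerator, MoveNext and Current. `NaiveLinq` and `NaiveFirst` are hypothetical names chosen to avoid clashing with the real operator, and this is only an illustration of the shape, not the actual implementation in System.Linq:

```csharp
using System;
using System.Collections.Generic;

public static class NaiveLinq
{
    // Illustrative only: asks for a fresh enumerator, advances it once,
    // returns whatever it points at, and disposes it on the way out.
    public static T NaiveFirst<T>(this IEnumerable<T> source)
    {
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            if (!enumerator.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements");

            return enumerator.Current;
        }
    }
}
```

Because each call starts from a fresh enumerator, nothing is remembered between two calls. With a Where in front, that single MoveNext is what drives the filter over 1, 2 and 3 every time.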

LINQ operators do not mutate their source. That's one of the reasons why LINQ is so useful: it lets you do exactly what you're doing, querying the same enumerable twice and getting the exact same result. But you're not happy with that; you want things to be mutable. Well, nothing is stopping you from making your own helper functions. LINQ is just a bunch of functions, after all, and you can make your own.

One such simple extension could be a memoized enumerable. Wrap around the source enumerable, create a list internally, and when you iterate over the source enumerable, keep adding items to the list. The next time GetEnumerator is called, start iterating over your internal list. When you reach the end, continue with the original approach - iterate over the source enumerable and keep adding to the list.

This will allow you to use LINQ fully, by inserting Memoize() (a helper you write yourself, not part of System.Linq) into your LINQ queries at the places where you want to avoid iterating over the source multiple times. In your example, this would be something like:

myEnumerable = myEnumerable.Memoize();

Console.WriteLine("");
Console.WriteLine("Starting");
myEnumerable.First();
Console.WriteLine("");
myEnumerable.Skip(1).First();

The first call to myEnumerable.First() will iterate through the first three items in myList and cache the one that passes the filter; the second call will replay that cached item and evaluate only the fourth.
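
The memoized enumerable described above could be sketched roughly as follows. This is a minimal, single-threaded sketch under my own assumptions: `MemoizedEnumerable<T>`, `MemoizeExtensions` and `MemoizeDemo` are hypothetical names, the cache grows without bound, and the source enumerator is only disposed once the source has been read to the end.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper: remembers every item the first time it is pulled
// from the source, and replays remembered items to later enumerations.
public sealed class MemoizedEnumerable<T> : IEnumerable<T>
{
    private readonly IEnumerable<T> _source;
    private readonly List<T> _cache = new List<T>();
    private IEnumerator<T> _sourceEnumerator; // created on first use
    private bool _exhausted;

    public MemoizedEnumerable(IEnumerable<T> source) => _source = source;

    public IEnumerator<T> GetEnumerator()
    {
        int index = 0;
        while (true)
        {
            // Replay from the cache while it has items we have not yielded yet.
            if (index < _cache.Count)
            {
                yield return _cache[index++];
                continue;
            }

            if (_exhausted) yield break;

            // Cache ran out: pull exactly one more item from the source.
            if (_sourceEnumerator == null) _sourceEnumerator = _source.GetEnumerator();
            if (!_sourceEnumerator.MoveNext())
            {
                _exhausted = true;
                _sourceEnumerator.Dispose();
                yield break;
            }
            _cache.Add(_sourceEnumerator.Current);
        }
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class MemoizeExtensions
{
    public static IEnumerable<T> Memoize<T>(this IEnumerable<T> source)
        => new MemoizedEnumerable<T>(source);
}

public static class MemoizeDemo
{
    // Logs every element the filter inspects, with "|" between the two queries.
    public static string Run()
    {
        var log = new StringBuilder();
        var myList = new List<int> { 1, 2, 3, 4, 5, 6 };
        var myEnumerable = myList
            .Where(p => { log.Append(p).Append(' '); return p > 2; })
            .Memoize();

        int first = myEnumerable.First();
        log.Append("| ");
        int second = myEnumerable.Skip(1).First();
        return log + "=> " + first + ", " + second;
    }
}
```

With this sketch, `MemoizeDemo.Run()` should return "1 2 3 | 4 => 3, 4": the filter sees 1, 2 and 3 for the first query and only 4 for the second. A version meant for real use would also need to think about thread safety and about sources that are never fully consumed.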

Luaan
  • Thanks a lot, Memoization is the concept I was looking for. Searching by it, I found [this other question](https://stackoverflow.com/questions/12427097/is-there-an-ienumerable-implementation-that-only-iterates-over-its-source-e-g), so I'm going to mark mine as duplicate. – raul.vila Nov 12 '19 at 13:27