9

A while ago I wrote an IList extension method to enumerate across part of a list by using the indices. While refactoring I realized a similar query could be performed by calling Skip(toSkip).Take(amount). While benchmarking this I noticed that Skip isn't optimized for IList. With a bit of googling I ended up at a Jon Skeet post, discussing why optimizing methods like Skip is dangerous.

As far as I understand the article, the problem is no exception is thrown in the optimized methods when the collection is modified, but as a comment states the msdn documentation conflicts itself.

In IEnumerator.MoveNext():

If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and the next call to MoveNext or Reset throws an InvalidOperationException.

In IEnumerator.GetEnumerator():

If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and its behavior is undefined.

I see merit in both conventions, and am a bit lost whether or not to optimize. What is a proper solution? I've been considering an IList.AssumeImmutable() approach along the lines of AsParallel() as mentioned by Kris Vandermotten in the comments. Does any implementation already exist, or is it a bad idea?

Community
  • 1
  • 1
Steven Jeuris
  • 18,274
  • 9
  • 70
  • 161
  • 7
    If Jon concludes it would be a mistake to optimize `Skip` etc, and has a paragraph explaining why... then I'd take that as good reasoning. – Marc Gravell May 02 '11 at 21:12
  • 1
    I think the "undefined" semantics are more pragmatic than the "throws an exception" semantics (even though the latter is more solid), merely because the latter requires every enumerator instance be able to test for modification of the host container. Ugh! – Rafe May 03 '11 at 00:35
  • Jon's point still stands, however. There's nothing to stop you writing an `ItemsInRange(Lo, Hi)` enumerator *without* using `Skip/Take`. – Rafe May 03 '11 at 00:37
  • 1
    @Rafe: I wrote an answer to that effect and deleted it because by symmetry `ItemsInRange` should exist for `IEnumerable` as well and that will throw, so the `IList` version cannot be optimized by Jon's argument. – Rick Sladkey May 03 '11 at 05:11
  • 1
    @Marc: How do we know that Jon was right upon rethinking and not right in the first place? – Gabe May 03 '11 at 05:56

2 Answers2

3

I agree with Rafe that the undefined behavior is more correct. Only versioned collections can throw exceptions and not all collections are versioned (arrays being the largest example). Even versioned collections might misbehave if you make exactly 2^32 changes between calls to MoveNext.

Assuming you really care about the versioning behavior, the solution is to get an Enumerator for the IList and call MoveNext on it for every iteration:

    public static IEnumerable<T> Skip<T>(this IList<T> source, int count)
    {
        using (var e = source.GetEnumerator())
            while (count < source.Count && e.MoveNext())
                yield return source[count++];
    }

This way you get O(1) behavior by indexing, but you still get all the exception throwing behavior of calling MoveNext. Note that we only call MoveNext for the exception side-effects; we ignore the values that it's enumerating over.

Gabe
  • 84,912
  • 12
  • 139
  • 238
0

The ReadOnlyCollection class might help with your immutable collection.

My advice: I personally would not try to "trick" the compiler unless you are having a performance issue. You never know, the next version could make your optimized code run twice as slow as the original. Don't preemptively optimize. The methods provided in the framework can produce some really optimized code that would be difficult to re-implement.

here is an article from msdn that gives info on what collections to use for different purposes. I would use an appropriate collection for the task instead of trying to optimize Skip and Take.

Charles Lambert
  • 5,042
  • 26
  • 47