19

I have a sequence of elements. The sequence can only be iterated once and can be "infinite".

What is the best way to get the head and the tail of such a sequence?

Update: A few clarifications that I should have included in the original question :)

  • Head is the first element of the sequence and tail is "the rest". That means the tail is also "infinite".

  • When I say infinite, I mean "very large" and "I wouldn't want to store it all in memory at once". It could also have been actually infinite, like sensor data for example (but it wasn't in my case).

  • When I say that it can only be iterated once, I mean that generating the sequence is resource heavy, so I wouldn't want to do it again. It could also have been volatile data, again like sensor data, that won't be the same on the next read (but it wasn't in my case).

asgerhallas
  • 4
    What would be the tail of an infinite sequence? – Jon Mar 09 '11 at 12:24
  • 1
    How do you define the "tail" of an infinite sequence? Does the sequence begin to repeat at some point? – Ani Mar 09 '11 at 12:24
  • Can you specify why you can iterate only once? – Davide Piras Mar 09 '11 at 12:24
  • uhhh something that has the possibility of infinite kinda implies that you'll never be able to get the tail end of the IEnumerable – jonezy Mar 09 '11 at 12:24
  • 3
    The tail is an infinite sequence with the remainder of the original sequence. I can not see why that should not be possible? I can only iterate once because the computation generating the sequence is resource heavy. – asgerhallas Mar 09 '11 at 12:32
  • 7
    Of course you can get the tail of an infinite sequence! The tail of [1,2,3,4,...] is [2,3,4,...]. – Squirrelsama Mar 02 '12 at 01:07
  • It seems the term “tail” is understood differently. In functional programming circles, it means “all but the first *N* elements”; apparently, others read it as “the last *N* elements” (for some finite number *N*). – Quirin F. Schroll Aug 23 '22 at 18:09

5 Answers

24

Decomposing IEnumerable<T> into head & tail isn't particularly good for recursive processing (unlike functional lists) because when you use the tail operation recursively, you'll create a number of indirections. However, you can write something like this:

I'm ignoring things like argument checking and exception handling, but it shows the idea...

Tuple<T, IEnumerable<T>> HeadAndTail<T>(IEnumerable<T> source) {
  // Get first element of the 'source' (assuming it is there)
  var en = source.GetEnumerator();
  en.MoveNext();
  // Return first element and Enumerable that iterates over the rest
  return Tuple.Create(en.Current, EnumerateTail(en));
}

// Turn remaining (unconsumed) elements of enumerator into enumerable
IEnumerable<T> EnumerateTail<T>(IEnumerator<T> en) {
  while(en.MoveNext()) yield return en.Current; 
}

The HeadAndTail method gets the first element and returns it as the first element of a tuple. The second element of a tuple is IEnumerable<T> that's generated from the remaining elements (by iterating over the rest of the enumerator that we already created).
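A slightly more defensive variant might look like the sketch below. It is only an illustration under a few assumptions: an empty source is treated as an error, and the names `SequenceSplitter` and `EnumerateRest` are made up here. The enumerator is disposed once the tail has been iterated to the end or its iteration is abandoned; if the tail is never enumerated at all, nothing disposes it.

```csharp
using System;
using System.Collections.Generic;

static class SequenceSplitter
{
    // Hypothetical helper: splits a one-shot sequence into its first
    // element and an enumerable over the remaining elements.
    public static Tuple<T, IEnumerable<T>> HeadAndTail<T>(IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        IEnumerator<T> en = source.GetEnumerator();
        bool hasHead;
        try
        {
            hasHead = en.MoveNext();
        }
        catch
        {
            // Reading the head failed, so nobody else will dispose the enumerator.
            en.Dispose();
            throw;
        }

        if (!hasHead)
        {
            en.Dispose();
            throw new InvalidOperationException("Sequence contains no elements.");
        }

        return Tuple.Create(en.Current, EnumerateRest(en));
    }

    // The finally block runs when the tail is exhausted, or when the
    // consumer disposes its enumerator (for example by leaving a foreach early).
    private static IEnumerable<T> EnumerateRest<T>(IEnumerator<T> en)
    {
        try
        {
            while (en.MoveNext()) yield return en.Current;
        }
        finally
        {
            en.Dispose();
        }
    }
}
```

The returned tail still wraps a single shared enumerator, so it should be enumerated at most once.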

Tomas Petricek
  • 1
    The `IEnumerator` parameter in `EnumerateTail`, shouldn't that be `IEnumerator<T>`? – asgerhallas Mar 09 '11 at 12:55
  • 3
    I would make the tuple a `Tuple<T, IEnumerator<T>>` - otherwise it *looks* like you can iterate over the tail several times, but you really can't. – Jon Skeet Mar 09 '11 at 12:56
  • 1
    @Jon Skeet: I'm a little confused by that suggestion. Wouldn't that result in an awkward syntax where it's being used, where you for example can't use Linq directly on the tail? – asgerhallas Mar 09 '11 at 13:06
  • @Jon Skeet: The restriction that you can enumerate only once is making it a bit tricky - possibly, the code should just be using `IEnumerator` everywhere (and forget about splitting enumerable into head and tail). I intentionally wrapped it into `IEnumerable` so that you can use `HeadAndTail` recursively. Also, you probably want to be able to use LINQ syntax on the result. – Tomas Petricek Mar 09 '11 at 13:18
  • 1
    @asgerhallas: Yes, it would... but you're *in* an awkward situation already - I don't think it helps to pretend that the tail is a "normal" sequence which you can read repeatedly. – Jon Skeet Mar 09 '11 at 13:19
  • (But as I wrote in the first sentence, `IEnumerable` doesn't fit the functional head-tail style of list processing very well) – Tomas Petricek Mar 09 '11 at 13:20
  • @Tomas: Yes, exactly - `IEnumerator` would make it clearer, although it would still be tricky in other ways, if one piece of code "remembered" the iterator while other bits were possibly iterating over it. – Jon Skeet Mar 09 '11 at 13:20
  • @Jon and Tomas: Thanks both of you. You're right. I might be fetching a little too far with this, but I sure learned something :) Can you elaborate on what the pitfalls are of using the tail in a recursion? Will I end up with many enumerators? or is it something else? – asgerhallas Mar 09 '11 at 14:29
  • 3
    @asgerhallas: Yes, for the 100,000th item you'll ask the 99,999th iterator to move to the next item, which will ask the 99,998th iterator etc. Nasty stack :) – Jon Skeet Mar 09 '11 at 14:30
  • Couldn't EnumerateTail return a new class that wrapped the Enumerator of the tail and just return it? `class TailEnumerable : IEnumerable { IEnumerator enumerator; TailEnumerable(IEnumerator enumerator) { this.enumerator = enumerator; } public IEnumerator GetEnumerator() { return enumerator; } }` – Stuart Sep 10 '15 at 15:35
  • @Stuart Not really - the tail enumerator is mutable (yuck!) and if it was accessed by multiple callers (which it would), then bad things would happen to it. – Tomas Petricek Sep 10 '15 at 22:51
  • Ha, I just came up with this exact code in LINQPad but then started to wonder about the "never enumerated == never disposed" problem that another answer refers to. Jon's point about being able to enumerate only once is another important one I hadn't considered, as is yours about multiple callers. Better to just work explicitly with `IEnumerator`, then! – anton.burger Feb 15 '19 at 08:49
  • 1
    @anton.burger, I came here for having exactly this question answered because I couldn’t figure out a way to dispose the enumerator in all cases. When `EnumerateTail` has a `try`–`finally`, it works if you e.g. break the loop iterating the tail. However, if the tail is never touched, i.e. `EnumerateTail` never starts execution, I don’t see a way to ensure disposing. – Quirin F. Schroll Aug 23 '22 at 18:15
2

Obviously, each call to HeadAndTail should enumerate the sequence again (unless there is some sort of caching used). For example, consider the following:

var a = HeadAndTail(sequence);
Console.WriteLine(HeadAndTail(a.Tail).Head);
//Element #2; enumerator is at least at #2 now.

var b = HeadAndTail(sequence);
Console.WriteLine(b.Head);
//Element #1; there is no way to get #1 unless we enumerate the sequence again.

For the same reason, HeadAndTail could not be implemented as separate Head and Tail methods (unless you want even the first call to Tail to enumerate the sequence again even if it was already enumerated by a call to Head).

Additionally, HeadAndTail should not return an instance of IEnumerable (as it could be enumerated multiple times).

This leaves us with the only option: HeadAndTail should return IEnumerator, and, to make things more obvious, it should accept IEnumerator as well (we're just moving an invocation of GetEnumerator from inside the HeadAndTail to the outside, to emphasize it is of one-time use only).

Now that we have worked out the requirements, the implementation is pretty straightforward:

class HeadAndTail<T> {
    public readonly T Head;
    public readonly IEnumerator<T> Tail;

    public HeadAndTail(T head, IEnumerator<T> tail) {
        Head = head;
        Tail = tail;
    }
}

static class IEnumeratorExtensions {
    public static HeadAndTail<T> HeadAndTail<T>(this IEnumerator<T> enumerator) {
        if (!enumerator.MoveNext()) return null;
        return new HeadAndTail<T>(enumerator.Current, enumerator);
    }
}

And now it can be used like this:

Console.WriteLine(sequence.GetEnumerator().HeadAndTail().Tail.HeadAndTail().Head);
//Element #2

Or in recursive functions like this:

TResult FoldR<TSource, TResult>(
    IEnumerator<TSource> sequence,
    TResult seed,
    Func<TSource, TResult, TResult> f
) {
    var headAndTail = sequence.HeadAndTail();
    if (headAndTail == null) return seed;
    return f(headAndTail.Head, FoldR(headAndTail.Tail, seed, f));
}

int Sum(IEnumerator<int> sequence) {
    return FoldR(sequence, 0, (x, y) => x+y);
}

var array = Enumerable.Range(1, 5);
Console.WriteLine(Sum(array.GetEnumerator())); //1+(2+(3+(4+(5+0))))
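
Note that `FoldR` is not tail-recursive: it uses one stack frame per element, so a very long sequence can overflow the stack. Where a left-to-right accumulation is acceptable, a plain loop over the same enumerator avoids that. The following is only a sketch; `EnumeratorFolds` and `FoldL` are hypothetical names, not part of the code above:

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorFolds
{
    // Hypothetical left fold: consumes the enumerator exactly once
    // and needs constant stack space regardless of sequence length.
    public static TResult FoldL<TSource, TResult>(
        IEnumerator<TSource> sequence,
        TResult seed,
        Func<TResult, TSource, TResult> f)
    {
        TResult acc = seed;
        while (sequence.MoveNext())
        {
            acc = f(acc, sequence.Current);
        }
        return acc;
    }
}
```

For instance, `EnumeratorFolds.FoldL(Enumerable.Range(1, 5).GetEnumerator(), 0, (acc, x) => acc + x)` (with `System.Linq` imported) groups the additions as ((((0+1)+2)+3)+4)+5.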
penartur
1

While other approaches here suggest using yield return for the tail enumerable, such an approach adds unnecessary nesting overhead. A better approach would be to convert the IEnumerator<T> back into something that can be used with foreach:

public struct WrappedEnumerator<T>
{
    T myEnumerator;
    public T GetEnumerator() { return myEnumerator; }
    public WrappedEnumerator(T theEnumerator) { myEnumerator = theEnumerator; }
}
public static class AsForEachHelper
{
    static public WrappedEnumerator<IEnumerator<T>> AsForEach<T>(this IEnumerator<T> theEnumerator) {return new WrappedEnumerator<IEnumerator<T>>(theEnumerator);}

    static public WrappedEnumerator<System.Collections.IEnumerator> AsForEach(this System.Collections.IEnumerator theEnumerator) 
        { return new WrappedEnumerator<System.Collections.IEnumerator>(theEnumerator); }
}

If one used separate WrappedEnumerator structs for the generic IEnumerable<T> and non-generic IEnumerable, one could have them implement IEnumerable<T> and IEnumerable respectively; they wouldn't really obey the IEnumerable<T> contract, though, which specifies that it should be possible to call GetEnumerator() multiple times, with each call returning an independent enumerator.

Another important caveat is that if one uses AsForEach on an IEnumerator<T>, the resulting WrappedEnumerator should be enumerated exactly once. If it is never enumerated, the underlying IEnumerator<T> will never have its Dispose method called.

Applying the above-supplied methods to the problem at hand, it would be easy to call GetEnumerator() on an IEnumerable<T>, read out the first few items, and then use AsForEach() to convert the remainder so it can be used with a foreach loop (or perhaps, as noted above, to convert it into an implementation of IEnumerable<T>). It's important to note, however, that calling GetEnumerator() creates an obligation to Dispose the resulting IEnumerator<T>, and the class that performs the head/tail split would have no way to do that if nothing ever calls GetEnumerator() on the tail.
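
As a rough sketch of that usage: the consumer below reads the head itself and then loops over the rest through `AsForEach()`. The generic wrapper is repeated in compact form so the example compiles on its own; `HeadTailDemo` and `HeadAndTailSum` are made-up names for illustration. Keeping the enumerator in a `using` block in the same method is one way to meet the Dispose obligation even if the tail is never touched.

```csharp
using System;
using System.Collections.Generic;

// Compact copy of the generic wrapper from above, repeated so the sketch is self-contained.
public struct WrappedEnumerator<T>
{
    T myEnumerator;
    public T GetEnumerator() { return myEnumerator; }
    public WrappedEnumerator(T theEnumerator) { myEnumerator = theEnumerator; }
}

public static class AsForEachHelper
{
    public static WrappedEnumerator<IEnumerator<T>> AsForEach<T>(this IEnumerator<T> theEnumerator)
    {
        return new WrappedEnumerator<IEnumerator<T>>(theEnumerator);
    }
}

public static class HeadTailDemo
{
    // Hypothetical consumer: returns the head and the sum of the tail.
    public static Tuple<int, int> HeadAndTailSum(IEnumerable<int> source)
    {
        // The using block disposes the enumerator whether or not the tail is enumerated.
        using (IEnumerator<int> en = source.GetEnumerator())
        {
            if (!en.MoveNext())
                throw new InvalidOperationException("Sequence contains no elements.");
            int head = en.Current;

            int tailSum = 0;
            // foreach continues from where the enumerator currently stands.
            foreach (int item in en.AsForEach())
            {
                tailSum += item;
            }
            return Tuple.Create(head, tailSum);
        }
    }
}
```

Because `foreach` also disposes the enumerator when the loop ends, Dispose ends up being called twice here, which is harmless for enumerators whose Dispose is idempotent.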

supercat
-1

Probably not the best way to do it, but if you use the .ToList() method you can then get the elements at positions [0] and [Count-1], provided Count > 0.

But you should specify what you mean by "can be iterated only once".

Davide Piras
-2

What exactly is wrong with .First() and .Last()? Though yeah, I have to agree with the people who asked "what does the tail of an infinite list mean"... the notion doesn't make sense, IMO.

Marcel Popescu