38

Let's imagine I want to stream three files to a user all in a row, but instead of him handing me a Stream object to push bytes down, I have to hand him a Stream object he'll pull bytes from. I'd like to take my three FileStream objects (or even cleverer, an IEnumerable<Stream>) and return a new ConcatenatedStream object that would pull from the source streams on demand.

SharpC
Sebastian Good

5 Answers

38
class ConcatenatedStream : Stream
{
    Queue<Stream> streams;

    public ConcatenatedStream(IEnumerable<Stream> streams)
    {
        this.streams = new Queue<Stream>(streams);
    }

    public override bool CanRead
    {
        get { return true; }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int totalBytesRead = 0;

        while (count > 0 && streams.Count > 0)
        {
            int bytesRead = streams.Peek().Read(buffer, offset, count);
            if (bytesRead == 0)
            {
                streams.Dequeue().Dispose();
                continue;
            }

            totalBytesRead += bytesRead;
            offset += bytesRead;
            count -= bytesRead;
        }

        return totalBytesRead;
    }

    public override bool CanSeek
    {
        get { return false; }
    }

    public override bool CanWrite
    {
        get { return false; }
    }

    public override void Flush()
    {
        throw new NotImplementedException();
    }

    public override long Length
    {
        get { throw new NotImplementedException(); }
    }

    public override long Position
    {
        get
        {
            throw new NotImplementedException();
        }
        set
        {
            throw new NotImplementedException();
        }
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotImplementedException();
    }

    public override void SetLength(long value)
    {
        throw new NotImplementedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotImplementedException();
    }
}
Dark Falcon
Muhammad Hasan Khan
  • 10
    I wonder. Maybe the solution is so elegant that it is illegal. – Muhammad Hasan Khan Oct 07 '10 at 14:46
  • 2
    I'm wondering why you used a Stack rather than a Queue? Alternately, one can make it even lazier by storing the Enumerator rather than enumerating the IEnumerable at the beginning - this also has the advantage of being constant space. – porges Oct 14 '10 at 02:20
  • Good suggestion! I changed it to a queue. I had actually picked a stack without much thought, then realized I would have to reverse it. I never thought of a queue; Pop and Push naturally bring a stack to mind. – Muhammad Hasan Khan Oct 14 '10 at 03:05
  • 4
    (comment-by-proxy, i.e. "not me") "This solution basically does not work and contains a serious bug. If I try to read all data at once I will get first stream data only." – Marc Gravell Dec 07 '10 at 10:28
  • 2
    @HasanKhan I think that was from a misplaced "moderator message" or similar; like I say, "not me". However, I suspect the person was saying that if they pass in a big buffer, it will only read from the first stream (or first two streams) - should perhaps be a `while` not an `if`. However! I would say that person isn't calling `Read` correctly if they expect it to fill as much of their buffer as is possible. All `Read` **must** return is at least one byte, or an EOF. – Marc Gravell Feb 12 '14 at 13:42
  • @MarcGravell The `Read` inside the `if` block is a recursive call (and not a `streams.Peek().Read`). So a `while` is not required. – Shameer May 23 '14 at 23:41
  • 3
    The bug is that `bytesRead < count` is not sufficient reason to close the stream at the head of the queue. It must be *zero*. So if you're concatenating streams that operate in fixed-size blocks (like CryptoStream in some situations) the streams will be closed before they are fully read, unless `count` matches the block size exactly. – KingPong Feb 14 '15 at 03:07
  • 4
    The idea of this solution is very good, but this implementation is really broken. I also don't understand why this is accepted given the other answers. Nobody should use this code ever. – usr Feb 14 '15 at 10:53
  • 1
    @LasseV.Karlsen This solution is not lazy at all: the Queue constructor enumerates the entire collection. – MuiBienCarlota Aug 05 '16 at 12:02
  • The Read method should return as soon as it gets a non-empty read from one of the streams, without attempting to use the whole buffer, as it is allowed by the specification. The way it's written, what happens if you read successfully from one stream and then the next one throws an exception? You can't return how many bytes you've read, you have a modified buffer and you've already destructively modified the first stream. That's difficult to fix and the whole issue can be simply avoided by returning immediately after the first successful read. – relatively_random Aug 12 '22 at 07:10
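Two suggestions from the comments above can be combined: keep the `IEnumerator<Stream>` instead of copying the sequence into a queue, and return from `Read` as soon as one source yields any bytes. The following is a minimal sketch under those assumptions, not the code from the answer; the class name `LazyConcatenatedStream` is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical name; a sketch of the lazy, return-early variant.
public sealed class LazyConcatenatedStream : Stream
{
    private IEnumerator<Stream> _sources; // advanced only when the current source is exhausted
    private Stream _current;

    public LazyConcatenatedStream(IEnumerable<Stream> sources)
    {
        if (sources == null) throw new ArgumentNullException(nameof(sources));
        _sources = sources.GetEnumerator();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (_sources == null) throw new ObjectDisposedException(nameof(LazyConcatenatedStream));
        if (count == 0) return 0;

        while (true)
        {
            if (_current == null)
            {
                if (!_sources.MoveNext()) return 0; // no sources left: end of stream
                _current = _sources.Current;
                if (_current == null) continue;     // skip null entries
            }

            int n = _current.Read(buffer, offset, count);
            if (n > 0) return n; // return the first non-empty read, even if the buffer is not full

            _current.Dispose();  // zero bytes means this source is finished
            _current = null;
        }
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing && _sources != null)
        {
            if (_current != null) _current.Dispose();
            _current = null;
            _sources.Dispose();
            _sources = null;
        }
        base.Dispose(disposing);
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override void Flush() { /* read-only: nothing to flush */ }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}
```

Because `Read` may return fewer bytes than requested, callers should loop until it returns 0 (or use `CopyTo`, which does that loop for them). Returning early also means an exception from a later source never discards bytes that were already read.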
10

So long as you only need reading, here's my implementation of such a stream:

NOTE! Position and Seek are broken and need to be fixed.

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

namespace LVK.IO
{
    /// <summary>
    /// This class is a <see cref="Stream"/> descendant that manages multiple underlying
    /// streams which are considered to be chained together to one large stream. Only reading
    /// and seeking are allowed; writing will throw exceptions.
    /// </summary>
    public class CombinedStream : Stream
    {
        private readonly Stream[] _UnderlyingStreams;
        private readonly Int64[] _UnderlyingStartingPositions;
        private Int64 _Position;
        private readonly Int64 _TotalLength;
        private int _Index;

        /// <summary>
        /// Constructs a new <see cref="CombinedStream"/> on top of the specified array
        /// of streams.
        /// </summary>
        /// <param name="underlyingStreams">
        /// An array of <see cref="Stream"/> objects that will be chained together and
        /// considered to be one big stream.
        /// </param>
        public CombinedStream(params Stream[] underlyingStreams)
        {
            if (underlyingStreams == null)
                throw new ArgumentNullException("underlyingStreams");
            foreach (Stream stream in underlyingStreams)
            {
                if (stream == null)
                    throw new ArgumentNullException("underlyingStreams[]");
                if (!stream.CanRead)
                    throw new InvalidOperationException("CanRead not true for all streams");
                if (!stream.CanSeek)
                    throw new InvalidOperationException("CanSeek not true for all streams");
            }

            _UnderlyingStreams = new Stream[underlyingStreams.Length];
            _UnderlyingStartingPositions = new Int64[underlyingStreams.Length];
            Array.Copy(underlyingStreams, _UnderlyingStreams, underlyingStreams.Length);

            _Position = 0;
            _Index = 0;

            _UnderlyingStartingPositions[0] = 0;
            for (int index = 1; index < _UnderlyingStartingPositions.Length; index++)
            {
                _UnderlyingStartingPositions[index] =
                    _UnderlyingStartingPositions[index - 1] +
                    _UnderlyingStreams[index - 1].Length;
            }

            _TotalLength =
                _UnderlyingStartingPositions[_UnderlyingStartingPositions.Length - 1] +
                _UnderlyingStreams[_UnderlyingStreams.Length - 1].Length;
        }

        /// <summary>
        /// Gets a value indicating whether the current stream supports reading.
        /// </summary>
        /// <value>
        /// <c>true</c>.
        /// </value>
        /// <returns>
        /// Always <c>true</c> for <see cref="CombinedStream"/>.
        /// </returns>
        public override Boolean CanRead
        {
            get
            {
                return true;
            }
        }

        /// <summary>
        /// Gets a value indicating whether the current stream supports seeking.
        /// </summary>
        /// <value>
        /// <c>true</c>.
        /// </value>
        /// <returns>
        /// Always <c>true</c> for <see cref="CombinedStream"/>.
        /// </returns>
        public override Boolean CanSeek
        {
            get
            {
                return true;
            }
        }

        /// <summary>
        /// Gets a value indicating whether the current stream supports writing.
        /// </summary>
        /// <value>
        /// <c>false</c>.
        /// </value>
        /// <returns>
        /// Always <c>false</c> for <see cref="CombinedStream"/>.
        /// </returns>
        public override Boolean CanWrite
        {
            get
            {
                return false;
            }
        }

        /// <summary>
        /// When overridden in a derived class, clears all buffers for this stream and causes any buffered data to be written to the underlying device.
        /// </summary>
        /// <exception cref="T:System.IO.IOException">An I/O error occurs. </exception>
        public override void Flush()
        {
            foreach (Stream stream in _UnderlyingStreams)
            {
                stream.Flush();
            }
        }

        /// <summary>
        /// Gets the total length in bytes of the underlying streams.
        /// </summary>
        /// <value>
        /// The total length of the underlying streams.
        /// </value>
        /// <returns>
        /// A long value representing the total length of the underlying streams in bytes.
        /// </returns>
        /// <exception cref="T:System.NotSupportedException">A class derived from Stream does not support seeking. </exception>
        /// <exception cref="T:System.ObjectDisposedException">Methods were called after the stream was closed. </exception>
        public override Int64 Length
        {
            get
            {
                return _TotalLength;
            }
        }

        /// <summary>
        /// Gets or sets the position within the current stream.
        /// </summary>
        /// <value></value>
        /// <returns>The current position within the stream.</returns>
        /// <exception cref="T:System.IO.IOException">An I/O error occurs. </exception>
        /// <exception cref="T:System.NotSupportedException">The stream does not support seeking. </exception>
        /// <exception cref="T:System.ObjectDisposedException">Methods were called after the stream was closed. </exception>
        public override Int64 Position
        {
            get
            {
                return _Position;
            }

            set
            {
                if (value < 0 || value > _TotalLength)
                    throw new ArgumentOutOfRangeException("Position");

                _Position = value;
                if (value == _TotalLength)
                {
                    // Fix for the Position/Seek bug noted above: keep _Position at the
                    // combined length rather than overwriting it with the last stream's length.
                    _Index = _UnderlyingStreams.Length - 1;
                }

                else
                {
                    while (_Index > 0 && _Position < _UnderlyingStartingPositions[_Index])
                    {
                        _Index--;
                    }

                    while (_Index < _UnderlyingStreams.Length - 1 &&
                           _Position >= _UnderlyingStartingPositions[_Index] + _UnderlyingStreams[_Index].Length)
                    {
                        _Index++;
                    }
                }
            }
        }

        /// <summary>
        /// Reads a sequence of bytes from the current stream and advances the position within the stream by the number of bytes read.
        /// </summary>
        /// <param name="buffer">An array of bytes. When this method returns, the buffer contains the specified byte array with the values between offset and (offset + count - 1) replaced by the bytes read from the current source.</param>
        /// <param name="offset">The zero-based byte offset in buffer at which to begin storing the data read from the current stream.</param>
        /// <param name="count">The maximum number of bytes to be read from the current stream.</param>
        /// <returns>
        /// The total number of bytes read into the buffer. This can be less than the number of bytes requested if that many bytes are not currently available, or zero (0) if the end of the stream has been reached.
        /// </returns>
        /// <exception cref="T:System.ArgumentException">The sum of offset and count is larger than the buffer length. </exception>
        /// <exception cref="T:System.ObjectDisposedException">Methods were called after the stream was closed. </exception>
        /// <exception cref="T:System.NotSupportedException">The stream does not support reading. </exception>
        /// <exception cref="T:System.ArgumentNullException">buffer is null. </exception>
        /// <exception cref="T:System.IO.IOException">An I/O error occurs. </exception>
        /// <exception cref="T:System.ArgumentOutOfRangeException">offset or count is negative. </exception>
        public override int Read(Byte[] buffer, int offset, int count)
        {
            int result = 0;
            while (count > 0)
            {
                _UnderlyingStreams[_Index].Position = _Position - _UnderlyingStartingPositions[_Index];
                int bytesRead = _UnderlyingStreams[_Index].Read(buffer, offset, count);
                result += bytesRead;
                offset += bytesRead;
                count -= bytesRead;
                _Position += bytesRead;

                if (count > 0)
                {
                    if (_Index < _UnderlyingStreams.Length - 1)
                        _Index++;
                    else
                        break;
                }
            }

            return result;
        }

        /// <summary>
        /// Sets the position within the current stream.
        /// </summary>
        /// <param name="offset">A byte offset relative to the origin parameter.</param>
        /// <param name="origin">A value of type <see cref="T:System.IO.SeekOrigin"></see> indicating the reference point used to obtain the new position.</param>
        /// <returns>
        /// The new position within the current stream.
        /// </returns>
        /// <exception cref="T:System.IO.IOException">An I/O error occurs. </exception>
        /// <exception cref="T:System.NotSupportedException">The stream does not support seeking, such as if the stream is constructed from a pipe or console output. </exception>
        /// <exception cref="T:System.ObjectDisposedException">Methods were called after the stream was closed. </exception>
        public override long Seek(long offset, SeekOrigin origin)
        {
            switch (origin)
            {
                case SeekOrigin.Begin:
                    Position = offset;
                    break;

                case SeekOrigin.Current:
                    Position += offset;
                    break;

                case SeekOrigin.End:
                    Position = Length + offset;
                    break;
            }

            return Position;
        }

        /// <summary>
        /// Throws <see cref="NotSupportedException"/> since the <see cref="CombinedStream"/>
        /// class does not support changing the length.
        /// </summary>
        /// <param name="value">The desired length of the current stream in bytes.</param>
        /// <exception cref="T:System.NotSupportedException">
        /// <see cref="CombinedStream"/> does not support this operation.
        /// </exception>
        public override void SetLength(long value)
        {
            throw new NotSupportedException("The method or operation is not supported by CombinedStream.");
        }

        /// <summary>
        /// Throws <see cref="NotSupportedException"/> since the <see cref="CombinedStream"/>
        /// class does not support writing to the underlying streams.
        /// </summary>
        /// <param name="buffer">An array of bytes.  This method copies count bytes from buffer to the current stream.</param>
        /// <param name="offset">The zero-based byte offset in buffer at which to begin copying bytes to the current stream.</param>
        /// <param name="count">The number of bytes to be written to the current stream.</param>
        /// <exception cref="T:System.NotSupportedException">
        /// <see cref="CombinedStream"/> does not support this operation.
        /// </exception>
        public override void Write(byte[] buffer, int offset, int count)
        {
            throw new NotSupportedException("The method or operation is not supported by CombinedStream.");
        }
    }
}
Lasse V. Karlsen
  • Nice, and from an existing library. I think it's a good general answer to the question as I posed it, though I prefer something that doesn't eagerly evaluate the streams, as I'd prefer to build them on demand. LVK is grabbing all of them to support efficient peeking, which makes sense, though in my case I don't have to support that operation. – Sebastian Good Oct 07 '10 at 21:05
  • In which case you should probably use the one provided by [@Hasan Khan](http://stackoverflow.com/users/36464/hasan-khan) [here](http://stackoverflow.com/questions/3879152/how-do-i-concatenate-two-system-io-stream-instances-into-one/3879231#3879231). – Lasse V. Karlsen Oct 08 '10 at 08:07
  • 1
    An implementation of the Dispose method that disposes the UnderlyingStreams should be added – NineBerry Feb 12 '14 at 08:41
  • The link doesn't work anymore. Not a problem for this answer because the full code is here, but I noticed you have other answers (e.g. [this one](http://stackoverflow.com/a/3141949/247702)) that don't embed the code. – user247702 Feb 11 '15 at 09:22
  • It's a read only stream, shouldn't you make `Flush()` method empty or throw `NotSupportedException`? – dbardakov Mar 19 '15 at 09:37
  • I would tend to agree but the documentation also mentions "clear the buffers", which *could* mean releasing memory holding buffered data from disk. As such I left it as is, could probably have done a better job of documenting why I left it in there. – Lasse V. Karlsen Mar 19 '15 at 09:47
  • The `Position` getter and the return value of `Seek` look broken. Proof: `var ms1 = new MemoryStream(new byte[] { 1, 2 }); var ms2 = new MemoryStream(new byte[] { 1, 2, 3 }); var combined = new CombinedStream(ms1, ms2); combined.Length.Dump(); // 5, good combined.Seek(0, SeekOrigin.End).Dump(); // actual 3, expected 5 combined.Position.Dump(); // actual 3, expected 5` – Pavel Martynov Jun 18 '15 at 10:39
  • 1
    I've created a GitHub project for it - https://github.com/lassevk/Streams - and will commit some tests and fixed code shortly (probably not before the weekend though) – Lasse V. Karlsen Jun 18 '15 at 13:36
9

Untested, but something like:

class StreamEnumerator : Stream
{
    private long position;
    bool closeStreams;
    IEnumerator<Stream> iterator;
    Stream current;
    private void EndOfStream() {
        if (closeStreams && current != null)
        {
            current.Close();
            current.Dispose();
        }
        current = null;
    }
    private Stream Current
    {
        get {
            if(current != null) return current;
            if (iterator == null) throw new ObjectDisposedException(GetType().Name);
            if (iterator.MoveNext()) {
                current = iterator.Current;
            }
            return current;
        }
    }
    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            EndOfStream();
            if (iterator != null) iterator.Dispose(); // Dispose may be called more than once
            iterator = null;
            current = null;
        }
        base.Dispose(disposing);
    }
    public StreamEnumerator(IEnumerable<Stream> source, bool closeStreams)
    {
        if (source == null) throw new ArgumentNullException("source");
        iterator = source.GetEnumerator();
        this.closeStreams = closeStreams;
    }
    public override bool CanRead { get { return true; } }
    public override bool CanWrite { get { return false; } }
    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotSupportedException();
    }
    public override void WriteByte(byte value)
    {
        throw new NotSupportedException();
    }
    public override bool CanSeek { get { return false; } }
    public override bool CanTimeout { get { return false; } }
    public override void SetLength(long value)
    {
        throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotSupportedException();
    }
    public override void Flush()
    { /* nothing to do */ }
    public override long Length
    {
        get { throw new NotSupportedException(); }
    }
    public override long Position
    {
        get { return position; }
        set { if (value != this.position) throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        int result = 0;
        while (count > 0)
        {
            Stream stream = Current;
            if (stream == null) break;
            int thisCount = stream.Read(buffer, offset, count);
            result += thisCount;
            count -= thisCount;
            offset += thisCount;
            if (thisCount == 0) EndOfStream();
        }
        position += result;
        return result;
    }
}
Marc Gravell
2

Edit: Made it clear this is a seekable option.

Here is an option that supports seeking, which is necessary in many situations. It is missing some functionality for disposing the underlying streams, and it could be a little more efficient if you were willing to assume that the underlying streams do not change length: total length and stream offsets could then be calculated once.

public sealed class SeekableConcatenatedStream : Stream
{
    List<Stream> streams;
    private long _Position { get; set; }

    public SeekableConcatenatedStream(List<Stream> streams)
    {
        foreach (var s in streams)
        {
          if (!s.CanSeek)
              throw new ArgumentException($"All provided streams must be seekable to create a {nameof(SeekableConcatenatedStream)}");
        }

        this.streams = streams;
        Seek(0, SeekOrigin.Begin);
    }

    public override bool CanRead
    {
        get { return true; }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (streams.Count == 0)
            return 0;

        var startStream = 0;
        var cumulativeCapacity = 0L;
        for (var i = 0; i < streams.Count; i++)
        {
            cumulativeCapacity += streams[i].Length;
            if (_Position < cumulativeCapacity)
            {
                startStream = i;
                break;
            }
        }

        var bytesRead = 0;
        var curStream = startStream;

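        while (_Position < Length && bytesRead < count && curStream < streams.Count)
        {
            var r = streams[curStream].Read(buffer, offset + bytesRead, count - bytesRead);
            if (r == 0)
            {
                // Only move on once the current stream is exhausted; a short read
                // does not mean the stream has ended.
                curStream++;
                continue;
            }
            bytesRead += r;
            Seek(_Position + r, SeekOrigin.Begin);
        }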

        return bytesRead;
    }

    public override bool CanSeek
    {
        get { return true; }
    }

    public override bool CanWrite
    {
        get { return false; }
    }

    public override void Flush()
    {
        throw new NotImplementedException();
    }

    public override long Length
    {
        get {
            long length = 0;
            for (var i = 0; i < streams.Count; i++)
            {
                length += streams[i].Length;
            }
            return length;
        }
    }

    public override long Position
    {
        get
        {
            return _Position;
        }
        set
        {
            Seek(value, SeekOrigin.Begin);
        }
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        if (origin == SeekOrigin.Begin)
        {
            _Position = offset;

            var prevLength = 0L;
            var cumulativeLength = 0L;
            for (var i = 0; i < streams.Count; i++)
            {
                cumulativeLength += streams[i].Length;
                if (offset < cumulativeLength)
                {
                    streams[i].Seek(offset - prevLength, SeekOrigin.Begin);
                    return _Position;
                }
                prevLength = cumulativeLength;
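            }

            // The offset is at or past the end (or there are no streams), so there is
            // nothing to reposition. Without this return, the method fell through to the throw below.
            return _Position;
        }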

        if (origin == SeekOrigin.Current)
        {
            var newAbs = _Position + offset;
            return Seek(newAbs, SeekOrigin.Begin);
        } 
        else if(origin == SeekOrigin.End)
        {
            var newAbs = Length + offset; // offsets relative to the end are zero or negative
            return Seek(newAbs, SeekOrigin.Begin);
        }

        throw new NotImplementedException();
    }

    public override void SetLength(long value)
    {
        throw new NotImplementedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotImplementedException();
    }
}
kriskalish
  • This is not a good solution because not all `Stream` implementations provide seeking abilities (and therefore not all streams provide `Length`). Your code extensively relies on `Length` being available, so it will only work on streams that support seeking. – ProgrammingLlama Dec 23 '21 at 01:02
  • 1
    I tried to put the disclaimer up front with the statement `Here is an option that supports seeking`. It's perfectly reasonable to assume that there are problems out there that rely on seekable streams to be solvable. If you wanted to provide a concatenated stream in such a context, it would also need to be seekable. Therefore, it is not fair to say it is "not a good solution". Your point raises a reasonable improvement which is that constructing the seekable concatenated stream should fail if any of the sub streams are not seekable. If I have time, I'll update the code to reflect it. – kriskalish Jan 11 '22 at 17:16
  • It's certainly possible to concatenate streams that aren't seekable, as demonstrated by the accepted answer. Your addition of `CanSeek` at least makes your answer fail at a better point in time in such scenarios, so I'll remove my downvote. – ProgrammingLlama Jan 12 '22 at 01:11
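The answer above notes that total length and stream offsets could be calculated once if the underlying lengths never change. A minimal sketch of that idea follows, assuming seekable streams with stable lengths; the helper name `StreamOffsetIndex` and its `TryLocate` method are hypothetical and not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: maps a position in the combined stream to
// (index of the underlying stream, offset inside that stream).
public sealed class StreamOffsetIndex
{
    private readonly long[] _starts; // _starts[i] = combined position where stream i begins

    public long TotalLength { get; }

    public StreamOffsetIndex(IReadOnlyList<Stream> streams)
    {
        if (streams == null) throw new ArgumentNullException(nameof(streams));
        _starts = new long[streams.Count];
        long total = 0;
        for (int i = 0; i < streams.Count; i++)
        {
            _starts[i] = total;
            total += streams[i].Length; // assumption: lengths do not change afterwards
        }
        TotalLength = total;
    }

    // Returns false when position is outside [0, TotalLength).
    public bool TryLocate(long position, out int streamIndex, out long localOffset)
    {
        streamIndex = -1;
        localOffset = 0;
        if (position < 0 || position >= TotalLength) return false;

        int found = Array.BinarySearch(_starts, position);
        if (found < 0)
        {
            found = ~found - 1; // last stream that starts before position
        }
        else
        {
            // Empty streams share a start offset; the last one with this start holds the data.
            while (found + 1 < _starts.Length && _starts[found + 1] == position) found++;
        }

        streamIndex = found;
        localOffset = position - _starts[found];
        return true;
    }
}
```

With such an index, `Length` becomes a field read and each `Seek` or `Read` can find its starting stream with a binary search instead of summing `Length` over every stream on each call.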
1

Why not use a container that already encapsulates the idea of multiple files, like say using ZipOutputStream from SharpZipLib?

Ed Courtenay
  • 1
    Fair enough, though in fact the data I'm sending isn't files; I'm building it on the fly, so I'd like to have as little of it in memory as possible. – Sebastian Good Oct 07 '10 at 21:02
  • 1
    It's irrelevant whether you're sending files or not; you simply write the streams you're concerned with to the `ZipOutputStream`, prefixing each stream with a `ZipEntry` marker – Ed Courtenay Oct 08 '10 at 14:24
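The container approach described above can be sketched as follows. This sketch swaps in the built-in `System.IO.Compression.ZipArchive` rather than SharpZipLib's `ZipOutputStream`, so it illustrates the idea and not that library's API; the `ZipBundle` class and its `WriteAll` method are hypothetical names. Note that it writes to an output stream, so it fits the case where the consumer can accept pushed bytes.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper: writes each named source stream as one zip entry.
public static class ZipBundle
{
    public static void WriteAll(Stream output, IEnumerable<KeyValuePair<string, Stream>> sources)
    {
        // leaveOpen: true so the caller keeps ownership of the output stream.
        using (var archive = new ZipArchive(output, ZipArchiveMode.Create, true))
        {
            foreach (var source in sources)
            {
                ZipArchiveEntry entry = archive.CreateEntry(source.Key);
                using (Stream entryStream = entry.Open())
                {
                    // Copied through in chunks; the sequence is enumerated one source at a time.
                    source.Value.CopyTo(entryStream);
                }
            }
        }
    }
}
```

Since `sources` is an `IEnumerable`, the individual streams can be produced on demand by an iterator, one entry at a time.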