
I am writing a service that returns large data sets to clients. Ideally, I want to return an IEnumerable of Entity so that both the service and the client get the performance benefits of lazy evaluation. I also want to compress the stream to reduce bandwidth.

I was able to serialize an IEnumerable to a stream, compress it with GZip, and deserialize it successfully. However, my implementation doesn't achieve the laziness part of my goal.

I've read solutions to similar questions, but they all involve returning an IEnumerable of byte. Ideally, I want the client to receive an IEnumerable of Entity and yield return each item as it is deserialized.

[DataContract]
public class Entity
{
    [DataMember]
    public int Id { get; set; }
    [DataMember]
    public string Code { get; set; }
    [DataMember]
    public string Description { get; set; }
}

[Test]
public void TestSerialEnumGzip()
{
    var e = GetEnum();
    var s = SerializeToStreamGzip(e);
    Console.WriteLine($" TestSerialGzip stream size {s.Length}");
    var b = DeserializeFromStreamGzip<IEnumerable<Entity>>(s);
}

private IEnumerable<Entity> GetEnum()
{
    for (var x = 0; x < 10; ++x)
    {
        Console.WriteLine($"yielding {x}");
        yield return new Entity { Id = x, Code = x.ToString(), Description = x.ToString() };
    }
}

private Stream SerializeToStreamGzip<T>(T toSerialize)
{
    var s = new MemoryStream();
    using (var gz = new GZipStream(s, CompressionMode.Compress, true))
    {
        var ser = new DataContractSerializer(typeof(T));
        ser.WriteObject(gz, toSerialize);
    }
    s.Seek(0, SeekOrigin.Begin);
    return s;
}

private T DeserializeFromStreamGzip<T>(Stream stream)
{
    var ser = new DataContractSerializer(typeof(T));
    // ReadObject materializes the whole object graph, so the GZipStream
    // can be disposed as soon as it returns.
    using (var gz = new GZipStream(stream, CompressionMode.Decompress))
    {
        return (T)ser.ReadObject(gz);
    }
}
TheGeneral
Paul Tsai
  • Explain what you want from laziness – TheGeneral Mar 18 '18 at 01:21
  • `However, my implementation doesn't achieve the Laziness part of my goal.` How did you come to that conclusion? – mjwills Mar 18 '18 at 01:22
  • @MichaelRandall On the server, the entire dataset may take a while to gather, so I would like to return the stream as soon as possible. That way the client can start receiving data even though the service has not prepared all of it yet. It has the added benefit that the service doesn't need memory to pack up the entire dataset. – Paul Tsai Mar 18 '18 at 01:28
  • @mjwills The "yielding" console writes occur before the stream size write. This means the IEnumerable was fully iterated on the service before the client received anything. – Paul Tsai Mar 18 '18 at 01:30
  • So you are trying to **stream** data from the server to the client? – mjwills Mar 18 '18 at 01:33
  • @mjwills Yes. But I would like it to utilize IEnumerable if possible. In my code, the entire set of Entity does not need to be instantiated in order to be sent. The entity could be instantiated one at a time as necessary. – Paul Tsai Mar 18 '18 at 01:38
  • What you're trying to do actually gets a lot more complex. Instead of having `SerializeToStreamGzip` serialize all elements from `e` in a single step, you would need to serialize them one-by-one and somehow find a way of being able to continue that only after a handle to your Stream is returned to WCF. Therefore, an instance of `MemoryStream` (acting as a single buffer, upfront) is inadequate and you would have to return your own implementation of `Stream` for which WCF can continually invoke the `Read` method to get following chunks. I would probably first look for a library that can do this – Biscuits Mar 18 '18 at 02:04
  • @Biscuits Would you know of any library that would do this? – Paul Tsai Mar 18 '18 at 03:01
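
The one-by-one serialization described in the comments above can be sketched roughly as follows. This is a minimal illustration, not a complete WCF solution: the `EntityStreaming` class, its method names, and the `Entities` wrapper element are hypothetical, and the `Entity` class is repeated only to keep the block self-contained. The read side is genuinely lazy, since each `Entity` is deserialized only when the caller asks for the next item. The write side still runs to completion when it targets a `MemoryStream`; getting WCF to pull chunks on demand would additionally need a custom `Stream`, which is not shown here.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Runtime.Serialization;
using System.Xml;

[DataContract]
public class Entity
{
    [DataMember] public int Id { get; set; }
    [DataMember] public string Code { get; set; }
    [DataMember] public string Description { get; set; }
}

public static class EntityStreaming
{
    // Hypothetical wrapper element so the output is a single well-formed XML document.
    private const string RootName = "Entities";

    // Writes each entity as its own element, pulling from the source one item at a time.
    public static void WriteAll(Stream output, IEnumerable<Entity> items)
    {
        var ser = new DataContractSerializer(typeof(Entity));
        using (var gz = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
        using (var writer = XmlWriter.Create(gz))
        {
            writer.WriteStartElement(RootName);
            foreach (var item in items)
                ser.WriteObject(writer, item);
            writer.WriteEndElement();
        }
    }

    // Yields each entity as soon as it has been deserialized.
    // The streams are disposed when enumeration finishes or the enumerator is disposed.
    public static IEnumerable<Entity> ReadLazily(Stream input)
    {
        var ser = new DataContractSerializer(typeof(Entity));
        using (var gz = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = XmlReader.Create(gz))
        {
            reader.MoveToContent();
            reader.ReadStartElement(RootName);
            while (ser.IsStartObject(reader))
                yield return (Entity)ser.ReadObject(reader);
        }
    }
}
```

On the client, `foreach (var entity in EntityStreaming.ReadLazily(responseStream))` would then process items as they arrive, where `responseStream` stands for whatever stream the service call returns.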

1 Answer

I think you might be a little confused about IEnumerable. That aside, you really should focus your research on WCF Streaming.

Check out the blog post Custom WCF Streaming and its associated example. It basically encapsulates everything you want, and it uses BinaryFormatter.

If you wanted to take it a step further, you could probably make use of Protocol Buffers via Protobuf-net, or add your own ad hoc compression. However, I leave those details up to you.

The basic idea is this: we have two threads. One thread executes the complex database query, and the other streams database rows to the client. We alter the database query so that it returns only 1000 rows at a time, and modify the WCF service to stream those 1000 rows to the client. While the WCF service is streaming rows to the client, it runs the database query again on a different thread to get the next 1000 rows. This way, as soon as the WCF service finishes streaming one set of rows, the next set is already available to stream.


  1. WCF Client calling WCF service
  2. WCF Service executing database query
  3. Database returns dataset to WCF service
  4. WCF Service response
  5. Second database query executed by WCF service
  6. WCF Stream response
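
The two-thread idea in the steps above can be sketched with a bounded producer/consumer queue. This is only an illustration under stated assumptions: `PrefetchingReader` and the `fetchPage` delegate are hypothetical names, with `fetchPage(pageIndex)` standing in for the database query and returning an empty list once no rows are left. It is not the code from the linked blog post.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class PrefetchingReader
{
    // fetchPage is a hypothetical stand-in for the paged database query.
    public static IEnumerable<T> ReadAll<T>(Func<int, IReadOnlyList<T>> fetchPage)
    {
        // Capacity 1: the producer may hold one finished page in the queue
        // while the consumer is still handing out rows from the previous one.
        var pages = new BlockingCollection<IReadOnlyList<T>>(boundedCapacity: 1);
        var cts = new CancellationTokenSource();

        var producer = Task.Run(() =>
        {
            try
            {
                for (var pageIndex = 0; ; pageIndex++)
                {
                    var rows = fetchPage(pageIndex);
                    if (rows.Count == 0) break;      // no more data
                    pages.Add(rows, cts.Token);      // blocks while the queue is full
                }
            }
            catch (OperationCanceledException)
            {
                // The consumer stopped early; nothing left to do.
            }
            finally
            {
                pages.CompleteAdding();
            }
        });

        try
        {
            foreach (var rows in pages.GetConsumingEnumerable())
                foreach (var row in rows)
                    yield return row;

            // Surfaces an exception thrown by fetchPage, if any.
            producer.GetAwaiter().GetResult();
        }
        finally
        {
            // Unblocks the producer if the caller abandons the enumeration.
            cts.Cancel();
        }
    }
}
```

The bounded capacity is the design choice that matters here: it keeps the query at most a page or two ahead of the consumer, so memory use stays flat regardless of the total row count.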
Bouke
TheGeneral
  • Thanks. This isn't what I'm looking for, but it may help me get to where I want to be. – Paul Tsai Mar 18 '18 at 14:59
  • @PaulTsai Unfortunately, what you want to do can't be done out of the box, and I don't know of any library that will do it. However, as you noted, the example is basically a common solution to this problem. I think another solution, which is much more stable, is to just ask for 1000 records at a time along with a count of what's left; when the client finishes processing them, it asks for the next 1000. You'd be surprised at how efficient this can be. Anyway, I think you know the problems and potential solutions well enough now. You could also put a bounty on this, and it might draw out more answers. – TheGeneral Mar 18 '18 at 22:05
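
The client-driven paging suggested in the last comment can be sketched as follows. All names here are hypothetical: `Page<T>`, `PagedClient`, and the `getPage(skip, take)` delegate stand in for whatever service operation returns one batch plus the number of records still remaining. The client still sees a lazy IEnumerable, but each batch is a separate, ordinary request.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical response shape: one batch of items plus a count of what is left.
public sealed class Page<T>
{
    public IReadOnlyList<T> Items { get; set; }
    public int Remaining { get; set; }
}

public static class PagedClient
{
    // getPage(skip, take) is a hypothetical stand-in for the service call.
    public static IEnumerable<T> ReadAll<T>(Func<int, int, Page<T>> getPage, int pageSize = 1000)
    {
        var skip = 0;
        while (true)
        {
            var page = getPage(skip, pageSize);

            foreach (var item in page.Items)
                yield return item;

            skip += page.Items.Count;

            // Stop when the service reports nothing left, or returns an empty batch.
            if (page.Remaining <= 0 || page.Items.Count == 0)
                yield break;
        }
    }
}
```

The next request is only issued once the caller has consumed the current batch, which is what makes this approach simple and stable compared with a long-lived stream.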