Greetings good people,

I have a rather strange issue (to me at least) with WebClient when reading from a continuous data stream, and I'm not sure where the problem lies. The data arrives almost as expected, except for the last line of each set. When the next set arrives, the previously unprinted line is printed above the new data.

For example, a set of lines is retrieved and it could look like this:

<batch name="home">
    <event id="1"/>

And when the next set arrives, it contains the missing end block from the above set:

</batch>
<batch name="home">
    <event id="2"/>

The code presented is simplified, but hopefully it is enough to give a clear picture.

WebClient _client = new WebClient();

_client.OpenReadCompleted += (sender, args) =>
{
    using (var reader = new StreamReader(args.Result))
    {
        while (!reader.EndOfStream)
        {
            Console.WriteLine(reader.ReadLine());
        }
    }
};

_client.OpenReadAsync(new Uri("http://localhost:1234/testdata?keep=true"));

In this setup reader.EndOfStream never becomes true because the stream doesn't end. Does anyone have a suggestion on how to retrieve the last line? Am I missing something, or could the fault be with the API?

Kind regards :)

XenoPsy
  • First, don't use WebClient. It's an obsolete class that was meant to be used as a WinForms component for easy uploads and downloads. Use HttpClient instead. Second, XML strings may contain no newlines at all. Whitespace outside elements is *not* significant. The service may not be sending `\n` or `\r\n` after `</batch>`. – Panagiotis Kanavos Jun 02 '22 at 09:32
  • You could just use `Console.WriteLine(reader.ReadToEnd());` – Charlieface Jun 02 '22 at 09:40
  • @PanagiotisKanavos, I'll look into HttpClient instead. But as for the XML string, so even if the line has a length of more than 0, it gets ignored because of missing return-characters? – XenoPsy Jun 02 '22 at 10:06
  • @Charlieface, reader.ReadToEnd() won't work because it finishes at the end of the stream, which is missing in this case. Thanks anyway :) – XenoPsy Jun 02 '22 at 10:11
  • As mentioned in the docs, `ReadLine` needs a line feed, `ReadToEnd` does not – Charlieface Jun 02 '22 at 10:12
  • That's not the point. The point is that XML doesn't care about newlines. Newlines aren't missing, they simply don't matter. Unless the service sends a character or byte that clearly identifies the end of a message, you'll have to use a different way of parsing the data. You could use XmlTextReader to read elements until you find the start of `batch` then read the element's XML with `ReadOuterXml` as shown [in this question](https://stackoverflow.com/questions/13642633/using-xmlreader-class-to-parse-xml-with-elements-of-the-same-name). – Panagiotis Kanavos Jun 02 '22 at 10:12
  • You may have to configure the reader to read XML fragments if the service doesn't start with a proper root element, by passing `new XmlReaderSettings{ConformanceLevel = ConformanceLevel.Fragment}` as the reader settings – Panagiotis Kanavos Jun 02 '22 at 10:13
  • @Charlieface the OP is trying to read XML fragments from a stream of messages, not a single document – Panagiotis Kanavos Jun 02 '22 at 10:13
  • @PanagiotisKanavos Then they need some kind of signalling or heuristic to know when the end of a fragment is. – Charlieface Jun 02 '22 at 10:15
  • @XenoPsy is this your own service? It would be far better to use a protocol and format suitable for message streaming. gRPC for example supports streaming out of the box. SignalR also supports streaming using WebSockets. Keeping a client stream open for long is very expensive for the server too, and very fragile. Either technology would result in far smaller messages, conserving bandwidth – Panagiotis Kanavos Jun 02 '22 at 10:25
  • @PanagiotisKanavos, sadly it's not my own service and this is the only protocol they offer. It seems that I have to ditch the stream and send periodic requests instead. That way I get complete data sets. Anyway, thanks for your assistance :) – XenoPsy Jun 03 '22 at 05:19

1 Answer

It seems there's simply no newline character after the closing </batch> tag. In XML, whitespace between elements, including newlines, isn't significant, so no newline is required. XML doesn't allow multiple root elements in a single document though, which makes this scenario a bit unusual.
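To make the point about root elements concrete, here is a minimal sketch, assuming two batches arrive back to back with no separator. The RootElementDemo class and its Sample string are made up for illustration; XDocument.Parse, XmlReader.Create and ConformanceLevel.Fragment are the only real API involved.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class RootElementDemo
{
    // Hypothetical sample: two batches back to back, no newline, no common root.
    public const string Sample =
        "<batch name=\"home\"><event id=\"1\"/></batch>" +
        "<batch name=\"home\"><event id=\"2\"/></batch>";

    // Returns true only if the text is a well-formed XML document.
    public static bool ParsesAsDocument(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            // A second root element makes the parser throw.
            return false;
        }
    }

    // Counts the top-level elements by reading the text as an XML fragment.
    public static int CountTopLevelElements(string xml)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using var reader = XmlReader.Create(new StringReader(xml), settings);
        var count = 0;
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Depth == 0)
            {
                count++;
            }
        }
        return count;
    }
}
```

With this sample, ParsesAsDocument should return false while CountTopLevelElements finds both batches, which is why a fragment-aware reader is needed for this kind of stream.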

In streaming scenarios it's common to send each message unindented (i.e. on a single line) and to send either a newline or some other uncommon character to mark the end of the message. One would expect either no newlines at all, or a newline after each batch, e.g.:

<batch name="home"><event id="1"/>...</batch>
<batch name="home"><event id="2"/>...</batch>
<batch name="home"><event id="3"/>...</batch>

In that case you could simply use ReadLine to read each message:

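var client = new HttpClient();
// GetStreamAsync returns a Task<Stream>, so it has to be awaited
using var stream = await client.GetStreamAsync(serviceUrl);

using var reader = new StreamReader(stream);
string msg;
// ReadLine returns null once the server closes the stream
while ((msg = reader.ReadLine()) != null)
{
    var doc = XDocument.Parse(msg);
    ...
}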

Without another way to identify each message though, you'll have to read each element from the stream. Luckily, LINQ to XML makes it a bit easier to read elements:

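using var reader = XmlReader.Create(stream, new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment });
reader.MoveToContent();
while (!reader.EOF)
{
    if (reader.NodeType == XmlNodeType.Element && reader.Name == "batch")
    {
        // ReadFrom consumes the whole element and leaves the reader on the
        // node that follows it, so Read() must not be called again here,
        // otherwise a batch that follows immediately would be skipped.
        XElement el = XElement.ReadFrom(reader) as XElement;
        //Process the batch!
    }
    else
    {
        reader.Read();
    }
}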
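The element-by-element approach can be tried without the service by feeding it an in-memory string. The following is a sketch only: the BatchReader class and its ReadBatches method are hypothetical names, and a TextReader stands in for the network stream.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;

public static class BatchReader
{
    // Yields every top-level <batch> element found in the input.
    // The input is treated as an XML fragment, so no common root is needed.
    public static IEnumerable<XElement> ReadBatches(TextReader input)
    {
        var settings = new XmlReaderSettings { ConformanceLevel = ConformanceLevel.Fragment };
        using var reader = XmlReader.Create(input, settings);
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "batch")
            {
                // ReadFrom advances the reader past the element it returns,
                // so no extra Read() call is made in this branch.
                yield return (XElement)XNode.ReadFrom(reader);
            }
            else
            {
                // Skip whitespace and anything else between batches.
                reader.Read();
            }
        }
    }
}
```

Usage could look like `foreach (var batch in BatchReader.ReadBatches(new StreamReader(stream))) { ... }`. On a live stream, how promptly each batch surfaces depends on how far the reader has to read ahead, and this has not been tested against the actual service.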
Panagiotis Kanavos