I have an application that fetches XML over the network (using TcpClient). The XML documents come in various encodings (one site uses UTF-8, another Windows-1252). I would like to always convert them to UTF-8 so that I know the data is clean.

How can I go from the NetworkStream to an XElement so that all the data is decoded correctly?

This is what I have:

NetworkStream _clientStream = /* ... */;
MemoryStream _responseBytes = new MemoryStream();

// serverEncoding -> Xml Encoding I get from server
// _UTF8Encoder -> Local encoder (always UTF8)

try
{
    _clientStream.CopyTo(_responseBytes);

    if (serverEncoding != _UTF8Encoder)
    {
        MemoryStream encodedStream = new MemoryStream();
        string line = null;
        using (StreamReader reader = new StreamReader(_responseBytes))
        {
            using (StreamWriter writer = new StreamWriter(encodedStream))
            {
                while ((line = reader.ReadLine()) != null)
                {
                    writer.WriteLine(
                        Encoding.Convert(serverEncoding, _UTF8Encoder, serverEncoding.GetBytes(line))
                        );
                }
            }
        }
        _responseBytes = encodedStream;
    }

    _responseBytes.Position = 0;
    using (XmlReader reader = XmlReader.Create(_responseBytes))
    {
        xmlResult = XElement.Load(reader, LoadOptions.PreserveWhitespace);
    }
}
catch (Exception ex)
{ }

Do you have a better solution (one that also ignores all '\0' characters)?


Edit

This works:

byte[] b = _clientStream.ReadToEnd();
var text = _UTF8Encoder.GetString(b, 0, b.Length);
xmlResult = XElement.Parse(text, LoadOptions.PreserveWhitespace);

But this does not:

using (var reader = new StreamReader(_clientStream, false))
    xmlResult = XElement.Load(reader, LoadOptions.PreserveWhitespace);

I don't understand why...

Arnaud F.

1 Answer

You can simply create a StreamReader around the NetworkStream, passing the encoding of the stream, then pass it to XElement.Load:

XElement elem;
using (var reader = new StreamReader(_clientStream, serverEncoding))
    elem = XElement.Load(reader);

There is no point in manually transcoding it to a different encoding.
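To illustrate why transcoding is unnecessary, here is a minimal, self-contained sketch. It assumes the response has already been read completely into a byte array (as in the question's edit); the class name XmlDecodingSketch and the helper Parse are hypothetical, and ISO-8859-1 stands in for Windows-1252 because that code page may need an extra encoding provider on newer .NET runtimes. Once the bytes are decoded with the server's encoding, the XElement holds ordinary .NET strings, and UTF-8 only comes into play when the XML is written out again.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

static class XmlDecodingSketch
{
    // Hypothetical helper: decodes a fully buffered response with the
    // encoding reported by the server and parses it into an XElement.
    public static XElement Parse(byte[] responseBytes, Encoding serverEncoding)
    {
        using (var stream = new MemoryStream(responseBytes))
        using (var reader = new StreamReader(stream, serverEncoding))
        {
            return XElement.Load(reader, LoadOptions.PreserveWhitespace);
        }
    }

    public static void Main()
    {
        // Assumption: ISO-8859-1 is used here as a stand-in for Windows-1252.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        byte[] bytes = latin1.GetBytes("<root>caf\u00e9</root>");

        XElement elem = Parse(bytes, latin1);
        Console.WriteLine(elem.Value);

        // The target encoding matters only when serializing again.
        byte[] utf8 = Encoding.UTF8.GetBytes(elem.ToString());
        Console.WriteLine(utf8.Length);
    }
}
```

Working from a buffered byte array rather than the live NetworkStream also avoids a parser that waits on a connection the server has not closed yet, which may be what is happening in the question.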

SLaks
  • It doesn't work. The program blocks on "Load" because it doesn't perform a Read operation... But when I call reader.BaseStream.Read(), I get a result back. – Arnaud F. Mar 22 '11 at 15:47
  • That shouldn't happen. Are you sure you have the correct encoding? – SLaks Mar 22 '11 at 15:50
  • Yes, I'm sure... Does it block when the encoding is not correct!? If so, why not just throw an exception? – Arnaud F. Mar 22 '11 at 15:53
  • If you pass an encoding with multi-byte chars, it can block forever while waiting for the next byte in a multi-byte char if it's at the end of the stream. – SLaks Mar 22 '11 at 15:56
  • I removed encoding from StreamReader, same problem ... `using(var reader = new StreamReader(_clientStream, false)) elem = XElement.Load(reader, LoadOptions.PreserveWhitespace);` – Arnaud F. Mar 22 '11 at 16:00