
I am trying to use System.Net.WebSockets.WebSocket with ASP.NET Core, but I run into a problem when receiving Chinese text.

This is my code:

while (_socket.State == WebSocketState.Open)
{
    var result = await _socket.ReceiveAsync(_reciveBufferSegment, _source.Token);

    if (_source.Token.IsCancellationRequested) return;
    if (result.MessageType != WebSocketMessageType.Close)
    {
        if (_source.Token.IsCancellationRequested) return;
        await _recivedData.WriteAsync(_reciveBuffer, 0, result.Count);
        if (result.EndOfMessage)
        {
            _recivedData.Position = 0;
            var data = _recivedData.ToArray();
            _recivedData.SetLength(0);
            switch (result.MessageType)
            {
                case WebSocketMessageType.Binary:
                    if (OnBinaryMessage != null)
                        Task.Run(() => OnBinaryMessage(this, data));
                    break;
                case WebSocketMessageType.Text:
                    string decode = null;
                    try
                    {
                        decode = Encoding.UTF8.GetString(data);
                    }
                    catch
                    {
                        decode = "UTF8 decode failure";
                    }
                    if (OnTextMessage != null)
                        Task.Run(() => OnTextMessage(this, decode));
                    break;
            }
        }
    }
}

If the client sends a 50-byte text message containing multi-byte characters and the server receives it with a buffer smaller than 50 bytes, so that a multi-byte character is cut across two reads, an invalid UTF-8 exception occurs.
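To illustrate what "cutting" a multi-byte character means, here is a minimal standalone sketch, separate from the server code above. The class name `SplitUtf8Demo`, the sample string, and the chunk size of 4 are all assumptions made for the illustration. It only shows that decoding each fragment on its own corrupts a character split at a fragment boundary, while a stateful `System.Text.Decoder` carries the partial bytes over to the next call. It does not show what ASP.NET Core does internally.

```csharp
using System;
using System.Text;

public static class SplitUtf8Demo
{
    // Decodes every chunk independently. A character whose bytes straddle
    // a chunk boundary cannot be decoded and becomes U+FFFD.
    public static string DecodePerChunk(byte[] data, int chunkSize)
    {
        var sb = new StringBuilder();
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            sb.Append(Encoding.UTF8.GetString(data, offset, count));
        }
        return sb.ToString();
    }

    // Uses one stateful Decoder for the whole message. Trailing partial
    // bytes of a chunk are buffered and completed by the next chunk.
    public static string DecodeWithDecoder(byte[] data, int chunkSize)
    {
        Decoder decoder = Encoding.UTF8.GetDecoder();
        var sb = new StringBuilder();
        var chars = new char[Encoding.UTF8.GetMaxCharCount(chunkSize)];
        for (int offset = 0; offset < data.Length; offset += chunkSize)
        {
            int count = Math.Min(chunkSize, data.Length - offset);
            bool flush = offset + count == data.Length; // last chunk of the message
            int written = decoder.GetChars(data, offset, count, chars, 0, flush);
            sb.Append(chars, 0, written);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string text = "你好"; // two characters, three UTF-8 bytes each
        byte[] data = Encoding.UTF8.GetBytes(text);

        // A 4-byte chunk cuts the second character after its first byte.
        Console.WriteLine(DecodePerChunk(data, 4) == text);    // False
        Console.WriteLine(DecodeWithDecoder(data, 4) == text); // True
    }
}
```

Note that `Encoding.UTF8` replaces invalid bytes with U+FFFD instead of throwing, so a `try`/`catch` around `Encoding.UTF8.GetString` would not observe this kind of failure. An encoding created with `new UTF8Encoding(false, true)` throws on invalid input.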

I ported the same code to .NET Framework 4.5.2 hosted on IIS, and it works fine. Did I miss something? Or is there any configuration that disables the UTF-8 check?

A testable solution is available on GitHub.

Hsin-Yu Chen
  • The two servers are returning different HTTP headers. Use Fiddler and compare the two sets of headers. The solution is usually to add the missing headers to the request so your code will work on both servers. – jdweng Aug 02 '16 at 10:38
  • Double-byte text isn't UTF-8, it's UTF-16. One byte = 8 bits. Two bytes = 16 bits. So if the text is consistently 2-byte, you just have to test whether it's UTF16LE or UTF16BE – Nyerguds Aug 02 '16 at 10:51
  • UTF-8 is variable-length; it can encode all possible characters – Hsin-Yu Chen Aug 02 '16 at 11:04
  • @DavidG Thanks for the edit – Hsin-Yu Chen Aug 02 '16 at 15:03
  • @jdweng There doesn't seem to be much difference between the two sets of request/response headers. [Chrome network panel](http://imgur.com/a/jGHQv) – Hsin-Yu Chen Aug 02 '16 at 17:09
