I have an issue writing the character 56623 to a stream using a StreamWriter in UTF-16 (the issue persists in other encodings as well). When I get the buffer from the stream, it contains the value 65533 instead of what I originally wrote. The issue turned up during randomised unit tests, and it does not appear for the values 60000 or 95.
To illustrate, I have a minimal program to check the behaviour:
char value = (char)56623;
MemoryStream stream = new MemoryStream();
StreamWriter writer = new StreamWriter(stream, Encoding.Unicode);
writer.Write(value);
writer.Close();
var byteArray = BitConverter.GetBytes(value); // Reference bytes
var buffer = stream.GetBuffer(); // Bytes actually written (GetBuffer belongs to MemoryStream, not StreamWriter)
By reading byteArray and buffer I get:
byteArray = [47,221] = 00101111 11011101 = 56623
buffer = [255,254,253,255,...] = BOM 11111101 11111111 ... = BOM 65533
Thus, the written value 65533 is clearly not equal to the original 56623. However, when trying with the value 60000 the correct values are written:
byteArray = [96,234] = 01100000 11101010 = 60000
buffer = [255,254,96,234,...] = BOM 01100000 11101010 ... = BOM 60000
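To make the comparison repeatable across the values I tried, it can be sketched roughly as below. The class and method names (RoundTripCheck, WrittenBytes, RoundTrips) are hypothetical helpers made up for this post, and the comparison against BitConverter assumes a little-endian machine, so that its byte order matches Encoding.Unicode (UTF-16LE).

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class RoundTripCheck
{
    // Hypothetical helper: writes a single char through a StreamWriter and
    // returns the bytes that follow the 2-byte UTF-16 BOM (255, 254).
    public static byte[] WrittenBytes(char value)
    {
        var stream = new MemoryStream();
        using (var writer = new StreamWriter(stream, Encoding.Unicode))
        {
            writer.Write(value);
        } // disposing the writer flushes it and closes the underlying stream

        // ToArray still works on a closed MemoryStream and, unlike GetBuffer,
        // returns only the bytes that were actually written.
        return stream.ToArray().Skip(2).ToArray();
    }

    // True when the written bytes equal the raw 16-bit value of the char.
    // Assumes a little-endian machine (BitConverter.IsLittleEndian == true).
    public static bool RoundTrips(char value)
    {
        return WrittenBytes(value).SequenceEqual(BitConverter.GetBytes(value));
    }
}
```

Calling RoundTripCheck.RoundTrips((char)56623) gives false for me, while the same call with 60000 or 95 gives true, which matches the buffers listed above.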
I fail to understand why this is the behaviour, but I am reluctant to believe that there is an issue with the implementation of StreamWriter, so there has to be something I am missing.
What is it that I am not seeing here?
Thank you!