For a customer project, a query is made against a DB and the results are written to a file. The file is required to be in Shift JIS as it is later used as input for another legacy system. The Wikipedia article indicates that:
The single-byte characters 0x00 to 0x7F match the ASCII encoding, except for a yen sign (U+00A5) at 0x5C and an overline (U+203E) at 0x7E in place of the ASCII character set's backslash and tilde respectively.
During some testing, I have verified that while the yen sign (U+00A5) properly becomes 0x5C, the overline (U+203E) becomes 0x3F (question mark) rather than the expected 0x7E.
While I am doing normal output with a StreamWriter to a file, below is minimal code to reproduce:
static void Test()
{
// Get Shift-JIS encoder.
var encoding = Encoding.GetEncoding("shift_jis");
// Declare overline (U+203E).
char c = (char) 0x203E;
// Get bytes when encoded as Shift-JIS.
var bytes = encoding.GetBytes(c.ToString());
// Expected 0x7E, but the value returned is 0x3F.
}
Is this behavior correct? I suppose I could subclass EncoderFallback, but this seems like far more work for something that I would have expected to work from the start.