I'm trying to convert a string to a byte[] using the ASCIIEncoding class in the .NET library. The string will never contain non-ASCII characters, but it will usually be longer than 16 characters. My code looks like the following:

public static byte[] Encode(string packet)
{
    // ASCIIEncoding lives in the System.Text namespace.
    ASCIIEncoding enc = new ASCIIEncoding();
    byte[] byteArray = enc.GetBytes(packet);
    return byteArray;
}

By the end of the method, the byte array should contain packet.Length bytes, but IntelliSense shows every element after byteArray[15] as a literal question mark that cannot be inspected. I sent byteArray and viewed it in Wireshark; it arrived on the other side intact, but the end device did not follow the instructions encoded in it. I'm wondering whether this has anything to do with IntelliSense not being able to display all elements of byteArray, or whether my packet is simply wrong.

Dmitry
dan-0
  • IntelliSense has drill-down; you should be able to completely verify your array. If need be, write a method to do it. – H H Aug 14 '14 at 18:41
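
A minimal sketch of the kind of verification method the comment suggests, assuming a hex dump of the whole array is enough to check it by eye. The class name `PacketDump`, the helper `ToHex`, and the sample command string are hypothetical and not part of the question.

```csharp
using System;
using System.Text;

public static class PacketDump
{
    // Hypothetical helper: renders every byte as two hex digits, separated by spaces.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", " ");
    }

    public static void Main()
    {
        // Made-up 18-character packet, longer than 16 and ending in a carriage return.
        byte[] bytes = Encoding.ASCII.GetBytes("SET MODE 1; START\r");
        Console.WriteLine(bytes.Length);
        Console.WriteLine(ToHex(bytes));
    }
}
```

Any byte that shows up as 3F where the string had no literal '?' would point to a character the encoder replaced.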

2 Answers

If your packet string contains characters anywhere in the range 0-255, then ASCIIEncoding is not what you should be using. ASCII only defines character codes 0-127; anything in the range 128-255 gets turned into a question mark (as you have observed) because those characters are not defined in ASCII.
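
To make the replacement behaviour concrete, here is a small sketch. The class name `AsciiFallbackDemo`, the helper `EncodeToHex`, and the sample string are made up for illustration; the only library calls are `Encoding.ASCII.GetBytes` and `BitConverter.ToString`.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Hypothetical helper: ASCII-encodes the text and returns the bytes as hex.
    public static string EncodeToHex(string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        return BitConverter.ToString(bytes);
    }

    public static void Main()
    {
        // '\u00E9' (233) is outside ASCII's 0-127 range; '\r' (13) is inside it.
        Console.WriteLine(EncodeToHex("caf\u00E9\r"));
        // prints 63-61-66-3F-0D (0x3F is '?')
    }
}
```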

Consider using a method like this to convert the string to a byte array. (This assumes that the ordinal value of each character is in the range 0-255 and that the ordinal value is what you want.)

public static byte[] ToOrdinalByteArray(this string str)
{
    if (str == null) { throw new ArgumentNullException("str"); }

    var bytes = new byte[str.Length];
    for (int i = 0; i < str.Length; ++i) {
        // Wrapping the cast in checked() will trigger an OverflowException
        // if the character being converted is out of range for a byte.
        bytes[i] = checked((byte)str[i]);
    }

    return bytes;
}
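
The checked cast is the part of the method above that does the range checking. As an isolated sketch (the class `CheckedCastDemo` and the helper `FitsInByte` are hypothetical names), it behaves like this:

```csharp
using System;

public static class CheckedCastDemo
{
    // Hypothetical helper: reports whether a char's ordinal fits in a byte,
    // using the same checked cast as the method above.
    public static bool FitsInByte(char c)
    {
        try
        {
            byte b = checked((byte)c);
            return b == c;
        }
        catch (OverflowException)
        {
            // Thrown at run time for ordinals above 255.
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(FitsInByte('\u00FF')); // True  (ordinal 255)
        Console.WriteLine(FitsInByte('\u0100')); // False (ordinal 256)
    }
}
```

Without `checked`, the same cast would silently truncate the value in a default (unchecked) build, which is the kind of quiet corruption the method is written to avoid.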

The Encoding class hierarchy is specifically designed for handling text. What you have here doesn't seem to be text, so you should avoid using these classes.

cdhowie
  • Is there anything special about the array being 16 elements long? I noticed that on byte arrays larger than 16, the question marks show up beyond the 16th element, regardless of what should appear there. The question marks never appear earlier in the array, only from that position on. Moreover, the last character in every packet is a carriage return. The carriage return's ASCII code is 13, much less than 127 and therefore within the range of ASCII encoding. That's why I think my array size has something to do with it, but I can't understand why. – dan-0 Aug 14 '14 at 21:51
  • @user2569316 No, the length of the array is irrelevant. ASCIIEncoding only turns characters that are not part of the ASCII set into question marks; here that would be the range 128-255 only. – cdhowie Aug 14 '14 at 22:01
  • When using (or abusing) System.String for binary data, I use CP437 because it's round-trippable. CP437 has exactly 256 codepoints and encodes each in one byte. It is also a superset of ASCII, encoding its ASCII range the same as ASCII would. And, of course, Unicode (the character set used by System.String) is a superset of CP437. – Tom Blodget Aug 15 '14 at 16:32

The standard encoders use the replacement character fallback strategy. If a character doesn't exist in the target character set, they encode a replacement character ('?' by default).

To me, that's worse than a silent failure; it's data corruption. I prefer that libraries tell me when my assumptions are wrong.

You can obtain an encoding that throws an exception instead:

// Requires: using System.Text;
Encoding strictAscii = Encoding.GetEncoding(
    "us-ascii",
    new EncoderExceptionFallback(),
    new DecoderExceptionFallback());

If you are truly using only characters in Unicode's ASCII range then you'll never see an exception.
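
As a hedged illustration of how such an encoding behaves, the sketch below wraps it in a hypothetical `TryEncode` helper (the class name `StrictAsciiDemo` and the sample strings are also made up). It assumes the built-in "us-ascii" encoding is available, which needs no extra encoding provider.

```csharp
using System;
using System.Text;

public static class StrictAsciiDemo
{
    // ASCII encoding configured to throw instead of substituting '?'.
    private static readonly Encoding StrictAscii = Encoding.GetEncoding(
        "us-ascii",
        new EncoderExceptionFallback(),
        new DecoderExceptionFallback());

    // Hypothetical helper: returns false if the text contains a non-ASCII character.
    public static bool TryEncode(string text, out byte[] bytes)
    {
        try
        {
            bytes = StrictAscii.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            bytes = null;
            return false;
        }
    }

    public static void Main()
    {
        byte[] bytes;
        Console.WriteLine(TryEncode("OK\r", out bytes));      // True
        Console.WriteLine(TryEncode("caf\u00E9", out bytes)); // False
    }
}
```

In real code it may be more useful to let the EncoderFallbackException propagate, since it identifies the offending character and its position.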

Tom Blodget