Binary Writer/Reader extra character

Question

I am converting some legacy VB6 code to C#, and this has me a little baffled. The VB6 code wrote certain data sequentially to a file, and each record is always 110 bytes. I can read such a file just fine in the converted code, but I'm having trouble when I write the file from the converted code.

Here is a stripped down sample I wrote real quick in LINQPad:

void Main()
{
  int[,] data = new[,]
  {
    {
      0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
    },
    {
      20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39
    }
  };

  using ( MemoryStream stream = new MemoryStream() )
  {
    using ( BinaryWriter writer = new BinaryWriter( stream, Encoding.ASCII, true ) )
    {
      for( var i = 0; i < 2; i++ )
      {
        byte[] name = Encoding.ASCII.GetBytes( "Blah" + i.ToString().PadRight( 30, ' ' ) );

        writer.Write( name );
        for( var x = 0; x < 20; x++ )
        {
          writer.Write( data[i,x] );
        }
      }
    }

    using ( BinaryReader reader = new BinaryReader( stream ) )
    {
      // Note the extra +4 is because of the problem below.
      reader.BaseStream.Seek( 30 + ( 20 * 4 ) + 4, SeekOrigin.Begin );

      string name = new string( reader.ReadChars(30) );
      Console.WriteLine( name );

      // This is the problem..This extra 4 bytes should not be here.
      //reader.ReadInt32();

      for( var x = 0; x < 20; x++ )
      {
        Console.WriteLine( reader.ReadInt32() );
      }
    }
  }
}

As you can see, a 30-character string is written first. The string is NEVER longer than 30 characters and is padded with spaces if it is shorter. After that, twenty 32-bit integers are written; it is always 20 integers. Each ASCII character in the string is one byte and a 32-bit integer is four bytes, so in my reader sample I should be able to seek 110 bytes ( 30 + (4 * 20) ), read 30 chars, then read 20 ints, and that's my data. However, for some reason, an extra 4 bytes are being written after the string.
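
The layout arithmetic described above can be written down as constants. This is only a minimal sketch of the intended record size and offsets; the names RecordLayout, NameBytes, ValueCount, RecordBytes and OffsetOf are hypothetical and do not appear in the original code.

```csharp
using System;

public static class RecordLayout
{
    // Sizes taken from the question: a 30-byte ASCII name
    // followed by twenty 32-bit integers.
    public const int NameBytes = 30;
    public const int ValueCount = 20;
    public const int RecordBytes = NameBytes + ValueCount * sizeof(int); // 30 + 80 = 110

    // Hypothetical helper: byte offset of the record at a zero-based index.
    public static long OffsetOf(int recordIndex)
    {
        return (long)recordIndex * RecordBytes;
    }

    public static void Main()
    {
        Console.WriteLine(RecordBytes);  // 110
        Console.WriteLine(OffsetOf(1));  // 110
        Console.WriteLine(OffsetOf(2));  // 220
    }
}
```

With this layout, the second record should start at byte 110, which is the seek position the reader would use without the extra + 4.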

Am I just missing something completely obvious (as is normally the case for me)? Strings aren't null-terminated in .NET, and this is four bytes anyway, not just one extra byte. So where are these extra 4 bytes coming from? I'm not calling Write(string) directly, so it can't be a length prefix, and it obviously isn't one anyway since the extra bytes come after my string. If you uncomment the ReadInt32(), it produces the desired result.

Now if you were to use a debugger and check the length of name you would see your error. — Philip Stuyck, Jul 08 '15 at 02:15

score 2 · Accepted Answer · answered Jul 08 '15 at 02:04

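The extra 4 bytes come from 4 extra characters you're writing. In "Blah" + i.ToString().PadRight( 30, ' ' ), PadRight applies only to i.ToString(), so the number is padded to 30 characters and "Blah" is then prepended, giving a 34-character name. Change the string you're encoding as ASCII to this: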

("Blah" + i.ToString()).PadRight(30, ' ')

That is, pad the string after you've concatenated the prefix and the integer.
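
To make the difference visible, here is a small sketch that compares the encoded length of both expressions. The class and method names (PadDemo, Original, Corrected) are made up for illustration; only the two string expressions come from the question and the answer.

```csharp
using System;
using System.Text;

public static class PadDemo
{
    // Expression from the question: member access binds tighter than +,
    // so only the number is padded to 30 chars, then "Blah" is prepended.
    public static string Original(int i)
    {
        return "Blah" + i.ToString().PadRight(30, ' ');
    }

    // Expression from the answer: concatenate first, then pad the whole name.
    public static string Corrected(int i)
    {
        return ("Blah" + i.ToString()).PadRight(30, ' ');
    }

    public static void Main()
    {
        Console.WriteLine(Encoding.ASCII.GetBytes(Original(0)).Length);  // 34
        Console.WriteLine(Encoding.ASCII.GetBytes(Corrected(0)).Length); // 30
    }
}
```

The 4 surplus bytes are trailing spaces (0x20), which is why reading them with ReadInt32() appeared to realign the stream.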

Peter Duniho

Awesome, I knew it was something obvious. I've been staring at this for too long I think. Thanks! – Calix Jul 08 '15 at 02:15

score 0 · Answer 2 · answered Jul 08 '15 at 03:08

Your extra four bytes are whitespace: the padding doesn't account for the length of 'Blah', so you lose track of where you are in the stream. You think you're writing only 30 chars, but you really wrote 34.

I know you didn't ask this, but you're writing garbage data to a file that doesn't need to be there.

Instead of padding your string with whitespace, you should just include a header or pointer that indicates the length of the next field in your file.

For example, say you have a 120-byte file. The first 4 bytes of the file indicate that the length of the following string is 96 bytes. So you read 4 bytes, get the length, and then read 96 bytes. The next 4 bytes say that you have a string that's 16 bytes long, so you read the next 16 bytes and get your next string (4 + 96 + 4 + 16 = 120). This is pretty much how many well-defined binary protocols work.
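
A length-prefixed field like the one described could be sketched as follows. This is an illustration under assumptions, not the asker's format: the helper names WriteField and ReadField are hypothetical, and the prefix is written as a 4-byte Int32 to match the description above (BinaryWriter.Write(string) uses its own variable-length prefix instead).

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixDemo
{
    // Hypothetical helper: writes a 4-byte length, then the ASCII bytes.
    public static void WriteField(BinaryWriter writer, string text)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(text);
        writer.Write(bytes.Length); // Int32 length prefix
        writer.Write(bytes);        // payload, no padding
    }

    // Hypothetical helper: reads the 4-byte length, then that many bytes.
    public static string ReadField(BinaryReader reader)
    {
        int length = reader.ReadInt32();
        byte[] bytes = reader.ReadBytes(length);
        return Encoding.ASCII.GetString(bytes);
    }

    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream, Encoding.ASCII, true))
            {
                WriteField(writer, "Blah0");
                WriteField(writer, "Blah1");
            }

            stream.Position = 0; // rewind before reading
            using (var reader = new BinaryReader(stream, Encoding.ASCII, true))
            {
                Console.WriteLine(ReadField(reader)); // Blah0
                Console.WriteLine(ReadField(reader)); // Blah1
            }
        }
    }
}
```

Note that this changes the on-disk layout: records are no longer a fixed 110 bytes, so a file written this way would not match the fixed-size format the VB6 code produced, and a record can no longer be located by multiplying its index by a constant size.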
