I have the following code:

using (BinaryReader br = new BinaryReader(
       File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
    int pos = 0;
    int length = (int) br.BaseStream.Length;
    byte[] b = new byte[length]; // holds every byte of the file

    while (pos < length)
    {
        b[pos] = br.ReadByte();
        pos++;
    }

    pos = 0;
    while (pos < length)
    {
        Console.WriteLine(Convert.ToString(b[pos]));
        pos++;
    }
}

FILE_PATH is a const string that contains the path to the binary file being read. The binary file is a mixture of integers and characters. The integers are 1 byte each and each character is written to the file as 2 bytes.

For example, the file has the following data:

1HELLO HOW ARE YOU45YOU ARE LOOKING GREAT //and so on

Please note: Each integer is associated with the string of characters following it. So 1 is associated with "HELLO HOW ARE YOU" and 45 with "YOU ARE LOOKING GREAT" and so on.

Now the binary is written (I do not know why but I have to live with this) such that '1' will take only 1 byte while 'H' (and other characters) take 2 bytes each.

So here is what the file actually contains:

0100480045... and so on.

Here's the breakdown:

01 is the first byte, for the integer 1
0048 are the 2 bytes for 'H' (H is 0x48)
0045 are the 2 bytes for 'E' (E is 0x45)

and so on. I want my console to print the contents of this file in a human-readable format. That is, I want it to print "1 HELLO HOW ARE YOU" and then "45 YOU ARE LOOKING GREAT" and so on.
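To make the layout described above concrete, here is a minimal sketch that builds one record in that format: a single id byte followed by the text with 2 bytes per character, high byte first. The class name SampleData and the helper BuildRecord are hypothetical names used only for illustration; they are not part of the original code.

```csharp
using System;
using System.Text;

static class SampleData
{
    // Hypothetical helper: one id byte followed by the text,
    // two bytes per character with the high byte first (00 48 for 'H').
    public static byte[] BuildRecord(byte id, string text)
    {
        byte[] chars = Encoding.BigEndianUnicode.GetBytes(text);
        byte[] record = new byte[1 + chars.Length];
        record[0] = id;
        Array.Copy(chars, 0, record, 1, chars.Length);
        return record;
    }

    static void Main()
    {
        // Prints 01-00-48-00-45, matching the breakdown above.
        Console.WriteLine(BitConverter.ToString(BuildRecord(1, "HE")));
    }
}
```

Bytes produced this way can be written to a scratch file to try out a reader without touching the real data.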

Is what I am doing correct? Is there an easier or more efficient way? My line Console.WriteLine(Convert.ToString(b[pos])); only prints the integer value of each byte, not the actual character I want. That is fine for the integers in the file, but how do I read out the characters?

Any help would be much appreciated. Thanks

Alfred Myers
zack

3 Answers

8

I think what you are looking for is Encoding.GetString.

Since your string data is composed of 2-byte characters, you can get your string out like this:

for (int i = 0; i < b.Length; i++)
{
  byte curByte = b[i];

  // Assuming that the first byte of a 2-byte character sequence will be 0
  if (curByte != 0)
  { 
    // This is a 1 byte number
    Console.WriteLine(Convert.ToString(curByte));
  }
  else
  { 
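    // This is a 2-byte character stored high byte first (00 48 for 'H'),
    // so decode it as big-endian UTF-16. Encoding.Unicode is little-endian
    // and would read 00 48 as U+4800 instead of 'H'.
    Console.WriteLine(Encoding.BigEndianUnicode.GetString(b, i, 2));

    // We consumed the next byte as well, no need to deal with it
    //  in the next round of the loop.
    i++;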
  }
}
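The loop above prints one value per line. If the goal is output such as "1 HELLO HOW ARE YOU", the same zero-byte test can be used to group each id with the text that follows it. The following is only a sketch under the same assumptions: every id byte is non-zero, and every character is a 00 xx pair stored high byte first. RecordDecoder and Decode are hypothetical names, not an existing API.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class RecordDecoder
{
    // Hypothetical helper: turns the raw bytes into "<id> <text>" lines.
    // Assumes id bytes are never 0 and characters are big-endian 00 xx pairs.
    public static List<string> Decode(byte[] data)
    {
        var lines = new List<string>();
        StringBuilder current = null;
        int i = 0;

        while (i < data.Length)
        {
            if (data[i] != 0)
            {
                // Non-zero byte: a one-byte id that starts a new record.
                if (current != null) lines.Add(current.ToString());
                current = new StringBuilder(data[i].ToString()).Append(' ');
                i++;
            }
            else
            {
                // Zero byte: high byte of a 2-byte character.
                if (i + 1 >= data.Length) break; // truncated pair at end of input
                if (current == null) current = new StringBuilder();
                current.Append(Encoding.BigEndianUnicode.GetString(data, i, 2));
                i += 2;
            }
        }

        if (current != null) lines.Add(current.ToString());
        return lines;
    }

    static void Main()
    {
        // 01 "HI" followed by 2D (45) "OK"
        byte[] sample = { 0x01, 0x00, 0x48, 0x00, 0x49, 0x2D, 0x00, 0x4F, 0x00, 0x4B };
        foreach (string line in Decode(sample))
            Console.WriteLine(line); // prints "1 HI" then "45 OK"
    }
}
```

With the real file, the input could come from File.ReadAllBytes(FILE_PATH). Note that an id of 0, or a character above 0xFF, would break the zero-byte heuristic.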
paracycle
  • You'll need to read the first "id" byte separately, then translate the rest of the bytes using the proper encoding. – tvanfosson Aug 20 '09 at 21:50
  • Oh, I missed that bit of the question. I will edit my answer. – paracycle Aug 20 '09 at 21:58
  • How does the code determine where the first string ends? Without that information, you won't know when to search for the next number. – Alfred Myers Aug 20 '09 at 22:15
  • It doesn't, it reads the array byte by byte until it hits a 0 byte which it assumes to be the start of a 2 byte character sequence. After that it consumes the next 2 bytes and checks the next byte to see if it is also the first byte of a 2 byte character sequence, if not it assumes it is an integer and so on. – paracycle Aug 20 '09 at 22:26
  • Oh yeah... Now I see... When reading the code I skipped the (b, i, 2) part. That'll work as long as he doesn't have any characters above 0xFF, which is reasonable to infer given the example. +1 for you. – Alfred Myers Aug 20 '09 at 22:32
  • Thanks; and, yes there is a little assumption going on given the info we have been supplied but I tried to make the assumptions as obvious as I can in the answer. – paracycle Aug 20 '09 at 22:36
  • @Paracycle: Thanks. Your code seems to work fine with a few modifications here and there, but no change in the logic. Thanks a lot, man – zack Aug 21 '09 at 20:16
2

You can use String System.Text.UnicodeEncoding.GetString() which takes a byte[] array and produces a string.

I found this link very useful

Note that this is not the same as just blindly copying the bytes from the byte[] array into a hunk of memory and calling it a string. The GetString() method must validate the bytes and forbid invalid surrogates, for example.
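One detail worth checking with GetString() is byte order. A small sketch, assuming the file stores 'H' as 00 48 as described in the question: Encoding.Unicode is little-endian UTF-16, while Encoding.BigEndianUnicode reads the high byte first. The class and method names here are hypothetical.

```csharp
using System;
using System.Text;

static class ByteOrderDemo
{
    // Decodes a 2-byte pair as big-endian UTF-16 (high byte first).
    public static string AsBigEndian(byte[] pair)
    {
        return Encoding.BigEndianUnicode.GetString(pair);
    }

    // Decodes the same pair as little-endian UTF-16 (Encoding.Unicode).
    public static string AsLittleEndian(byte[] pair)
    {
        return Encoding.Unicode.GetString(pair);
    }

    static void Main()
    {
        byte[] h = { 0x00, 0x48 };
        Console.WriteLine(AsBigEndian(h));                             // prints H
        Console.WriteLine(((int)AsLittleEndian(h)[0]).ToString("X4")); // prints 4800
    }
}
```

If the decoded text comes out as unrelated CJK-looking characters, swapping the encoding is the first thing to try.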

Jacob Seleznev
  • You really ought to add a summary so your answer can stand on its own. It's not my downvote, but I can certainly understand why someone thought it wasn't helpful. – tvanfosson Aug 20 '09 at 21:51
0
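using (BinaryReader br = new BinaryReader(File.Open(FILE_PATH, FileMode.Open, FileAccess.ReadWrite)))
{
   int length = (int)br.BaseStream.Length;

   // Worst case: every input byte is a 1-byte integer that expands to 2 bytes.
   byte[] buffer = new byte[length * 2];
   int bufferPosition = 0;
   int pos = 0;

   while (pos < length)
   {
       byte b = br.ReadByte();
       if (b != 0 && b < 10)
       {
          // A 1-byte integer from 1 to 9: store it as the 2-byte digit character.
          // Larger integers are not handled by this approach.
          buffer[bufferPosition] = 0;
          buffer[bufferPosition + 1] = (byte)(b + 0x30);
          pos++;
       }
       else
       {
          // A 2-byte character: copy both bytes unchanged (high byte first).
          buffer[bufferPosition] = b;
          buffer[bufferPosition + 1] = br.ReadByte();
          pos += 2;
       }
       bufferPosition += 2;
   }

   // The buffer holds high-byte-first pairs, so decode it as big-endian UTF-16.
   Console.WriteLine(System.Text.Encoding.BigEndianUnicode.GetString(buffer, 0, bufferPosition));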

}

LorenVS
  • I am getting the following compiler error when I try using your code, at the line buffer[bufferPosition + 1] = b + 0x30; : error CS0266: Cannot implicitly convert type 'int' to 'byte'. An explicit conversion exists (are you missing a cast?) – zack Aug 21 '09 at 01:05
  • Also, I checked the value of the length variable. It already includes the count of the zeros, so I don't think there's a need to multiply it by 2 initially as you have done. – zack Aug 21 '09 at 01:11
  • Sorry, I forgot the cast; that line should be buffer[bufferPosition + 1] = (byte)(b + 0x30); You do, however, need to multiply the buffer length by 2, as the overall size of the array could double if the entire input is integers – LorenVS Aug 21 '09 at 14:08