
I am using a C# Script Task in SSIS to output ASCII characters. I am doing this because I am creating a file with packed fields. A packed field stores two digits in each byte, using a notation called Binary Coded Decimal.
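To make the packed format concrete, here is a minimal sketch of packing decimal digits two per byte. The names `BcdDemo` and `Pack` are hypothetical, the input is assumed to contain only the characters 0-9, and sign nibbles (as used by some packed-decimal formats) are deliberately ignored.

```csharp
using System;

public static class BcdDemo
{
    // Packs a string of decimal digits into bytes, two digits per byte,
    // with the first digit of each pair in the high nibble.
    // Assumption: 'digits' contains only the characters '0'..'9'.
    public static byte[] Pack(string digits)
    {
        // Left-pad with a zero so the digit count is even.
        if (digits.Length % 2 != 0)
        {
            digits = "0" + digits;
        }

        var packed = new byte[digits.Length / 2];
        for (int i = 0; i < packed.Length; i++)
        {
            int high = digits[2 * i] - '0';
            int low = digits[2 * i + 1] - '0';
            packed[i] = (byte)((high << 4) | low);
        }
        return packed;
    }

    public static void Main()
    {
        // "1234" becomes the two bytes 0x12 0x34.
        Console.WriteLine(BitConverter.ToString(Pack("1234"))); // prints 12-34
    }
}
```

Bytes produced this way can take any value from 0x00 to 0x99, so many of them fall outside the printable ASCII range.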

I have found that when I output a NUL (0x00) [Dec 0] followed by a Œ (0x8C) [Dec 140], an extra character (0xC2) is added between them. I can't figure out why this is happening. Does anyone have any ideas? Please see my code below:

string fileName;
System.IO.StreamWriter writer;

public override void PreExecute()
{
    base.PreExecute();

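    this.fileName = this.Variables.FilePath;
    // Assumed: the original snippet never shows the writer being created.
    // A StreamWriter constructed without an encoding defaults to UTF-8.
    this.writer = new System.IO.StreamWriter(this.fileName);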
}

public override void PostExecute()
{
    base.PostExecute();
    writer.Flush();
    writer.Close();
}

public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    writer.Write((char)00);
    writer.Write((char)140);

    writer.Write((char)13);
    writer.Write((char)10);
}

Output below:

[Screenshot: ExtraCharacter]

UPDATE: One thing I didn't mention is that I am passing hex values into the C# script and then writing the characters represented by those hex values to a file with fixed-length columns.

I don't know if this makes a difference, but I will also be writing other things that aren't packed values on the same lines as the packed values, which is why I am using the StreamWriter.

buzzzzjay
  • What encoding is your writer using? My guess here is that it's a UTF-8 escape sequence. – H H Feb 08 '12 at 15:23
  • Values in BCD will not consistently be within the ASCII range, why aren't you using a `BinaryWriter` for this? – M.Babcock Feb 08 '12 at 15:25

3 Answers


A StreamWriter is for writing text to a stream. It always uses an encoding, and if you don't specify one when you create it, it will use UTF-8 (without a byte order mark, BOM). The output you get is the UTF-8 encoder translating the text (in the form of individual characters) into UTF-8. The character (char)140 is U+008C, which is outside the ASCII range, so UTF-8 encodes it as the two bytes 0xC2 0x8C; that is where the extra 0xC2 comes from.
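The effect can be reproduced without SSIS. The following is a minimal sketch that writes the same two characters through a default StreamWriter into a MemoryStream and dumps the resulting bytes. The class and method names (`Utf8Demo`, `WriteWithDefaultEncoding`) are hypothetical and only serve the illustration.

```csharp
using System;
using System.IO;

public static class Utf8Demo
{
    // Writes NUL followed by (char)140 through a StreamWriter created
    // without an explicit encoding, and returns the bytes that reach the stream.
    public static byte[] WriteWithDefaultEncoding()
    {
        using (var stream = new MemoryStream())
        {
            // No encoding argument: the writer uses UTF-8 without a BOM.
            using (var writer = new StreamWriter(stream))
            {
                writer.Write((char)0);
                writer.Write((char)140);
            }
            // ToArray still works after the stream has been closed.
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // Two characters in, three bytes out.
        Console.WriteLine(BitConverter.ToString(WriteWithDefaultEncoding())); // prints 00-C2-8C
    }
}
```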

If you want to write bytes to a stream simply write to the stream directly using the Write method that accepts an array of bytes. If you want to write to a file you can create a FileStream and use that as the stream.

The naming of classes within the System.IO namespace can be confusing at times:

  • Stream is an abstract base class providing methods to read and write bytes
  • FileStream is a Stream that reads and writes to a file
  • BinaryWriter allows you to write primitive types in binary form to a Stream
  • TextWriter is an abstract base class that allows you to write text
  • StreamWriter is a TextWriter that allows you to write text to a Stream

You probably should use FileStream or BinaryWriter on top of a FileStream to solve your problem.
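As a rough sketch of that suggestion, the code below writes the two packed bytes and a CR LF pair through a BinaryWriter on top of a FileStream, so no text encoding is involved. `RawByteDemo` and `WriteRecord` are hypothetical names, and the temporary file path stands in for whatever the SSIS variable would supply.

```csharp
using System;
using System.IO;

public static class RawByteDemo
{
    // Writes 0x00, 0x8C and CR LF to the file as raw bytes.
    public static void WriteRecord(string path)
    {
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write))
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write((byte)0x00);                 // single-byte overload
            writer.Write((byte)0x8C);
            writer.Write(new byte[] { 0x0D, 0x0A });  // byte-array overload
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // placeholder location for the demo
        WriteRecord(path);
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes(path))); // prints 00-8C-0D-0A
        File.Delete(path);
    }
}
```

Casting to `byte` rather than `char` is what selects the overload that writes a single unencoded byte.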

Martin Liversage
  • Do you think you could provide a sample or a link to a place using a BinaryWriter on top of a FileStream? I have tried using a BinaryWriter as suggested, but it adds weird characters to any strings I output. – buzzzzjay Feb 08 '12 at 16:03
  • @buzzzzjay: `BinaryWriter` has overloads to write many primitive types to the stream including an overload to write strings. This overload will encode the strings (again using UTF-8) if you don't specify it yourself. In your case I think you should use the overload to write a single byte to the stream. – Martin Liversage Feb 08 '12 at 16:14
  • I have a BinaryWriter on top of a FileStream: `FileStream fs = File.Create(@"C:\buzzzzjay.txt"); UTF8Encoding utf8 = new UTF8Encoding(); BinaryWriter bw = new BinaryWriter(fs, utf8); bw.Write("String Test");` It writes the string to the file as expected, but also adds a (VT) 0x0B byte at the beginning of the string. So "(VT)String Test" is in the file. What am I doing wrong? – buzzzzjay Feb 08 '12 at 17:07
  • Never mind, I figured it out. I had to convert the string to a byte[] array so that it wouldn't generate special characters. Thanks! – buzzzzjay Feb 08 '12 at 17:25
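The 0x0B seen in the comments above is the length prefix that `BinaryWriter.Write(string)` emits before the encoded text ("String Test" is 11 characters, and 11 is 0x0B). The sketch below contrasts that overload with writing a byte array. The names `StringPrefixDemo`, `WriteAsString` and `WriteAsBytes` are hypothetical, and the use of `Encoding.ASCII` for the conversion is an assumption, since the comment does not say which encoding was used.

```csharp
using System;
using System.IO;
using System.Text;

public static class StringPrefixDemo
{
    // Uses the string overload: a length prefix is written before the text.
    public static byte[] WriteAsString(string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream, Encoding.UTF8))
            {
                writer.Write(text);
            }
            return stream.ToArray();
        }
    }

    // Converts the text to bytes first: only those bytes are written.
    // Assumption: the text is plain ASCII.
    public static byte[] WriteAsBytes(string text)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(Encoding.ASCII.GetBytes(text));
            }
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        Console.WriteLine(WriteAsString("String Test").Length); // prints 12
        Console.WriteLine(WriteAsString("String Test")[0]);     // prints 11
        Console.WriteLine(WriteAsBytes("String Test").Length);  // prints 11
    }
}
```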

You probably have not specified the correct encoding for your writer.

See: http://msdn.microsoft.com/en-us/library/72d9f8d5.aspx

and: http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

weston

It's an encoding issue. It shouldn't happen if you write *byte*s.

BinaryWriter writer = new BinaryWriter(someStream);
writer.Write((byte)123); // just an example! not a "that's how you should do it"

A better solution would be to select the proper encoding. But does the way your characters look in the file really matter?

haiyyu
  • Yes, it does matter what characters are in the file. The file I am generating has fixed lengths, and adding an extra character causes the fields to have incorrect values and to be longer than they should be. – buzzzzjay Feb 08 '12 at 15:27
  • Then using a charset with variable character length is not a good choice. One approach would be to use the ASCII charset, which uses exactly 7 bits for every character (last one is always 0). If you want to use additional characters which are not supported, you can add another 128 using the 8th bit. – haiyyu Feb 08 '12 at 15:29
  • Writing a byte like you have shown doesn't actually output an ASCII Character for the value 123. It just writes 123 to the file. This will cause the values within the file to be incorrect. – buzzzzjay Feb 08 '12 at 15:31
  • I suppose you should be using a BinaryWriter then. I'll edit my answer. Give me one minute. // Done - But still, you should not use that code for your purpose. Specify the proper encoding and you shouldn't have any more problems. – haiyyu Feb 08 '12 at 15:31
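For the "specify the proper encoding" route discussed in these comments, one possible sketch is to give the StreamWriter a single-byte encoding. ISO-8859-1 is used here as an assumption, because it maps each character from 0 to 255 to the byte with the same value; whether it suits the rest of the file is not something the thread settles. Plain ASCII is shown for contrast, since it cannot represent character 140. `SingleByteEncodingDemo` and `WriteWith` are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class SingleByteEncodingDemo
{
    // Writes NUL followed by (char)140 using the given encoding
    // and returns the bytes that reach the stream.
    public static byte[] WriteWith(Encoding encoding)
    {
        using (var stream = new MemoryStream())
        {
            using (var writer = new StreamWriter(stream, encoding))
            {
                writer.Write((char)0);
                writer.Write((char)140);
            }
            return stream.ToArray();
        }
    }

    public static void Main()
    {
        // One byte per character, value preserved.
        Encoding latin1 = Encoding.GetEncoding("iso-8859-1");
        Console.WriteLine(BitConverter.ToString(WriteWith(latin1)));         // prints 00-8C

        // ASCII only covers 0..127, so 140 is replaced by '?' (0x3F).
        Console.WriteLine(BitConverter.ToString(WriteWith(Encoding.ASCII))); // prints 00-3F
    }
}
```

This keeps the StreamWriter, which may be convenient when ordinary text is written on the same lines as the packed values.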