I am generating a CSV file that contains double-byte characters. Right now I am using a BinaryWriter with UTF-8 encoding. The problem is that the generated CSV file has a BOM prefix (preamble). How can I remove the preamble?

I write the preamble through the BinaryWriter because without it the file still shows unrecognized characters instead of the double-byte characters.

I tried a different encoding constructor, for example: Dim encoding As New System.Text.UTF8Encoding(False), which didn't work.

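   ' UTF-8 encoding; Encoding.UTF8 reports a three-byte preamble (BOM)
   Dim encoding As Encoding = Encoding.UTF8
   Using bw As New BinaryWriter(fs, encoding)
       ' Write the BOM by hand, then the CSV payload
       bw.Write(encoding.GetPreamble())
       bw.Write(data)
       bw.Flush()

       Using n As New Ionic.Zip.ZipFile(encoding)
           n.CompressionLevel = Ionic.Zlib.CompressionLevel.BestCompression
           ' Rewind so the zip entry is read from the start of the stream
           fs.Position = 0
           n.AddEntry(fileName, fs)

           Response.Clear()
           Response.ContentType = "application/octet-stream"

           ' Stream the archive to the HTTP response
           n.Save(Response.OutputStream)
       End Using
       ' End Using closes and disposes bw; no explicit Close/Dispose needed
   End Using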

This code generates a correct CSV file inside the Zip file, with the double-byte characters intact, but with the preamble as a prefix of the whole data, like: "���"

I want to remove this unrecognized prefix, but when I remove the line bw.Write(encoding.GetPreamble()), the encoding is lost entirely: the double-byte characters are no longer recognized, and the prefix is still there.

Changing the encoding constructor to: Dim encoding As New System.Text.UTF8Encoding(False) broke the encoding as well.
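
To make the "prefix is still there" observation easier to reproduce, here is a minimal sketch of how BinaryWriter treats a String versus a Byte(). It assumes data is a String, which the snippet above does not show. The module name, the Capture helper and the sample row are hypothetical and exist only for illustration. BinaryWriter.Write(String) is documented to write a length-prefixed string, so it may be one source of leading bytes that are not the BOM.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module BinaryWriterPrefixDemo
    ' Hypothetical helper: runs one write action and returns the resulting bytes.
    Function Capture(write As Action(Of BinaryWriter)) As Byte()
        Using ms As New MemoryStream()
            Using bw As New BinaryWriter(ms, New UTF8Encoding(False))
                write(bw)
                bw.Flush()
                Return ms.ToArray()
            End Using
        End Using
    End Function

    Sub Main()
        ' Hypothetical sample row with double-byte characters.
        Dim data As String = "名前,値"

        ' Write(String) puts a 7-bit-encoded byte count in front of the text.
        Dim asString As Byte() = Capture(Sub(bw) bw.Write(data))

        ' Write(Byte()) emits only the bytes it is given.
        Dim asBytes As Byte() = Capture(Sub(bw) bw.Write(Encoding.UTF8.GetBytes(data)))

        Console.WriteLine(BitConverter.ToString(asString))
        Console.WriteLine(BitConverter.ToString(asBytes))
    End Sub
End Module
```

The first line printed is one byte longer than the second. That extra leading byte is the length prefix, and it is present even though no preamble is written.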

theduck
  • Try not writing GetPreamble yourself and just test with different UTF8Encoding constructor values. Compare the bytes of the output (for which it might be easier to write to a file instead of a zip stream). BTW, Unicode does not recommend a BOM with UTF-8, but it is a common practice with CSV files that people want to open with Excel while having it guess the character encoding (instead of going the Import Text route and saying what the "file origin" is). – Tom Blodget Sep 30 '19 at 10:29
  • Alternative: Write out an xlsx file instead of a zipped CSV file. Row, column and value formatting is done by a library, it's zipped by definition and no one has to communicate and understand which character encoding is being used and all the other CSV format metadata. – Tom Blodget Sep 30 '19 at 10:32
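
The byte comparison suggested in the first comment could look roughly like the sketch below. It swaps in a StreamWriter over a MemoryStream rather than the BinaryWriter and FileStream from the question, and the module name, the WriteCsv helper and the sample row are hypothetical. A StreamWriter emits the encoding's preamble itself when it starts writing at the beginning of a stream, so nothing calls GetPreamble by hand.

```vbnet
Imports System
Imports System.IO
Imports System.Text

Module PreambleCompareDemo
    ' Hypothetical helper: encodes text through a StreamWriter and returns the raw bytes.
    Function WriteCsv(text As String, emitBom As Boolean) As Byte()
        Using ms As New MemoryStream()
            Using sw As New StreamWriter(ms, New UTF8Encoding(emitBom))
                sw.Write(text)
            End Using ' disposing the writer flushes it and closes ms
            ' MemoryStream.ToArray still works after the stream has been closed.
            Return ms.ToArray()
        End Using
    End Function

    Sub Main()
        ' Hypothetical sample row with double-byte characters.
        Dim row As String = "id,名前"

        ' UTF8Encoding(True): output starts with the BOM bytes EF-BB-BF.
        Console.WriteLine(BitConverter.ToString(WriteCsv(row, True)))

        ' UTF8Encoding(False): output starts directly with the text bytes.
        Console.WriteLine(BitConverter.ToString(WriteCsv(row, False)))
    End Sub
End Module
```

If the two dumps differ only by the leading EF-BB-BF, the constructor argument is doing its job, and any other stray prefix would have to come from somewhere else in the pipeline.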
