5

I'm trying to write a CSV file containing non-ASCII characters using Python 3.

import csv

with open('sample.csv', 'w', newline='', encoding='utf-8') as csvfile:
    spamwriter = csv.writer(csvfile, delimiter=' ',
                            quotechar='|', quoting=csv.QUOTE_MINIMAL)
    spamwriter.writerow(["嗨"])  # writerow expects a sequence of fields

When I open the file in Excel, I see å—¨ instead. Am I doing something wrong here?

user1187968
  • Could the issue be with Excel not recognizing the encoding? Can you try following these instructions: https://www.itg.ias.edu/content/how-import-csv-file-uses-utf-8-character-encoding-0 – ryachza Oct 03 '17 at 19:31

1 Answer

9

You need to indicate to Excel that this is a UTF-8 file; it won't assume so automatically.

You do this by putting a Byte Order Mark (BOM) at the start of the file:

with open('sample.csv', 'w', newline='', encoding='utf-8') as csvfile:
    csvfile.write('\ufeff')
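
For completeness, here is a minimal sketch of the whole write, BOM first and rows after. The function name `write_csv_with_bom`, the temp-file path, and the default comma delimiter are my assumptions, not part of the question's code; the check at the end only confirms that the file starts with the UTF-8 BOM bytes.

```python
import csv
import os
import tempfile

def write_csv_with_bom(path, rows):
    """Write rows as a UTF-8 CSV, prefixed with a BOM so Excel detects the encoding."""
    with open(path, 'w', newline='', encoding='utf-8') as csvfile:
        csvfile.write('\ufeff')  # U+FEFF is encoded as the bytes EF BB BF in UTF-8
        writer = csv.writer(csvfile)  # default dialect: comma-separated (an assumption)
        writer.writerows(rows)

# Hypothetical output location, chosen only for illustration
path = os.path.join(tempfile.gettempdir(), 'sample_bom.csv')
write_csv_with_bom(path, [['嗨']])

# Read the file back as raw bytes to inspect what was actually written
with open(path, 'rb') as f:
    raw = f.read()

print(raw[:3] == b'\xef\xbb\xbf')
```

The BOM has to be the very first thing written, before any row, otherwise it ends up in the middle of the data.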
Mark Ransom
  • If I don't want to write to a file, but I want to write to a buffer, how should I do it? Can you show example? – user1187968 Oct 03 '17 at 20:09
  • 1
    @user1187968 since your question didn't show writing to a buffer, I'm unsure how you would use `csv` to do that. – Mark Ransom Oct 03 '17 at 20:18
  • 5
    There is also `encoding='utf_8_sig'`, which inserts the BOM automatically. This is nice from a programmer's point of view, but maybe less so for an encoding purist (because UTF-8 doesn't actually need the BOM like UTF-16 does). – lenz Oct 05 '17 at 21:22
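
A rough sketch of the two ideas raised in the comments above: letting the `utf-8-sig` codec add the BOM, and producing the CSV in memory instead of in a file. The helper names `write_csv_file` and `csv_to_bytes`, the sample row, and the temp-file path are hypothetical; the in-memory version assumes it is fine to build the whole CSV as a string before encoding it.

```python
import csv
import io
import os
import tempfile

def write_csv_file(path, rows):
    # 'utf-8-sig' (alias 'utf_8_sig') writes the BOM automatically at the start of the file
    with open(path, 'w', newline='', encoding='utf-8-sig') as csvfile:
        csv.writer(csvfile).writerows(rows)

def csv_to_bytes(rows):
    # Build the CSV text in memory, then encode it; the codec prepends the BOM
    buffer = io.StringIO(newline='')
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue().encode('utf-8-sig')

rows = [['嗨', 'hello']]  # sample data, for illustration only

# Hypothetical output location
path = os.path.join(tempfile.gettempdir(), 'sample_sig.csv')
write_csv_file(path, rows)
with open(path, 'rb') as f:
    file_bytes = f.read()

buffer_bytes = csv_to_bytes(rows)
print(file_bytes == buffer_bytes)
```

The bytes from `csv_to_bytes` can then be handed to whatever expects the file contents, for example an HTTP response body.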