I want to write to a CSV file using Python 3's csv module. However, I did not find any documentation that tells me how to pass an encoding argument.

My code:

for item in list_documents:
    print("The item is: ", item)
    wb = openpyxl.load_workbook(path+item)
    sh = wb.get_active_sheet()
    split_item = item.split(".")[0]
    new_name = str(split_item) + ".csv"
    with open(path + new_name, 'w', newline="") as f:
        c = csv.writer(f, delimiter=";")
        counter = 0
        for r in sh.rows:
            counter += 1
            print(counter)
            c.writerow([cell.value for cell in r])

My code reads rows from an xlsx file and writes them to a CSV file. I cannot find a way to tell csv.writer that I want UTF-8 encoding.

The error message:

Traceback (most recent call last):
  File "C:/Users/aprofir/Desktop/python_project/transform_data/xlsx_to_csv.py", line 31, in <module>
    c.writerow([cell.value for cell in r])
  File "C:\Users\aprofir\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u0142' in position 173: character maps to <undefined>

As I understand it, the character \u0142 is the Polish letter ł. Is there a way around this? I cannot delete or alter the data.

martineau
Adrian
  • Have you tried `with open(path + new_name, 'w', newline="", encoding="utf8")`? – Jean-François Fabre May 16 '19 at 20:01
  • If you use UTF-8 as the encoding and want to open the CSV in Excel afterward, use `encoding='utf-8-sig'` instead; otherwise, Excel will assume the CSV is ANSI-encoded (a localized encoding, typically `cp1252` on US Windows). – Mark Tolonen May 17 '19 at 09:43
  • I suggest using `open(os.path.join(path, new_name), ...)` rather than concatenating file names with the plus operator. – picmate 涅 Jan 07 '21 at 00:31
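
The two suggestions in the comments above can be combined in a small sketch. The temporary directory, the file name `sample.csv`, and the sample row are made up for illustration and are not from the question. It writes with `encoding='utf-8-sig'`, builds the path with `os.path.join`, and then reads the raw bytes back to confirm that the file starts with the UTF-8 byte order mark, which is the marker Excel is said to rely on.

```python
import csv
import os
import tempfile

# Hypothetical output location; the question uses its own `path` and `new_name`.
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "sample.csv")

# 'utf-8-sig' writes a byte order mark (BOM) before the UTF-8 data.
with open(csv_path, "w", newline="", encoding="utf-8-sig") as f:
    csv.writer(f, delimiter=";").writerow(["Łódź", "ł"])

# Read the raw bytes to check what was actually written.
with open(csv_path, "rb") as f:
    raw = f.read()

has_bom = raw.startswith(b"\xef\xbb\xbf")
print(has_bom)
```

If the file is only consumed by other programs and never opened in Excel, plain `utf-8` avoids the extra three bytes at the start of the file.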

1 Answer

The csv.writer object itself takes no encoding argument; it writes to whatever file object you give it. Specify the encoding when you open the file:

with open(path + new_name, 'w', newline="", encoding='utf-8') as f:
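
As a minimal, self-contained sketch of the same idea (the sample rows, the helper names `write_rows` and `read_rows`, and the temporary file are assumptions for illustration, not part of the question's code), the following writes rows containing ł with an explicit encoding and reads them back to check that nothing was lost:

```python
import csv
import os
import tempfile

# Made-up sample data containing the character from the traceback.
rows = [["name", "city"], ["Paweł", "Łódź"]]


def write_rows(csv_path, rows):
    """Write rows as a semicolon-delimited CSV file encoded as UTF-8."""
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter=";").writerows(rows)


def read_rows(csv_path):
    """Read the file back with the same encoding and delimiter."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f, delimiter=";"))


csv_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
write_rows(csv_path, rows)
round_trip = read_rows(csv_path)
print(round_trip == rows)
```

Without the `encoding` argument, `open` falls back to the platform's preferred encoding, which on the machine in the traceback was cp1252, and cp1252 has no mapping for ł.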
sanyassh