
I opened and manipulated a .csv file that contains Cyrillic script. When I try to open it and save it as a .xlsx file, I get an error. When I save the new .csv file and open it, the Cyrillic text (Ангел, Димитър, Мария, etc.) turns into random characters and is practically unreadable. You can see what I get as a result.

What should I do?

Zozo
  • Does this answer your question? [How to write Russian characters in file?](https://stackoverflow.com/questions/3198765/how-to-write-russian-characters-in-file) – zmike Jun 30 '20 at 19:39
  • This very much looks like an encoding problem. I can't give you a straight answer, but I certainly would look in the documentation for how to handle and preserve UTF-8 encoding. By the way, it certainly helps if you post your code. Other people can then try to reproduce your (erroneous) result. – Ronald Jun 30 '20 at 19:40
  • Use `encoding='utf-8-sig'` for the CSV at least if you are viewing the file in Excel. Show your code! – Mark Tolonen Jun 30 '20 at 21:29

1 Answer


With the code below, both output files opened correctly in Excel. Note that .to_excel() requires an additional Python package to write Excel files. I used pip install openpyxl.

input.csv:

Колонка1,Колонка2,Колонка3
Раз,два,три

Code:

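import pandas as pd

# utf-8-sig reads UTF-8 with or without a BOM; use whatever the actual encoding is
data = pd.read_csv('input.csv', encoding='utf-8-sig')
# utf-8-sig writes a BOM so that Excel detects the CSV as UTF-8
data.to_csv('output.csv', encoding='utf-8-sig')
# .xlsx stores text as Unicode; pandas 2.0+ no longer accepts an encoding argument here
data.to_excel('output.xlsx')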
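The read step depends on knowing the input file's real encoding. If it is unknown, one rough approach is to try a few likely candidates in order. The sketch below uses only the standard library; the helper name `guess_encoding` and the candidate list (UTF-8 first, then Windows-1251, a common legacy Cyrillic encoding) are assumptions for illustration, not part of pandas.

```python
def guess_encoding(raw: bytes, candidates=("utf-8-sig", "cp1251")) -> str:
    """Return the first candidate encoding that decodes `raw` without error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # this candidate does not fit, try the next one
        return name
    raise ValueError("none of the candidate encodings fit")


sample = "Раз,два,три\n"
print(guess_encoding(sample.encode("utf-8")))   # UTF-8 bytes decode with the first candidate
print(guess_encoding(sample.encode("cp1251")))  # invalid as UTF-8, so the second candidate is used
```

For a real file, the bytes could come from `open('input.csv', 'rb').read()`, and the returned name could then be passed as `encoding=` to `pd.read_csv`. Strict candidates should come first, because single-byte encodings such as cp1251 accept almost any input.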

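FYI, Excel did not display the .to_csv() output correctly with utf8 alone, because it relies on the byte-order mark (BOM) that utf-8-sig adds to recognize a CSV as UTF-8. The .to_excel() output was fine with utf8 as well.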
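To make the difference between the two encodings concrete, here is a small standard-library sketch. It shows that `utf-8-sig` only prepends a three-byte BOM to the same UTF-8 bytes, and simulates what happens when UTF-8 bytes are read with a legacy code page. Windows-1252 is an assumption here; the fallback Excel actually uses depends on the system locale.

```python
import codecs

text = "Раз,два,три\n"

plain = text.encode("utf-8")         # no BOM
with_bom = text.encode("utf-8-sig")  # same bytes, prefixed with the BOM

print(with_bom[:3] == codecs.BOM_UTF8)  # the first three bytes are the BOM
print(with_bom[3:] == plain)            # the rest is identical UTF-8

# Simulate a reader that assumes a legacy code page instead of UTF-8
# (cp1252 is an assumed example; errors="replace" avoids failing on undefined bytes).
garbled = plain.decode("cp1252", errors="replace")
print(garbled)  # unreadable characters instead of Cyrillic
```

Decoding `with_bom` with `utf-8-sig` strips the BOM again and returns the original text.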

Mark Tolonen