I'm trying to read a CSV file containing non-ASCII characters (French accents at the moment, but Russian as well in the future). Is there a way to read these CSVs without removing or replacing those characters?
Whenever I try:
pd.read_csv('filename.csv', encoding='utf-8')
it fails to find any columns.
So I tried this:
import codecs
# Python 2; no encoding is passed to codecs.open here
with codecs.open('filename.csv', 'r') as f:
    for line in f.readlines():
        print line
It just outputs [Decode error - output not utf-8] for the lines that contain accented characters (á, é, í, etc.).
I have also tried the suggestion below to detect the file's encoding. When I read the file with the detected encoding, I get UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)
or something similar.
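Since u'\ufeff' is the byte order mark (BOM), my guess is that the file is UTF-8 with a BOM at the start, though I have not confirmed that. Here is a minimal sketch of what I think is going on. The sample bytes and the temporary sample.csv path are made up for illustration and are not my real data:

```python
# -*- coding: utf-8 -*-
import codecs
import io
import os
import tempfile

# Hypothetical stand-in for the real file: a UTF-8 BOM followed by a
# header row and one row containing accented characters.
sample = codecs.BOM_UTF8 + u"name,city\nAndré,Orléans\n".encode("utf-8")

path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "wb") as f:
    f.write(sample)

# Decoding as plain 'utf-8' keeps the BOM as a leading u'\ufeff' character.
with io.open(path, encoding="utf-8") as f:
    plain_header = f.readline()

# Decoding as 'utf-8-sig' strips the BOM if one is present.
with io.open(path, encoding="utf-8-sig") as f:
    sig_header = f.readline()

print(repr(plain_header))
print(repr(sig_header))
```

If my real file behaves like this sample, I assume the same encoding name could be passed to pandas as pd.read_csv('filename.csv', encoding='utf-8-sig'), but I have not been able to verify that on the actual data.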
Any ideas? Thanks in advance.