I have a .csv file containing some special characters. Notepad++'s built-in encoding detector reports it as ANSI, but running:

with open('acidentes-unchanged.csv') as f:
    print(f)

Prints "<_io.TextIOWrapper name='acidentes-unchanged.csv' mode='r' encoding='cp1252'>"

So, when I run:

import pandas as pd
    
df = pd.read_csv('acidentes-unchanged.csv', encoding='cp1252')
print(df)

It raises this error: "UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 322: character maps to <undefined>"

And when I try to read it using encoding='ANSI', some characters go missing...

1. What is the right encoding, and how can I know for sure? 2. What should I do to convert those characters to UTF-8 or something similar? I tried using a dictionary (assuming the .csv is ANSI), but with no success.

1 Answer

Try adding encoding='utf8' to the open() call, like this:

with open('acidentes-unchanged.csv', encoding='utf8')...
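
To check whether UTF-8 is really the right guess, one option is to decode the raw bytes strictly with each candidate encoding and see which one succeeds. The sketch below is only an illustration: the helper names first_working_encoding and convert_to_utf8, the candidate list, and the sample text are assumptions, not anything taken from the question. It relies on the fact that byte 0x8d is undefined in cp1252 but is a valid continuation byte in UTF-8 (for example, "Í" is encoded as 0xC3 0x8D), which would be consistent with the error in the question.

```python
from pathlib import Path


def first_working_encoding(raw, candidates=("utf-8", "cp1252")):
    """Return the first candidate that decodes `raw` without errors, else None.

    Hypothetical helper. A clean decode is evidence, not proof: single-byte
    encodings such as latin-1 accept every byte, so they are left out here.
    """
    for name in candidates:
        try:
            raw.decode(name)  # strict mode: raises on any invalid byte
        except UnicodeDecodeError:
            continue
        return name
    return None


def convert_to_utf8(src, dst, encoding):
    """Re-encode the file at `src` from `encoding` to UTF-8 and write it to `dst`.

    Works on bytes so that line endings are preserved exactly.
    """
    text = Path(src).read_bytes().decode(encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Made-up sample: UTF-8 bytes for a word containing "Í" (0xC3 0x8D).
sample = "VÍTIMA".encode("utf-8")
print(first_working_encoding(sample))               # utf-8
print(first_working_encoding(sample, ("cp1252",)))  # None, 0x8d is undefined in cp1252
```

For the real file, something like first_working_encoding(Path('acidentes-unchanged.csv').read_bytes()) would run the same check. If it reports utf-8, passing encoding='utf-8' to pd.read_csv should be enough and no conversion is needed; convert_to_utf8 is only useful when the file turns out to be in some other encoding.
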
Drakax