import pandas as pd

# Path to the CSV file that fails to load
path = "/Milk_Papers_Estimated_Class.csv"
data = pd.read_csv(path)

I get the following error when I run the code above to read the .csv file:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe2 in position 504: invalid continuation byte.

I do not know why I am getting this error. Can anyone help me out with this?

Pedro Lobito
  • try `data = pd.read_csv("/Milk_Papers_Estimated_Class.csv", encoding="latin1")` or other encoding... – Pedro Lobito Mar 02 '20 at 20:39
  • Try this post: https://stackoverflow.com/questions/56453782/utf-8-codec-cant-decode-byte-0xe2-invalid-continuation-byte-error – Khaled Adrani Mar 02 '20 at 20:40
  • Pandas assumes the file is UTF-8 encoded. This automatically works with ASCII, which is a subset of UTF-8. But apparently this file has some other encoding, like perhaps Latin-1, UTF-16, or a Windows code page. Do you know how the file is encoded? – tdelaney Mar 02 '20 at 20:41

1 Answer


By default, read_csv decodes the file as UTF-8.

data = pd.read_csv("/Milk_Papers_Estimated_Class.csv", encoding='latin-1')

Try passing the encoding as latin-1 as shown above. It might work :)
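If the real encoding of the file is unknown, one option is to try a few likely candidates in order. The sketch below is only an illustration: `read_csv_with_fallback` is a hypothetical helper name, and the candidate list (UTF-8, then Windows-1252, then Latin-1) is an assumption about which encodings are plausible for this file, not something confirmed by the question.

```python
import pandas as pd


def read_csv_with_fallback(path, encodings=("utf-8", "cp1252", "latin-1")):
    """Hypothetical helper: try each encoding in turn.

    Returns a (DataFrame, encoding_used) tuple for the first encoding
    that decodes the file without raising UnicodeDecodeError.
    """
    for enc in encodings:
        try:
            return pd.read_csv(path, encoding=enc), enc
        except UnicodeDecodeError:
            # This encoding cannot decode the file; try the next one.
            continue
    raise ValueError(f"none of {encodings} could decode {path!r}")
```

For example, `data, used = read_csv_with_fallback("/Milk_Papers_Estimated_Class.csv")`. Note that latin-1 maps every possible byte to a character, so it never raises a decoding error. A successful read with latin-1 therefore does not prove the file is really Latin-1, and it is worth checking the loaded text for garbled characters.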

Srivatsav Raghu