I am trying to import a CSV file that contains Chinese characters.
This command downloads the CSV file:
!wget -O wm.csv https://raw.githubusercontent.com/hierarchyJK/compare-LIBSVM-with-Linear-and-Gassian-Kernel/master/%E8%A5%BF%E7%93%9C3.0.csv
The repository is not mine, so I am not sure whether the file is encoded correctly.
What I can be sure of is that it renders correctly.
This code:
import pandas as pd
pd.read_csv('wm.csv', encoding='utf-8')
raises this error:
'utf-8' codec can't decode byte 0xb1 in position 0: invalid start byte
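For context on what the message means, here is a minimal sketch that needs only the byte value quoted in the error, not the file itself. In UTF-8, bytes of the form 10xxxxxx are continuation bytes and can never begin a character, and 0xb1 has exactly that pattern. The variable names are illustrative only.

```python
first_byte = 0xB1  # the byte reported in the error message

# UTF-8 continuation bytes have the bit pattern 10xxxxxx,
# so they are only valid after a lead byte, never at position 0.
is_continuation = (first_byte & 0b1100_0000) == 0b1000_0000
print(f"{first_byte:#04x} = {first_byte:08b}, continuation byte: {is_continuation}")

# Decoding a buffer that starts with this byte fails the same way.
try:
    bytes([first_byte]).decode("utf-8")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)
print(error_message)
```

This suggests the file is simply not UTF-8, rather than being a damaged UTF-8 file.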
I searched for this error but did not find an appropriate root cause analysis or solution.
This code runs without error:
pd.read_csv('wm.csv', encoding='cp1252')
but the result is garbled, even though my system renders Chinese characters correctly.
The same happens with Python's built-in open():
with open('wm.csv', 'r', encoding='cp1252') as f:
    for line in f.readlines():
        print(line)
        break
This prints garbled text, without any warning or error:
±àºÅ,É«Ôó,¸ùµÙ,ÇÃÉù,ÎÆÀí,Æê²¿,´¥¸Ð,ÃܶÈ,º¬ÌÇÂÊ,ºÃ¹Ï,Ðò¹ØÏµ
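One check that may help narrow this down is sketched below. It takes the first garbled header field from the output above, encodes it back to bytes with cp1252 to undo the wrong decoding, and then decodes those bytes as GBK. That the file is GBK-encoded is an assumption on my part, based only on the byte values; it is not something the repository states.

```python
garbled = "±àºÅ"  # first header field, exactly as printed above

# Undo the cp1252 decoding to recover the raw bytes from the file.
raw = garbled.encode("cp1252")
print(raw)  # b'\xb1\xe0\xba\xc5'

# Assumption: the file is GBK-encoded; decode the same bytes that way.
repaired = raw.decode("gbk")
print(repaired)  # 编号
```

The recovered bytes start with 0xb1, which matches the byte in the UTF-8 error. If the assumption holds, passing encoding='gbk' (or 'gb18030', a superset of it) to pd.read_csv and to open would be the natural thing to try, though I have not verified this against the whole file.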