I am accessing a dataset that lives on an FTP server. After downloading the data, I used pandas to read it as CSV, but I got an encoding error. The file has a .csv extension, but when I opened it in MS Excel, the data was in Unicode Text format. I want to convert these files, which are stored in Unicode Text format, into regular CSV. How can I do this?

My attempt:
from ftplib import FTP
import os

def mydef():
    defaultIP = ''
    username = 'cat'
    password = 'cat'
    ftp = FTP(defaultIP, user=username, passwd=password)
    ftp.dir()
    filenames = ftp.nlst()
    for filename in filenames:
        local_filename = os.path.join('C:\\Users\\me', filename)
        # Download in binary mode so the file's bytes are stored unchanged
        with open(local_filename, 'wb') as file:
            ftp.retrbinary('RETR ' + filename, file.write)
    ftp.quit()

Then I tried this to get the correct encoding:
mydef.encode('utf-8').splitlines()
but this does not work for me (I took the idea from another solution). Here is a snippet of the output of the above code:
b'\xff\xfeF\x00L\x00O\x00W\x00\t\x00C\x00T\x00Y\x00_\x00R\x00P\x00T\x00\t\x00R\x00E\x00P\x00O\x00R\x00T\x00E\x00R\x00\t\x00C\x00T\x00Y\x00_\x00P\x00T\x00N\x00\t\x00P\x00A\x00R\x00T\x00N\x00E\x00R\x00\t\x00C\x00O\x00M\x00M\x00O\x00D\x00I\x00T\x00Y\x00\t\x00D\x00E\x00S\x00C\x00R\x00I\x00P\x00T\x00I\x00O\x00N\x00\t'

Expected output:
The expected output is a normal CSV file with readable data (common trade data), but I cannot get the encoding right. I tried several different encodings to convert the data to CSV format, but none of them worked. How can I make this work? Thanks.
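
One observation on the byte snippet above: it starts with b'\xff\xfe', which is the UTF-16 little-endian byte-order mark, and the fields are separated by tab characters (\t\x00) rather than commas. The following is only a minimal sketch, assuming the whole file is UTF-16 and tab-delimited like this header; the function name utf16_tsv_to_csv is made up for illustration, and sample is just the first three column names from the snippet.

```python
import csv
import io

# Start of the header bytes shown above: BOM + three tab-separated column names.
sample = (b'\xff\xfeF\x00L\x00O\x00W\x00\t\x00C\x00T\x00Y\x00_\x00R\x00P\x00T\x00'
          b'\t\x00R\x00E\x00P\x00O\x00R\x00T\x00E\x00R\x00')

def utf16_tsv_to_csv(raw: bytes) -> str:
    """Decode UTF-16 tab-separated bytes and return comma-separated text."""
    # The 'utf-16' codec reads the BOM to pick the byte order and drops it.
    text = raw.decode('utf-16')
    rows = csv.reader(io.StringIO(text, newline=''), delimiter='\t')
    out = io.StringIO(newline='')
    csv.writer(out, lineterminator='\n').writerows(rows)
    return out.getvalue()

converted = utf16_tsv_to_csv(sample)
print(converted)
```

If the assumption holds, pandas should also be able to read the downloaded file directly, since read_csv accepts both an encoding and a sep argument, for example pd.read_csv(local_filename, encoding='utf-16', sep='\t').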