I want to read data from a Hadoop database that contains non-ASCII characters. I am reading the data with a .py file. I have used
#!/usr/bin/env python
# -*- coding: utf-8 -*-
to specify the encoding.
I used the function below to pull the data.
# Imports are not shown in the original post; pyhive and pandas are assumed.
from pyhive import hive
import pandas as pd

def hiveconnection(host_name, port, user, database):
    # Open a Kerberos-authenticated connection and fetch every row of the table.
    conn = hive.Connection(host=host_name, port=port, username=user, database=database, auth='KERBEROS', kerberos_service_name='impala')
    cur = conn.cursor()
    cur.execute(" select * from db_name.table_name ")
    result = cur.fetchall()
    return result

output = hiveconnection(host_name, port, user, database)
denialt2 = pd.DataFrame(output)
I got the following error message: " 'utf-8' codec can't decode byte 0x96 in position 13: invalid start byte". On investigating, I found that the error is raised because one of the columns contains a special (non-ASCII) character. I have pasted the special character below, from one of the columns.
I am attaching the complete traceback (error message).
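For reference, the same error can be reproduced without Hive. This is only a minimal sketch: the sample bytes and the helper name `try_decode` are made up for illustration, and the guess that the column was written in Windows-1252 (cp1252), where byte 0x96 is an en dash, is an assumption I have not confirmed against the table. Note also that the `# -*- coding: utf-8 -*-` line only declares the encoding of the .py source file; it does not change how bytes coming back from the database are decoded.

```python
# Hypothetical sample: 13 ASCII bytes, then 0x96, so the bad byte sits at position 13.
raw = b"Denial reason" + b"\x96" + b"text"


def try_decode(data, encoding):
    """Return (text, None) on success or (None, error message) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, str(exc)


# 0x96 is a continuation byte in UTF-8, so it cannot start a character.
utf8_text, utf8_error = try_decode(raw, "utf-8")
print(utf8_error)

# Assumption: the data is really cp1252, where 0x96 maps to U+2013 (en dash).
cp1252_text, cp1252_error = try_decode(raw, "cp1252")
print(repr(cp1252_text))
```

If the cp1252 guess is right, the bytes would need to be decoded with that encoding (or converted to UTF-8 at the source) before they reach pandas.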
Please help me resolve this issue. Thanks in advance :).