May I have your help resolving the error below from the PyHive module?

Issue: We upgraded our Cloudera cluster from CDH to CDP. We use the PyHive Python module to connect to Impala, passing host, port, username, and password with auth="LDAP". Since the upgrade, some queries submitted through pandas read_sql fail with the error below, while others run fine and return a DataFrame. Before the upgrade, all of these queries ran without issues and returned results.
from pyhive import hive  # Connection is defined in the pyhive.hive module
conn = hive.Connection(host=impala_host, port=impala_port, username=user, password=password, auth="LDAP")
Please find the stack trace below.
data = pd.read_sql(sql, conn)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 608, in read_sql
chunksize=chunksize,
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 2130, in read_query
data = self._fetchall_as_list(cursor)
File "/usr/local/lib/python3.7/site-packages/pandas/io/sql.py", line 2144, in _fetchall_as_list
result = cur.fetchall()
File "/usr/local/lib/python3.7/site-packages/pyhive/common.py", line 137, in fetchall
return list(iter(self.fetchone, None))
File "/usr/local/lib/python3.7/site-packages/pyhive/common.py", line 106, in fetchone
self._fetch_while(lambda: not self._data and self._state != self._STATE_FINISHED)
File "/usr/local/lib/python3.7/site-packages/pyhive/common.py", line 46, in _fetch_while
self._fetch_more()
File "/usr/local/lib/python3.7/site-packages/pyhive/hive.py", line 477, in _fetch_more
_check_status(response)
File "/usr/local/lib/python3.7/site-packages/pyhive/hive.py", line 585, in _check_status
raise OperationalError(response)
pyhive.exc.OperationalError: TFetchResultsResp(status=TStatus(statusCode=2, infoMessages=None, sqlState=None, errorCode=None, errorMessage=None), hasMoreRows=True, results=None)
We checked the PyHive source code: cursor.fetchall() does not sleep and wait for the query to finish. It raises immediately because the fetch response comes back with statusCode=2, which indicates that the query is still executing on the backend.