We have a Python 3.7 application running on an AWS EC2 instance (Amazon Linux) that performs SQL queries against a Cloudera Impala service using pyodbc (4.0.27) and the Cloudera Impala ODBC driver (installed using ClouderaImpalaODBC-2.6.5.rpm). This application has been running successfully for several years.
I'm currently trying to get the application running in a Docker container running Ubuntu 18.04.4 LTS, but I'm having trouble with the following error when running even the most basic query (e.g. SELECT 'HELLO'):
Error: ('HY000', '[HY000] [Cloudera][ImpalaODBC] (110) Error while executing a query in Impala: [HY000] : ParseException: Syntax error in line 1:\\n\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\\n^\\nEncountered: Unexpected character\\nExpected: ALTER, COMMENT, COMPUTE, COPY, CREATE, DELETE, DESCRIBE, DROP, EXPLAIN, GRANT, INSERT, INVALIDATE, LOAD, REFRESH, REVOKE, SELECT, SET, SHOW, TRUNCATE, UPDATE, UPSERT, USE, VALUES, WITH\\n\\nCAUSED BY: Exception: Syntax error\\n\\x00\u6572\u3a64\u5520\u656e\u7078\u6365\u6574\\u2064\u6863\u7261\u6361\u6574\u0a72 (110) (SQLExecDirectW)')"}
Needless to say this looks like a string encoding problem.
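As a rough sanity check on that theory (this is my own reading of the error, not something the driver documents): the run of CJK-looking characters at the end of the message turns back into readable ASCII if each character is re-encoded as UTF-16LE and the resulting bytes are read as single-byte text. That suggests a narrow (8-bit) string is being interpreted as UTF-16 somewhere between the driver and pyodbc.

```python
# Tail of the error message, copied from the output above.
garbled = (
    "\u6572\u3a64\u5520\u656e\u7078\u6365\u6574"
    "\u2064\u6863\u7261\u6361\u6574\u0a72"
)

# Undo the suspected mis-decoding: turn each 16-bit code unit back into its
# two little-endian bytes, then read those bytes as ASCII.
recovered = garbled.encode("utf-16-le").decode("ascii")
print(repr(recovered))
```

The recovered text is the tail of "Encountered: Unexpected character", i.e. the same message that appears earlier in the error in readable form.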
Some context housekeeping:
- the Python code on both systems (Amazon Linux / Ubuntu) is identical
- the Impala ODBC driver installations on both systems have the same version (2.6.5); the Impala ODBC driver for Ubuntu was downloaded directly from the Cloudera website (https://www.cloudera.com/downloads/connectors/impala/odbc/2-6-5.html)
- the Impala ODBC connection params are identical except for the OS specific items:
- "HOST": "[host]"
- "PORT": 21050
- "Database": "[database]"
- "UID": "[username]"
- "PWD": "[password]"
- "Driver": "{/opt/cloudera/impalaodbc/lib/64/libclouderaimpalaodbc64.so}"
- "UseSASL": 1
- "AuthMech": 3
- "SSL": 1
- "CAIssuedCertNamesMismatch": 1
- "TrustedCerts": "[path_to_certs_file]"
- "TSaslTransportBufSize": 1000
- "RowsFetchedPerBlock": 10000
- "SocketTimeout": 0
- "StringColumnLength": 32767
- "UseNativeQuery": 0
- The application appears to be connecting successfully to Impala, as there is no error calling pyodbc.connect(**config, autocommit=True) or getting the cursor from the connection (I have tried with invalid creds to make sure, and get the usual connection errors when creds are wrong). The details of the error message indicate the correct ODBC driver is being used.
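One thing I have not ruled out is pyodbc's own encoding settings. A minimal sketch of forcing UTF-8 on a connection follows. setencoding and setdecoding are real pyodbc Connection methods, but force_utf8 is a hypothetical helper name, the two constants are written out by hand (they mirror pyodbc.SQL_CHAR and pyodbc.SQL_WCHAR) so the snippet can be read without pyodbc installed, and whether UTF-8 is the right choice for this driver is an assumption.

```python
# Standard ODBC type codes; pyodbc exposes the same values as
# pyodbc.SQL_CHAR and pyodbc.SQL_WCHAR.
SQL_CHAR = 1
SQL_WCHAR = -8


def force_utf8(cnxn, encoding="utf-8"):
    """Hypothetical helper: make pyodbc send and read text as `encoding`."""
    # How narrow and wide character data coming back from the driver is decoded.
    cnxn.setdecoding(SQL_CHAR, encoding=encoding)
    cnxn.setdecoding(SQL_WCHAR, encoding=encoding)
    # How Python str values are encoded when sent to the driver.
    cnxn.setencoding(encoding=encoding)
    return cnxn
```

It would be called right after connecting, e.g. force_utf8(pyodbc.connect(**config, autocommit=True)), before creating the cursor.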
I have tried playing around with different values for the Impala ODBC driver param "DriverManagerEncoding" such as "UTF-16", "UTF-32" or not having it at all (which is the case for the Amazon Linux setup) but always get the same error.
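To double-check which DriverManagerEncoding value the driver actually sees, the driver's ini file can be parsed rather than eyeballed. This is only a sketch: I am assuming the setting lives in a [Driver] section of an ini-style file (cloudera.impalaodbc.ini, as I understand the driver layout), the sample text is made up for illustration, and read_driver_manager_encoding is a hypothetical helper, not part of any library.

```python
import configparser


def read_driver_manager_encoding(ini_text, section="Driver"):
    """Return the DriverManagerEncoding value from ini text, or None if unset."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written in the file
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return None
    # Compare keys case-insensitively in case the file uses different casing.
    for key, value in parser.items(section):
        if key.lower() == "drivermanagerencoding":
            return value.strip()
    return None


# Illustrative ini content only; the real file will contain more settings.
sample = """
[Driver]
DriverManagerEncoding=UTF-16
LogLevel=0
"""
print(read_driver_manager_encoding(sample))
```

Running the same check inside the container and on the Amazon Linux host would show whether the two installs really differ on this setting.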
I also tried using the unixODBC tool isql on both systems to troubleshoot that way; I was able to connect successfully from the Amazon Linux system, but could never connect on Ubuntu. I consistently get the following (not sure if this is related or some other issue):
iusql -v [DSN]
[unixODBC][
[ISQL]ERROR: Could not SQLDriverConnect
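Since isql behaves differently on the two systems, it may help to compare the ODBC-related environment each one runs with. The sketch below just collects a few variables that unixODBC and the dynamic loader consult (ODBCINI, ODBCSYSINI, ODBCINSTINI, LD_LIBRARY_PATH). odbc_environment is a hypothetical helper, and the choice of variables is my assumption about what is relevant here.

```python
import os

# Variables read by unixODBC (config file locations) and the dynamic loader.
ODBC_ENV_VARS = ("ODBCINI", "ODBCSYSINI", "ODBCINSTINI", "LD_LIBRARY_PATH")


def odbc_environment(env=None):
    """Return the ODBC-related variables from env (default: os.environ)."""
    env = os.environ if env is None else env
    # Missing variables are reported as None so the two systems can be diffed.
    return {name: env.get(name) for name in ODBC_ENV_VARS}


if __name__ == "__main__":
    for name, value in odbc_environment().items():
        print(f"{name}={value!r}")
```

Running this in the Docker container and on the EC2 instance and comparing the output should show whether the two are even reading the same odbc.ini / odbcinst.ini files.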