I have a small program that lists the databases in Hive through a Thrift server endpoint, using PyHive on Python 3.9. When I run it as a standalone program, it works perfectly.

When I run the same code from Lambda, it fails with the following error:

No module named 'sasl.saslwrapper'

The code I use to connect to Hive is below:

from pyhive import hive

connection = hive.Connection(
    host='thrift server endpoint',
    port=10001,
    username='username',
    database='db name',
)

I also ran the standalone program against the same dependencies that are packaged in the Lambda zip file, and it still works.
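
Since the same dependencies behave differently in the two places, it may help to compare the runtimes themselves. The following is a minimal sketch, not something from my actual program: `describe_runtime` is a hypothetical helper name, and the idea is simply to print the same details locally and from inside the Lambda handler, then compare them. It assumes a mismatch in Python version or CPU architecture could matter for packages that ship compiled code, which I have not confirmed is the cause here.

```python
import platform
import sys


def describe_runtime():
    """Collect details that affect whether compiled packages can be loaded."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
        # First few import locations, to see where packages are picked up from.
        "sys_path_head": sys.path[:3],
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

If the output differs between the local run and the Lambda run, that difference is a reasonable first place to look.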

The zip file contains the following dependencies. Any suggestions would be appreciated.

drwxr-xr-x  3 root root  4096 Aug  1 15:09 google
drwxr-xr-x  2 root root  4096 Aug  1 15:09 protobuf-3.20.3.dist-info
drwxr-xr-x  2 root root  4096 Aug  1 15:10 mysql_connector_python-8.0.32.dist-info
drwxr-xr-x  5 root root  4096 Aug  1 15:10 mysql
drwxr-xr-x  5 root root  4096 Aug  1 15:10 mysqlx
-rw-r--r--  1 root root 34549 Aug  1 15:11 six.py
drwxr-xr-x  2 root root  4096 Aug  1 15:11 six-1.16.0.dist-info
drwxr-xr-x 10 root root  4096 Aug  1 15:11 future
drwxr-xr-x  2 root root  4096 Aug  1 15:11 future-0.18.3.dist-info
drwxr-xr-x  4 root root  4096 Aug  1 15:11 libpasteurize
drwxr-xr-x  4 root root  4096 Aug  1 15:11 libfuturize
drwxr-xr-x  7 root root  4096 Aug  1 15:12 past
drwxr-xr-x  6 root root  4096 Aug  1 15:12 dateutil
drwxr-xr-x  2 root root  4096 Aug  1 15:12 python_dateutil-2.8.2.dist-info
drwxr-xr-x  3 root root  4096 Aug  1 15:13 TCLIService
drwxr-xr-x  3 root root  4096 Aug  1 15:14 pyhive
drwxr-xr-x  2 root root  4096 Aug  1 15:14 pure_sasl-0.6.2.dist-info
drwxr-xr-x  3 root root  4096 Aug  1 15:14 puresasl
-rw-r--r--  1 root root   539 Aug  1 15:15 protobuf-3.20.3-py3.9-nspkg.pth
-rw-r--r--  1 root root   152 Aug  1 15:15 distutils-precedence.pth
drwxr-xr-x  7 root root  4096 Aug  1 15:16 setuptools
drwxr-xr-x  2 root root  4096 Aug  1 15:16 setuptools-58.1.0.dist-info
-rw-r--r--  1 root root  1916 Aug  1 15:25 hive_table_ddl.sql
drwxr-xr-x  2 root root  4096 Aug  1 16:59 requests-2.28.2.dist-info
drwxr-xr-x  2 root root  4096 Aug  1 16:59 requests
drwxr-xr-x  5 root root  4096 Aug  1 17:02 charset_normalizer
drwxr-xr-x  2 root root  4096 Aug  1 17:02 charset_normalizer-3.1.0.dist-info
drwxr-xr-x  3 root root  4096 Aug  1 17:05 idna
drwxr-xr-x  2 root root  4096 Aug  1 17:05 idna-3.4.dist-info
drwxr-xr-x  2 root root  4096 Aug  1 17:42 certifi-2023.7.22.dist-info
drwxr-xr-x  2 root root  4096 Aug  1 17:42 certifi
drwxr-xr-x  3 root root  4096 Aug  1 17:54 sasl
drwxr-xr-x  2 root root  4096 Aug  1 17:54 sasl-0.3.1.dist-info
drwxr-xr-x  6 root root  4096 Aug  1 18:03 thrift
drwxr-xr-x  2 root root  4096 Aug  1 18:03 thrift_sasl-0.4.3.dist-info
drwxr-xr-x  3 root root  4096 Aug  1 18:03 thrift_sasl
drwxr-xr-x  2 root root  4096 Aug  1 18:03 thrift-0.16.0.dist-info
drwxr-xr-x  2 root root  4096 Aug  1 18:19 __pycache__
-rw-r--r--  1 root root 10791 Aug  1 18:55 hive_update_compactor.py
-rw-r--r--  1 root root 11548 Aug  1 18:58 create_hive_queries.py
  • What version of sasl are you using? When I try to install it in Pip3.9 I get the error: "note: This error originates from a subprocess, and is likely not a problem with pip. error: legacy-install-failure × Encountered error while trying to install package. ╰─> sasl" Also, when looking at the sasl page in pypi.org, you can see it hasn't been updated in years. Can you do a clean install of your deps in a container or a virtualenv? – Uberhumus Aug 02 '23 at 08:57
  • Couldn't install it in Python3.7 either. I suspect the issue is a broken package. However when I tried to install it with `pip3 install sasl==0.3` it worked. So it could be that the latest version (0.3.1) is broken but 0.3 is fine. After installing sasl==0.3 I was then able to import `sasl.saslwrapper`. LMK if this solves your problem, and I will publish an answer. – Uberhumus Aug 02 '23 at 11:10
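
The import check mentioned in the comment above can be sketched roughly as follows. This is only a diagnostic sketch under assumptions: `check_module` and `list_package_files` are hypothetical helper names, and the idea that the `sasl` directory might be missing its compiled part (or have it built for another platform) is a guess, not something confirmed in this thread. Running it both locally and inside Lambda would show whether `sasl.saslwrapper` imports and which files the `sasl` package actually contains.

```python
import importlib
import importlib.util
from pathlib import Path


def check_module(name):
    """Try to import `name`; return (ok, detail) where detail is a path or the error text."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__file__", "<built-in>")


def list_package_files(package):
    """Return sorted file names found in the package's directories, or [] if not found."""
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return []
    if spec is None or not spec.submodule_search_locations:
        return []
    names = []
    for location in spec.submodule_search_locations:
        directory = Path(location)
        if directory.is_dir():
            names.extend(p.name for p in directory.iterdir() if p.is_file())
    return sorted(names)


if __name__ == "__main__":
    for name in ("sasl", "sasl.saslwrapper", "thrift_sasl", "pyhive.hive"):
        ok, detail = check_module(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
    print("files in sasl package:", list_package_files("sasl"))
```

If `sasl` imports but `sasl.saslwrapper` does not, the file listing should make it clearer whether the wrapper module is present in the deployed package at all.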
