I want to use both PySpark and AutoGluon in a notebook backed by an EMR cluster. I tried to install AutoGluon from the bootstrap script for the EMR cluster (emr-5.30.1) with the following command:
sudo python3 -m pip install autogluon
but it fails with:
Running setup.py install for ConfigSpace: finished with status 'error'
Complete output from command /bin/python3 -u -c "import setuptools, tokenize;__file__='/mnt/tmp/pip-build-1yey/ConfigSpace/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-n_888-record/install-record.txt --single-version-externally-managed --compile:
...
...
ConfigSpace/hyperparameters.c:4:10: fatal error: Python.h: No such file or directory
#include "Python.h"
^~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1
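The `Python.h: No such file or directory` line suggests that the C development headers for the `python3` interpreter are not present on the node, so pip cannot compile the ConfigSpace extension. A minimal sketch for checking this from the same interpreter, using only the standard library (the helper names `python_header_path` and `has_python_headers` are hypothetical, chosen for illustration):

```python
import os
import sysconfig


def python_header_path():
    """Return the path where this interpreter expects to find Python.h."""
    include_dir = sysconfig.get_paths()["include"]
    return os.path.join(include_dir, "Python.h")


def has_python_headers():
    """Return True if Python.h exists in this interpreter's include directory."""
    return os.path.isfile(python_header_path())


# Report what was found so the result shows up in the bootstrap logs.
print("Expected header location:", python_header_path())
print("Python.h present:", has_python_headers())
```

If this reports that the header is absent, the development headers would need to be installed before the `pip install` step. On Amazon Linux these usually ship in a separate package such as `python3-devel`, but that is an assumption I have not verified for this EMR release.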
mxnet, which is also installed by the bootstrap script, is at version 1.6.0. I cannot upgrade it to a higher version; pip reports: No matching distribution found for mxnet==1.7.0
Is there any way to get AutoGluon to work on an EMR cluster?