I am aware of *Change Apache Livy's Python Version* and *How do i setup Pyspark in Python 3 with spark-env.sh.template*, and I have also seen the Livy documentation. However, none of that works: Livy keeps using Python 2.7 no matter what.
This is running Livy 0.6.0 on an EMR cluster.
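To be concrete about what "keeps using Python 2.7" means: whenever I check the interpreter from inside a session created through this Livy endpoint, it still reports 2.7. A minimal check looks like this (this is just how I verify it; it is plain Python, nothing Livy-specific):

```python
# Run inside a Livy PySpark session (e.g. from a notebook cell that
# executes remotely); both values still point at Python 2.7.
import sys
print(sys.version)     # e.g. "2.7.x ..."
print(sys.executable)  # e.g. "/usr/bin/python2.7"
```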
I have changed the `PYSPARK_PYTHON` environment variable to `/usr/bin/python3` for the hadoop user, my own user, root, and ec2-user. Logging into the EMR master node via `ssh` and running `pyspark` starts Python 3 as expected, but Livy keeps using Python 2.7.

I added `export PYSPARK_PYTHON=/usr/bin/python3` to the `/etc/spark/conf/spark-env.sh` file. Livy keeps using Python 2.7.
"spark.yarn.appMasterEnv.PYSPARK_PYTHON":"/usr/bin/python3"
and"spark.executorEnv.PYSPARK_PYTHON":"/usr/bin/python3"
to the items listed below and in every case . Livy keeps using python2.7.- sparkmagic
config.json
andconfig_other_settings.json
files before starting a PySpark kernel Jupyter - Session Properties in the sparkmagic
%manage_spark
Jupyter widget. Livy keeps using python2.7. %%spark config
cell-magic before the line-magic%spark add --session test --url http://X.X.X.X:8998 --auth None --language python
- sparkmagic
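For reference, the config cell-magic variant looks roughly like this; as far as I understand, sparkmagic merges this JSON into the Livy session-creation request, and a similar `"conf"` block is what I added in the `config.json` / `config_other_settings.json` files (under `"session_configs"`, if I recall the sparkmagic layout correctly):

```
%%spark config
{
    "conf": {
        "spark.yarn.appMasterEnv.PYSPARK_PYTHON": "/usr/bin/python3",
        "spark.executorEnv.PYSPARK_PYTHON": "/usr/bin/python3"
    }
}
```

The `%spark add` line from the list above then runs in the next cell, and the resulting session still reports Python 2.7.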
Note: This works without any issues on another EMR cluster running Livy 0.7.0. I have gone over all of the settings on the other cluster and cannot find what is different. I did not have to do any of this on the other cluster; Livy just used Python 3 by default.
How exactly do I get Livy to use python3 instead of python2?