I built Spark 1.4 from the GitHub development master, and the build went through fine. But when I run `bin/pyspark` I get Python 2.7.9. How can I change this?

For anyone looking for how to do this: `PYSPARK_DRIVER_PYTHON=ipython3 PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark`, in which case it runs the IPython 3 notebook. – tchakravarty May 16 '15 at 19:49
5 Answers
Just set the environment variable:
export PYSPARK_PYTHON=python3
If you want this to be a permanent change, add this line to the `pyspark` script.
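A minimal sketch of the session-level variant, with a quick check that the name resolves to the interpreter you expect:

export PYSPARK_PYTHON=python3   # interpreter used for the PySpark workers and driver
which python3                   # confirm what the name resolves to
./bin/pyspark                   # the REPL banner should now report a Python 3 version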

- 3,983
- 4
- 19
- 31

- 1,777
- 2
- 11
- 13
The environment variables can be edited under `/etc/profile`. Do not forget to execute `source /etc/profile` after saving, so the changes take effect immediately. – Phyticist Dec 05 '16 at 10:40
It's better to add this to `$SPARK_HOME/conf/spark-env.sh` so `spark-submit` uses the same interpreter as well. – flow2k Jun 16 '19 at 21:01
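A minimal sketch of that `spark-env.sh` approach (assuming `$SPARK_HOME` points at your Spark installation; create the file from the bundled `spark-env.sh.template` if it does not exist):

echo 'export PYSPARK_PYTHON=python3' >> $SPARK_HOME/conf/spark-env.sh   # picked up by pyspark and spark-submit alike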
PYSPARK_PYTHON=python3
./bin/pyspark
If you want to run it in IPython Notebook, write:
PYSPARK_PYTHON=python3
PYSPARK_DRIVER_PYTHON=ipython
PYSPARK_DRIVER_PYTHON_OPTS="notebook"
./bin/pyspark
If `python3` is not on your PATH, you need to pass the full path to it instead.
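For example (the path below is hypothetical; substitute the location of your own interpreter):

PYSPARK_PYTHON=/usr/local/bin/python3 ./bin/pyspark   # absolute path instead of a bare name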
Bear in mind that the current documentation (as of 1.4.1) has outdated instructions. Fortunately, it has been patched.

I think your command for the IPython Notebook is not correct. It should be like this: `PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=ipython3 PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark` – SpiderRico Mar 13 '16 at 20:09
@ChrisNielsen In Linux or OS X it is a terminal/console. I have no idea how it works under Windows (on Windows, I used Spark only in a Docker container). – Piotr Migdal Jan 19 '17 at 22:24
@SpiderRico These don't seem to work on my Mac. For Jupyter Notebook to work for Spark, use the following: `PYSPARK_PYTHON=python3 PYSPARK_DRIVER_PYTHON=jupyter PYSPARK_DRIVER_PYTHON_OPTS="notebook" ./bin/pyspark` – Hank Chan Jan 27 '20 at 11:48
1. Edit the profile: `vim ~/.profile`
2. Add this line to the file: `export PYSPARK_PYTHON=python3`
3. Execute the command: `source ~/.profile`
4. Run `./bin/pyspark`
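The same steps as a non-interactive sketch (assuming a POSIX shell that reads `~/.profile`):

echo 'export PYSPARK_PYTHON=python3' >> ~/.profile   # step 2 without opening an editor
source ~/.profile                                    # step 3: reload the profile
echo $PYSPARK_PYTHON                                 # should print: python3
./bin/pyspark                                        # step 4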

Have a look inside the file. The shebang line probably points to the `env` binary, which searches the PATH for the first matching executable.
You have three options: change `python` to `python3` in the shebang, replace the `env` lookup with a hardcoded path to the python3 binary, or omit the shebang and run the script with `python3` directly.
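A sketch of those three options on a hypothetical `script.py` (GNU sed syntax shown; on macOS use `sed -i ''`):

head -1 script.py                               # inspect the shebang, e.g. #!/usr/bin/env python
sed -i 's|env python$|env python3|' script.py   # option 1: make the env lookup find python3
sed -i '1s|.*|#!/usr/bin/python3|' script.py    # option 2: hardcode the interpreter path
python3 script.py                               # option 3: bypass the shebang entirely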

Yeah, looking into the file helped. Needed to set the `PYSPARK_PYTHON` environment variable. – tchakravarty May 16 '15 at 19:34
For Jupyter Notebook, edit the `spark-env.sh` file from the command line as shown below:
$ vi $SPARK_HOME/conf/spark-env.sh
Go to the bottom of the file and paste these lines:
export PYSPARK_PYTHON=python3
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
Then simply run the following command to start PySpark in the notebook:
$ pyspark
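To verify the setup before launching, a quick sketch:

grep PYSPARK $SPARK_HOME/conf/spark-env.sh   # the three exports should be listed
which jupyter                                # the driver command must be on your PATH
pyspark                                      # starts Jupyter Notebook with Spark configured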
