I have a cluster with Spark 2.2 on CDH 5.12 with RHEL and I am trying to set up IPython to use with pyspark2. I have installed IPython 5.x LTS (long term support) but I am not able to get it to work.

So far I have run:

yum -y update
yum install epel-release
yum -y install python-pip
yum groupinstall 'Development Tools'
yum install python-devel

pip install IPython==5.0 --user
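
To double-check where pip actually placed IPython (assuming the default per-user location, since I installed with --user), I can run:

# pip install --user normally puts console scripts under ~/.local/bin
ls ~/.local/bin/ipython
~/.local/bin/ipython --version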

But I am still unable to get it to work. Does anyone have an idea of what I am missing?

xmorera

1 Answer


The pyspark launch script determines which Python executable to use for the driver like this:

# Determine the Python executable to use for the driver:
if [[ -n "$IPYTHON_OPTS" || "$IPYTHON" == "1" ]]; then
  # If IPython options are specified, assume user wants to run IPython
  # (for backwards-compatibility)
  PYSPARK_DRIVER_PYTHON_OPTS="$PYSPARK_DRIVER_PYTHON_OPTS $IPYTHON_OPTS"
  PYSPARK_DRIVER_PYTHON="ipython"
elif [[ -z "$PYSPARK_DRIVER_PYTHON" ]]; then
  PYSPARK_DRIVER_PYTHON="${PYSPARK_PYTHON:-"$DEFAULT_PYTHON"}"
fi

Set the variables below in your ~/.bashrc:

echo "export PATH=$PATH:/path_to_downloaded_spark/spark-1.6.0/bin"
echo "export PYSPARK_DRIVER_PYTHON=ipython"
echo "export PYSPARK_DRIVER_PYTHON_OPTS='notebook'
WoodChopper
  • It is probably really close, but I am still getting this error: env: ipython: No such file or directory – xmorera Oct 30 '17 at 23:19