I'm getting an error when running the spark-shell command through cmd, and I've had no luck fixing it so far. I have Python, Java, Spark, Hadoop (winutils.exe), and Scala installed, with the versions listed below (a quick cmd check for each follows the list):
- Python: 3.7.3
- Java: 1.8.0_311
- Spark: 3.2.0
- Hadoop (winutils.exe): 2.5.x
- Scala sbt: sbt-1.5.5.msi
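To double-check those versions, these are roughly the commands I ran from a fresh cmd window (standard --version/-version flags; sbt --version assumes a recent sbt launcher):

```bat
:: Print the installed version of each tool (output trimmed here)
java -version
python --version
sbt --version
spark-shell --version
```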
I followed the steps below and ran spark-shell through cmd from C:\Program Files\spark-3.2.0-bin-hadoop3.2\bin (a cmd sketch of these settings follows the list):
- Create the JAVA_HOME variable: C:\Program Files\Java\jdk1.8.0_311\bin
- Add the following to your Path: %JAVA_HOME%\bin
- Create the SPARK_HOME variable: C:\spark-3.2.0-bin-hadoop3.2\bin
- Add the following to your Path: %SPARK_HOME%\bin
- The most important part: the Hadoop path should include a bin folder containing winutils.exe, i.e. C:\Hadoop\bin. Make sure winutils.exe is located inside this path.
- Create the HADOOP_HOME variable: C:\Hadoop
- Add the following to your Path: %HADOOP_HOME%\bin
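For reference, here's a minimal cmd sketch of the same settings (values copied verbatim from the steps above; setx is just one way to do this, and it only takes effect in newly opened cmd windows):

```bat
:: Set the variables for the current session, then persist them with setx
set "JAVA_HOME=C:\Program Files\Java\jdk1.8.0_311\bin"
set "SPARK_HOME=C:\spark-3.2.0-bin-hadoop3.2\bin"
set "HADOOP_HOME=C:\Hadoop"
setx JAVA_HOME "%JAVA_HOME%"
setx SPARK_HOME "%SPARK_HOME%"
setx HADOOP_HOME "%HADOOP_HOME%"

:: Extend Path for this session (persisting Path with setx can truncate long
:: values, so the Environment Variables GUI editor is safer for that part)
set "Path=%Path%;%JAVA_HOME%\bin;%SPARK_HOME%\bin;%HADOOP_HOME%\bin"

:: Sanity check: winutils.exe must sit under %HADOOP_HOME%\bin
dir "%HADOOP_HOME%\bin\winutils.exe"
```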
Am I missing anything? I've posted my question with the full error details in another thread (spark-shell command throwing this error: SparkContext: Error initializing SparkContext).