I am very frustrated by Spark. I wasted an evening thinking I was doing something wrong, but I have uninstalled and reinstalled several times, following multiple guides that all describe a very similar path.

At the Windows command prompt, I am trying to run:

pyspark

or

spark-shell

The steps I followed include downloading a pre-built package from:

https://spark.apache.org/downloads.html

I tried both Spark 2.0.2 with Hadoop 2.3 and Spark 2.1.0 with Hadoop 2.7.

Neither works, and I get this error:

'Files\Spark\bin\..\jars""\' is not recognized as an internal or external  command,
operable program or batch file.
Failed to find Spark jars directory.
You need to build Spark before running this program.

I've set up my environment variables correctly as well, using the winutils.exe trick, but these seem unrelated to the problem at hand.
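For reference, what I set looks roughly like this (the Hadoop location is just an example of where I placed winutils.exe):

rem SPARK_HOME points at the extracted Spark folder
setx SPARK_HOME "C:\Program Files\Spark"
rem HADOOP_HOME points at the folder whose bin\ contains winutils.exe
setx HADOOP_HOME "C:\Hadoop"
rem caution: setx truncates values longer than 1024 characters, and its
rem changes only apply to NEW cmd windows
setx PATH "%PATH%;C:\Program Files\Spark\bin;C:\Hadoop\bin"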

I can't be the only one who's stuck on this problem. Does anyone know a workaround for getting this program to work on Windows?

EB88

3 Answers


I've just found the solution in one of the answers to this question:

Why does spark-submit and spark-shell fail with "Failed to find Spark assembly JAR. You need to build Spark before running this program."?

The following answer worked for me and is totally counter-intuitive:

"On Windows, I found that if it is installed in a directory that has a space in the path (C:\Program Files\Spark) the installation will fail. Move it to the root or another directory with no spaces."

EB88
  • I like your answer but after running the command ".\bin\sparkR" I've got a message "R is not recognized as an internal or external command, operable program or batch file". Did you get the same warning? – Katin Mar 20 '17 at 15:32
  • I didn't see this but the first thing that springs to mind is that the environment variables may not be set up right. – EB88 Mar 21 '17 at 09:36

This problem is caused by your environment variable settings. You probably set the SPARK_HOME value to 'C:\Program Files\Spark\bin', which has two issues:

  • you have to remove the \bin; SPARK_HOME is just 'C:\Program Files\Spark\'
  • the path to SPARK_HOME contains a space, which causes the problem, so you can instead set it to the short 8.3 form 'C:\Progra~1\Spark\' (see the sketch after this list)
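A rough sketch of the short-name approach, in case you'd rather not move the folder (note that 8.3 short names can be disabled on some volumes, and the actual short name can vary, so check it first):

rem dir /x prints the 8.3 short name next to each entry
dir /x C:\
rem "Program Files" usually shows up as PROGRA~1; if so:
setx SPARK_HOME C:\Progra~1\Spark
rem open a new cmd window, then verify:
echo %SPARK_HOME%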
aName

I faced the same issue. The main cause is the space in the folder path, e.g. C:\Program Files\spark-2.4.5-bin-hadoop2.7 as SPARK_HOME. Just move the spark-2.4.5-bin-hadoop2.7 folder to the root of the C drive, i.e. C:\spark-2.4.5-bin-hadoop2.7, and point SPARK_HOME to the same location. That solves the issue.
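Concretely, that amounts to something like this (folder name as in this example; open a new cmd window afterwards before retrying):

move "C:\Program Files\spark-2.4.5-bin-hadoop2.7" C:\spark-2.4.5-bin-hadoop2.7
setx SPARK_HOME C:\spark-2.4.5-bin-hadoop2.7
rem in the new window, confirm the variable and launch the shell
echo %SPARK_HOME%
"%SPARK_HOME%\bin\spark-shell"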

Ramineni Ravi Teja