I tried the following command in my local RStudio session to connect to Spark with sparklyr:

sc <- spark_connect(master = "spark://x.x.x.x:7077",
spark_home = "/home/hduser/spark-2.0.0-bin-hadoop2.7", version="2.0.0", config = list())

But I am getting the following error:

Error in start_shell(master = master, spark_home = spark_home, spark_version = version,  : 
SPARK_HOME directory '/home/hduser/spark-2.0.0-bin-hadoop2.7' not found

Any help?

Thanks in advance

r4sn4

1 Answer

May I ask whether you have actually installed Spark into that folder? Can you show the output of the ls command in the /home/hduser/ folder?

And the output of sessionInfo() in R?
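The same checks can be run from the R console. This is a minimal sketch in base R, assuming the path quoted in the question; the variable names are only illustrative.

```r
# Path taken from the question; adjust it to your own setup.
spark_home <- "/home/hduser/spark-2.0.0-bin-hadoop2.7"

# TRUE only if the directory exists on the machine where this R session runs.
home_found <- dir.exists(spark_home)
print(home_found)

# List the parent directory to spot a differently named Spark folder.
print(list.files(dirname(spark_home)))

# R and package versions, useful when reporting the problem.
print(sessionInfo())
```

If the first check prints FALSE, the folder is not visible to the R session, which would match the error message in the question.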

Let me share how I use a custom folder structure. It is on Windows, not Ubuntu, but I expect that makes little difference.

Using the most recent dev edition

If you check GitHub, you will see that the RStudio team updates sparklyr almost every day, fixing numerous reported bugs. You can install the development version with:

devtools::install_github("rstudio/sparklyr")

In my case, only installing sparklyr_0.4.12 resolved the problem with Spark 2.0 under Windows.

Checking Spark availability

Please check that the version you are asking about is available:

spark_available_versions()

You should see something like the line below, which indicates that the version you intend to use is available for your sparklyr package.

[13] 2.0.0 2.7 spark_install(version = "2.0.0", hadoop_version = "2.7")

Installation of Spark

To keep things tidy, you may want to install Spark in a location other than the default RStudio cache folder in your home directory:

options(spark.install.dir = "c:/spark")

Once you are sure the desired version is available, it is time to install Spark:

spark_install(version = "2.0.0", hadoop_version = "2.7")

I'd check that it installed correctly from a command prompt (on Ubuntu, use ls instead of dir):

cd c:/spark
dir

Now specify the location of the version you want to use:

Sys.setenv(SPARK_HOME = 'C:/spark/spark-2.0.0-bin-hadoop2.7')
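Since the error in the question comes from a SPARK_HOME directory that could not be found, it may help to verify the path before setting it. The following is a minimal sketch in base R; set_spark_home is a hypothetical helper written for illustration, not part of sparklyr.

```r
# Hypothetical helper: set SPARK_HOME only if the directory really exists
# on the machine running this R session.
set_spark_home <- function(path) {
  if (!dir.exists(path)) {
    stop("SPARK_HOME directory '", path, "' not found on this machine")
  }
  Sys.setenv(SPARK_HOME = path)
  invisible(path)
}

# Example call, using the path from this answer (uncomment to run):
# set_spark_home("C:/spark/spark-2.0.0-bin-hadoop2.7")
```

Failing early this way gives an error at the point where the path is chosen, rather than later when the connection is attempted.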

And finally, create the connection:

sc <- spark_connect(master = "local")

I hope it helps.

smit patel
Alex Skorokhod