I'm trying to follow a tutorial for using Spark from RStudio on DSX, but I'm running into the following error:
> library(sparklyr)
> sc <- spark_connect(master = "CS-DSX")
Error in spark_version_from_home(spark_home, default = spark_version) :
Failed to detect version from SPARK_HOME or SPARK_HOME_VERSION. Try passing the spark version explicitly.
I took the above code snippet from the Connect to Spark dialog in RStudio.
So I took a look at SPARK_HOME:
> Sys.getenv("SPARK_HOME")
[1] "/opt/spark"
OK, let's check that the directory exists:
> dir("/opt")
[1] "ibm"
I'm guessing this is the cause of the problem?
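The same check can be done in one step. This is only a minimal sketch using base R, and the variable names `spark_home` and `home_exists` are mine, chosen for illustration:

```r
# Read SPARK_HOME and test whether the directory it points to is present.
spark_home <- Sys.getenv("SPARK_HOME")

# nzchar() guards against an unset (empty) variable before testing the path.
home_exists <- nzchar(spark_home) && dir.exists(spark_home)
print(home_exists)
```

If this prints FALSE while SPARK_HOME is set, the variable points at a directory that is not there, which matches what I see above.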
NOTE: there are a few similar questions on Stack Overflow, but none of them are about IBM's Data Science Experience (DSX).
Update 1:
I tried the following:
> sc <- spark_connect(config = "CS-DSX")
Error in config$spark.master : $ operator is invalid for atomic vectors
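My reading of this error, which is an assumption on my part, is that the config argument expects a list-like object rather than a plain string. The message itself can be reproduced in base R without sparklyr; the names `config`, `msg`, and `cfg_list` below are only for illustration:

```r
# A character vector is atomic, so `$` cannot be used on it.
config <- "CS-DSX"
msg <- tryCatch(config$spark.master, error = function(e) conditionMessage(e))
print(msg)

# The same access works when the object is a list.
cfg_list <- list(spark.master = "spark.bluemix.net")
print(cfg_list$spark.master)
```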
Update 2:
An extract from my config.yml. Note that it contains many more Spark services; I've only pasted the first one:
default:
  method: "shell"
CS-DSX:
  method: "bluemix"
  spark.master: "spark.bluemix.net"
  spark.instance.id: "7a4089bf-3594-4fdf-8dd1-7e9fd7607be5"
  tenant.id: "sdd1-7e9fd7607be53e-39ca506ba762"
  tenant.secret: "xxxxxx"
  hsui.url: "https://cdsx.ng.bluemix.net"
Note that my config.yml was generated for me.
Update 3:
My .Rprofile looks like this:
# load sparklyr library
library(sparklyr)
# setup SPARK_HOME
if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
  Sys.setenv(SPARK_HOME = "/opt/spark")
}
# setup SparkaaS instances
options(rstudio.spark.connections = c("CS-DSX","newspark","cleantest","4jan2017","Apache Spark-4l","Apache Spark-3a","ML SPAAS","Apache Spark-y9","Apache Spark-a8"))
Note that my .Rprofile was generated for me.
Update 4:
I uninstalled sparklyr and restarted the session twice. Next I tried to run:
library(sparklyr)
library(dplyr)
sc <- spark_connect(config = "CS-DSX")
However, the above command hung. I stopped it and checked the installed version of sparklyr, which seems to be OK:
> ip <- installed.packages()
> ip[ rownames(ip) == "sparklyr", c(0,1,3) ]
   Package  Version
"sparklyr" "0.4.36"