I have a Spark cluster running on remote CentOS nodes, and I want to connect to it from RStudio Desktop on my local Windows machine.

if (nchar(Sys.getenv("SPARK_HOME")) < 1) {
  Sys.setenv(SPARK_HOME = "/home/remoteclusterpath/spark-1.6.0-bin-hadoop2.6")
  .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
}
library(SparkR, lib.loc = "/home/remoteclusterpath/spark-1.6.0-bin-hadoop2.6/R/lib")

sc <- sparkR.init(master = "spark://<IP-Address>:7077", sparkEnvir = list(spark.driver.memory="2g"))

I get the following error:

Error in library(SparkR) : there is no package called ‘SparkR’

Can anyone suggest a solution? Thanks in advance.

2 Answers

To use SparkR in RStudio, you need to install the SparkR package and then load it. Use these commands:

install.packages("SparkR")

library(SparkR)
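
If `install.packages("SparkR")` finds no package for your R version, note that SparkR also ships inside the Spark distribution under `R/lib`, which is what the code in the question relies on. The path in the question is a path on the remote CentOS node, so it will not exist on the Windows machine. A minimal sketch, assuming a copy of the same Spark distribution has been unpacked locally (the `C:/spark/...` path and the variable names are hypothetical):

```r
# Hypothetical local path: a Spark distribution unpacked on the Windows machine.
local_spark_home <- "C:/spark/spark-1.6.0-bin-hadoop2.6"

# SparkR is bundled with Spark under <SPARK_HOME>/R/lib.
sparkr_lib <- file.path(local_spark_home, "R", "lib")

if (dir.exists(file.path(sparkr_lib, "SparkR"))) {
  # The package directory exists locally, so load it from there.
  Sys.setenv(SPARK_HOME = local_spark_home)
  library(SparkR, lib.loc = sparkr_lib)
  sparkr_loaded <- TRUE
} else {
  # Nothing to load: the path is wrong or Spark is not unpacked locally.
  message("SparkR not found under ", sparkr_lib)
  sparkr_loaded <- FALSE
}
```

If `sparkr_loaded` is `TRUE`, the `sparkR.init(...)` call from the question can then be tried against the remote master.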

To use Spark from RStudio, you can instead install sparklyr by running this command in the RStudio console:

install.packages("sparklyr")

Later, you can load this package with this command:

library("sparklyr")
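
Once sparklyr is loaded, connecting to the remote master might look like the sketch below. It assumes a Spark distribution is also available on the local machine, since `spark_connect()` takes a local `spark_home`. The helper name `connect_remote_spark` and the example local path are hypothetical, and the master URL is the placeholder from the question.

```r
# Sketch: connect to a standalone Spark master with sparklyr.
# connect_remote_spark is a hypothetical helper, not part of sparklyr.
connect_remote_spark <- function(master, spark_home = Sys.getenv("SPARK_HOME")) {
  if (!requireNamespace("sparklyr", quietly = TRUE)) {
    stop("sparklyr is not installed; run install.packages(\"sparklyr\") first")
  }
  # spark_home must be a directory on this machine, not on the cluster.
  if (!nzchar(spark_home) || !dir.exists(spark_home)) {
    stop("spark_home must point to a local Spark directory: ", spark_home)
  }
  sparklyr::spark_connect(master = master, spark_home = spark_home)
}

# Example call (not run here; the local path is an assumption):
# sc <- connect_remote_spark("spark://<IP-Address>:7077",
#                            "C:/spark/spark-1.6.0-bin-hadoop2.6")
# sparklyr::spark_disconnect(sc)
```

The early checks are there so that a missing package or a wrong path fails with a readable message instead of a connection error.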
Eduardo