
I am working with SparkR.

I am able to start a Spark context on YARN with the desired number of executors and executor cores with a command like this:

spark/bin/sparkR --master yarn-client --num-executors 5 --executor-cores 5

Now I am trying to initialize a new Spark context, but from RStudio, which is more comfortable to work with than the regular command line.

I figured out that to do this I'll need to use the sparkR.init() function. There is a master option, which I set to yarn-client, but how do I specify num-executors or executor-cores? This is where I'm stuck.

library(SparkR, lib.loc = "spark-1.5.0-bin-hadoop2.4/R/lib")

sc <- sparkR.init(sparkHome = "spark-1.5.0-bin-hadoop2.4/",
                  master = "yarn-client")

1 Answer


Providing the sparkEnvir argument to sparkR.init should work. Note that on YARN the number of executors is controlled by the property spark.executor.instances (spark.num.executors is not a recognized Spark property), while spark.executor.cores sets the cores per executor:

sparkEnvir <- list(spark.executor.instances = '5',  # number of executors on YARN
                   spark.executor.cores = '5')      # cores per executor

sc <- sparkR.init(
    sparkHome = "spark-1.5.0-bin-hadoop2.4/",
    master = "yarn-client",
    sparkEnvir = sparkEnvir)
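
If the sparkEnvir settings are not picked up, another approach worth trying (a sketch, assuming SparkR 1.4+, where sparkR.init reads the SPARKR_SUBMIT_ARGS environment variable) is to pass the same flags you would use on the command line before initializing the context. The trailing "sparkr-shell" token must stay last, since SparkR appends it by default:

library(SparkR, lib.loc = "spark-1.5.0-bin-hadoop2.4/R/lib")

# Same flags as spark/bin/sparkR; "sparkr-shell" must remain the final argument
Sys.setenv("SPARKR_SUBMIT_ARGS" = "--num-executors 5 --executor-cores 5 sparkr-shell")

sc <- sparkR.init(sparkHome = "spark-1.5.0-bin-hadoop2.4/",
                  master = "yarn-client")

This mirrors the original sparkR invocation exactly, so any other spark-submit flag can be passed the same way.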