17

I want to start an instance of a standalone Apache Spark cluster embedded in my Java app. I tried to find some documentation on their website, but no luck so far.

Is this possible?

Rodrigo

3 Answers

19

You can create a SparkContext in local mode; you just need to provide "local" as the Spark master URL to SparkConf:

import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf().
  setMaster("local[2]").      // run locally inside this JVM with 2 worker threads
  setAppName("MySparkApp")

val sc = new SparkContext(sparkConf)
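
Since the question is about a Java app, here is a rough Java equivalent of the above (a sketch of my own, not part of the original answer), using JavaSparkContext from Spark's Java API:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

// Same idea from Java: "local[2]" runs Spark inside this JVM with 2 worker threads
SparkConf sparkConf = new SparkConf()
    .setMaster("local[2]")
    .setAppName("MySparkApp");

JavaSparkContext jsc = new JavaSparkContext(sparkConf);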
Eugene Zhulenev
9

Yes -- you can use Spark in an embedded way with a "local" master.

import org.apache.spark.SparkConf;
import org.apache.spark.SparkContext;

SparkConf sparkConf = new SparkConf();          // create a new Spark config
sparkConf.setMaster("local[8]");                // local mode, using 8 cores (you can vary the number)
sparkConf.setAppName("MyApp");
SparkContext sc = new SparkContext(sparkConf);

This will run Spark within your JVM.
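
For completeness, a small smoke test against that embedded context, plus a clean shutdown, could look like the sketch below (my own addition; it wraps the SparkContext in a JavaSparkContext so Java collections can be used directly):

import java.util.Arrays;
import org.apache.spark.api.java.JavaSparkContext;

// Wrap the existing SparkContext to get the Java-friendly API
JavaSparkContext jsc = new JavaSparkContext(sc);

// Run a trivial job to verify the embedded Spark instance works
long count = jsc.parallelize(Arrays.asList(1, 2, 3, 4)).count();
System.out.println("count = " + count);

// Stop Spark when your application shuts down
sc.stop();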

Hasson
  • Can Spark run in a multithreaded setting? I want to embed Spark in JBoss, but it seems that Spark can run only once in a JVM. – ps0604 May 18 '19 at 16:14
4

Others have already answered this question, but as of 2020, with Apache Spark version 3.0:

Java example:

SparkSession spark = SparkSession.builder()
    .appName("Your app name")
    .master("local[*]")
    .getOrCreate();

master("local[*]") means run in a standalone mode with all available CPU cores.

Maven dependencies:

    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>3.0.1</version>
      <scope>provided</scope>
    </dependency>
    
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>3.0.1</version>
    </dependency>
Sharhabeel Hamdan