Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to standalone deploy mode, such as cluster orchestration in standalone mode or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It provides a simpler alternative to the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e. one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.


164 questions
0 votes, 1 answer

Spark Standalone on Kubernetes - application got finished after consecutive master then driver failure

Trying to achieve high availability of the Spark master using ZooKeeper, with Spark driver resiliency via metadata checkpointing into GlusterFS. Some information: using Spark 2.2.0 (prebuilt binary), submitting a streaming app with --deploy-mode cluster…
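With ZooKeeper-backed master HA, an application can list every candidate master in its URL so it re-registers after a failover, while a streaming driver recovers its state from a checkpoint directory on shared storage. A sketch under those assumptions; the host names, port, socket source, and GlusterFS mount path are all placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object HaStreamingApp {
      // Checkpoint directory on shared storage (placeholder path).
      val checkpointDir = "/mnt/glusterfs/spark-checkpoints"

      def createContext(): StreamingContext = {
        val conf = new SparkConf()
          .setAppName("ha-streaming")
          // List all masters so the app can fail over between them.
          .setMaster("spark://master1:7077,master2:7077")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint(checkpointDir)
        // Placeholder source and output so the job graph is non-empty.
        ssc.socketTextStream("localhost", 9999).print()
        ssc
      }

      def main(args: Array[String]): Unit = {
        // After a driver restart, rebuild the context from the checkpoint.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }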
0 votes, 0 answers

Spark standalone cluster master url change

I am trying to set up a Spark standalone cluster on an Azure cloud VM. The Spark 2.2 setup is done. If I start the master (start-master.sh), I am able to see the master URL in the web UI. But that Spark master URL has the host name of the VM, not its IP address. VM IP …
asked by Gnana
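If the goal is a master URL with the IP rather than the host name, the usual fix is to set SPARK_MASTER_HOST in conf/spark-env.sh before starting the master. A hedged sketch of the application side (pasteable into spark-shell), where 10.0.0.4 stands in for the VM's IP:

    import org.apache.spark.sql.SparkSession

    // Assumes the master was started after setting, in conf/spark-env.sh:
    //   SPARK_MASTER_HOST=10.0.0.4   (placeholder for the VM's IP)
    // so the advertised master URL uses the IP instead of the host name.
    val spark = SparkSession.builder()
      .appName("connect-by-ip")
      .master("spark://10.0.0.4:7077")
      .getOrCreate()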
0 votes, 1 answer

Is it possible to make Spark run a whole TaskSet on a single executor?

I run a single Spark job on a local cluster (1 master, 2 workers/executors). From what I have understood so far, all stages of a job are split into tasks and each stage has its own task set. Each task of this TaskSet will be scheduled on an executor…
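One way to get close is to cap the application's total cores at exactly one executor's worth, so the standalone master grants a single executor and every task in the task set lands in its slots. A sketch with illustrative values:

    import org.apache.spark.sql.SparkSession

    // With spark.cores.max equal to spark.executor.cores, the master can
    // grant this application exactly one executor.
    val spark = SparkSession.builder()
      .master("spark://my-master:7077")  // placeholder master URL
      .appName("single-executor")
      .config("spark.executor.cores", "2")
      .config("spark.cores.max", "2")
      .getOrCreate()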
0 votes, 0 answers

Unable to connect Spark standalone application with Kerberized Hadoop

I am using Spark standalone 1.6.x to connect to Kerberos-enabled Hadoop 2.7.x: JavaDStream status = stream.map(new Function() { public String call(String arg0) throws Exception { Configuration conf = new…
asked by User_qwerty
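Spark standalone has no built-in Kerberos support the way YARN does, so a common workaround is an explicit keytab login through the Hadoop API before touching HDFS. A sketch, with a placeholder principal and keytab path:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.security.UserGroupInformation

    // Log in from a keytab before any HDFS access (placeholder values).
    val hadoopConf = new Configuration()
    hadoopConf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(hadoopConf)
    UserGroupInformation.loginUserFromKeytab(
      "user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab")

Note the login is per JVM: code running inside executors needs the same login, which is exactly why Kerberos on standalone is awkward.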
0 votes, 1 answer

Spark - local standalone mode won't write to history server

I'm trying to enable the Spark history server in standalone mode on my Mac. I have a spark-master service running and am able to run jobs. I also have a history-server service running on localhost. I'm able to view it in my browser but there are…
asked by pac
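The history server only shows applications that wrote event logs, and the writer and reader must agree on the directory. A sketch where file:/tmp/spark-events is a placeholder that must already exist and match the history server's spark.history.fs.logDirectory:

    import org.apache.spark.sql.SparkSession

    // Write event logs where the history server is configured to look.
    val spark = SparkSession.builder()
      .master("spark://localhost:7077")
      .appName("history-demo")
      .config("spark.eventLog.enabled", "true")
      .config("spark.eventLog.dir", "file:/tmp/spark-events")
      .getOrCreate()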
0 votes, 1 answer

Spark Standalone --total-executor-cores

I'm using a Spark 2.1.1 standalone cluster. Although I have 29 free cores in the cluster (Cores in use: 80 Total, 51 Used), when submitting a new Spark job with --total-executor-cores 16 this config does not take effect and the job is submitted with only 6…
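--total-executor-cores corresponds to the spark.cores.max property, and the standalone master can only grant cores in whole executors of spark.executor.cores each, from workers that also have enough free memory, which is one way a 16-core request ends up smaller. A sketch with the flag's conf equivalent:

    import org.apache.spark.SparkConf

    // Equivalent to: spark-submit --total-executor-cores 16
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")  // placeholder
      .setAppName("cores-cap")
      .set("spark.cores.max", "16")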
0 votes, 3 answers

Spark standalone cluster tuning

We have a Spark 2.1.0 standalone cluster running on a single node with 8 cores and 50 GB of memory (a single worker). We run Spark applications in cluster mode with the following memory settings: --driver-memory = 7GB (default - 1 core is…
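On a single 8-core, 50 GB worker, the cluster-mode driver and all executors share that one machine, so the settings have to sum to less than the worker's resources. An illustrative sketch, not a recommendation; note spark.driver.memory only takes effect if set before the driver JVM starts (via spark-submit or spark-defaults.conf):

    import org.apache.spark.SparkConf

    // Budget for one 8-core / 50 GB worker hosting driver + executors.
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")   // placeholder
      .setAppName("single-node-tuning")
      .set("spark.driver.memory", "7g")      // effective only pre-launch
      .set("spark.executor.memory", "10g")   // per executor
      .set("spark.executor.cores", "2")      // cores per executor
      .set("spark.cores.max", "6")           // leave headroom for the driver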
0 votes, 0 answers

Spark Standalone cluster only two workers utilized

In a Spark standalone cluster, only 2 of the 6 worker instances get utilized; the rest are idle. I used two VMs, both having 4 cores. 2 workers were on the local VM (where the master was started) and 4 workers were on the other VM. Only the local two got…
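When only some workers get executors, the usual cause is that the application's core and memory requests are satisfied before the remaining workers are considered. Standalone spreads applications across workers by default (spark.deploy.spreadOut), so smaller executors plus a higher core cap push the master to use all of them. A sketch with illustrative values for 4-core workers:

    import org.apache.spark.SparkConf

    // Small executors that fit on every 4-core worker, and a core cap
    // high enough that the master must place executors on all of them.
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")  // placeholder
      .setAppName("use-all-workers")
      .set("spark.executor.cores", "2")
      .set("spark.executor.memory", "1g")
      .set("spark.cores.max", "24")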
0 votes, 1 answer

How to dynamically change PYTHONPATH in pyspark app

OK, so I am running a script out of pyspark that depends on a complicated project with a bunch of custom submodules. I would like the job to have several different versions of the code running against a Spark standalone…
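Keeping to the Scala sketches used here, the relevant configuration keys are language-independent: spark.submit.pyFiles ships a per-application copy of the code, and spark.executorEnv.PYTHONPATH sets the executors' module search path, so each submitted application can carry its own version. The paths below are placeholders:

    import org.apache.spark.SparkConf

    // Each application instance points at its own code version.
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")  // placeholder
      .setAppName("versioned-pyspark-app")
      .set("spark.submit.pyFiles", "/path/to/project-v2.zip")  // placeholder
      .set("spark.executorEnv.PYTHONPATH", "/opt/project/v2")  // placeholder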
0 votes, 1 answer

Executors are failing in Spark standalone deployment

I am running Spark on my local Windows machine. It works perfectly fine when I set the master to local, but when I give it a cluster master URI, it throws the following exception for each and every executor it initiates. 17/10/05 17:27:19 INFO…
asked by Rakesh
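A frequent cause of every executor dying right after launch is that the executors start on the cluster but cannot connect back to the driver on the local machine. Making the driver advertise an address the workers can reach is the usual first check; a hedged sketch, with a placeholder LAN IP:

    import org.apache.spark.sql.SparkSession

    // Executors open connections back to the driver, so advertise an
    // address that is reachable from the cluster machines.
    val spark = SparkSession.builder()
      .master("spark://cluster-master:7077")          // placeholder
      .appName("driver-reachability")
      .config("spark.driver.host", "192.168.1.23")    // this machine's LAN IP
      .config("spark.driver.bindAddress", "0.0.0.0")  // listen on all interfaces
      .getOrCreate()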
0 votes, 1 answer

Spark Standalone cluster, memory per executor issue

Hi, I launch my Spark application with the spark-submit script as such: spark-submit --master spark://Maatari-xxxxxxx.local:7077 --class EstimatorApp /Users/sul.maatari/IdeaProjects/Workshit/target/scala-2.11/Workshit-assembly-1.0.jar …
asked by MaatDeamon
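Per-executor memory in standalone mode comes from spark.executor.memory (spark-submit's --executor-memory), and a worker can only host an executor if it still has at least that much unallocated memory. A sketch reusing the question's master URL, with an illustrative value:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setMaster("spark://Maatari-xxxxxxx.local:7077")
      .setAppName("EstimatorApp")
      .set("spark.executor.memory", "4g")  // illustrative value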
0 votes, 0 answers

Block missing exception while processing the data from hdfs in spark standalone cluster

I was running Spark on Hadoop with 2 workers and 2 datanodes. The first machine contains: Spark master, namenode, worker-1, datanode-1. The second machine contains: worker-2, datanode-2. In the Hadoop cluster there are 2 files under the /usr directory…
0 votes, 1 answer

Why does start-slave.sh require master URL?

I'm wondering why the client, using apache-spark/sbin/start-slave.sh, has to indicate the master's URL, since the master already indicates it in apache-spark/sbin/start-master.sh --master spark://my-master:7077, for example. Is it because…
asked by JarsOfJam-Scheduler
0 votes, 1 answer

Why does standalone master schedule drivers on a worker?

The schedule() method in Master.scala shows that its first scheduling task is placing drivers on workers. Since the Master runs only in standalone mode, drivers run on the client, outside the Spark cluster. Why does the master need to schedule a worker to run the driver?
asked by CCong
0 votes, 1 answer

How to use a Mesos master URL in a self-contained Scala Spark program

I am creating a self-contained Scala program that uses Spark for parallelization in some parts. In my specific situation, the Spark cluster is available through Mesos. I create the Spark context like this: val conf = new…
asked by lambdapilgrim
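The truncated snippet presumably continues along these lines; against Mesos the only change from the standalone case is the scheme of the master URL. A sketch where host and port are placeholders (5050 being the customary Mesos master port):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("self-contained-app")
      .setMaster("mesos://mesos-master:5050")  // placeholder host
    val sc = new SparkContext(conf)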