Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to standalone deploy mode, such as cluster orchestration in standalone mode or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It provides a simpler alternative to the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e. one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.


164 questions
0 votes, 1 answer

Spark Standalone on Kubernetes - application got finished after consecutive master then driver failure

Trying to achieve high availability of the Spark master using ZooKeeper, with Spark driver resiliency via metadata checkpointing into GlusterFS. Some information: using Spark 2.2.0 (prebuilt binary), submitting a streaming app with --deploy-mode cluster…
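With ZooKeeper-backed master HA, an application can list every candidate master in its URL so it re-registers after a failover, while a streaming driver recovers its state from a checkpoint directory on shared storage. A sketch under those assumptions; the host names, port, socket source, and GlusterFS mount path are all placeholders:

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object HaStreamingApp {
      // Checkpoint directory on shared storage (placeholder path).
      val checkpointDir = "/mnt/glusterfs/spark-checkpoints"

      def createContext(): StreamingContext = {
        val conf = new SparkConf()
          .setAppName("ha-streaming")
          // List all masters so the app can fail over between them.
          .setMaster("spark://master1:7077,master2:7077")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint(checkpointDir)
        // Placeholder source and output so the job graph is non-empty.
        ssc.socketTextStream("localhost", 9999).print()
        ssc
      }

      def main(args: Array[String]): Unit = {
        // After a driver restart, rebuild the context from the checkpoint.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }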
0 votes, 0 answers

Spark standalone cluster master url change

I am trying to set up a Spark standalone cluster on an Azure cloud VM. The Spark 2.2 setup is done. If I start the master (start-master.sh), I am able to see the master URL in the web UI. But that Spark master URL has the host name of the VM, not its IP address. VM IP …
asked by Gnana
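If the goal is a master URL with the IP rather than the host name, the usual fix is to set SPARK_MASTER_HOST in conf/spark-env.sh before starting the master. A hedged sketch of the application side (pasteable into spark-shell), where 10.0.0.4 stands in for the VM's IP:

    import org.apache.spark.sql.SparkSession

    // Assumes the master was started after setting, in conf/spark-env.sh:
    //   SPARK_MASTER_HOST=10.0.0.4   (placeholder for the VM's IP)
    // so the advertised master URL uses the IP instead of the host name.
    val spark = SparkSession.builder()
      .appName("connect-by-ip")
      .master("spark://10.0.0.4:7077")
      .getOrCreate()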
0 votes, 1 answer

Is it possible to make Spark run a whole TaskSet on a single executor?

I run a single Spark job on a local cluster (1 master, 2 workers/executors). From what I have understood so far, all stages of a job are split into tasks and each stage has its own task set. Each task of this TaskSet will be scheduled on an executor…
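One way to get close is to cap the application's total cores at exactly one executor's worth, so the standalone master grants a single executor and every task in the task set lands in its slots. A sketch with illustrative values:

    import org.apache.spark.sql.SparkSession

    // With spark.cores.max equal to spark.executor.cores, the master can
    // grant this application exactly one executor.
    val spark = SparkSession.builder()
      .master("spark://my-master:7077")  // placeholder master URL
      .appName("single-executor")
      .config("spark.executor.cores", "2")
      .config("spark.cores.max", "2")
      .getOrCreate()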
0 votes, 0 answers

Unable to connect Spark standalone application with Kerberized Hadoop

I am using Spark standalone 1.6.x to connect to Kerberos-enabled Hadoop 2.7.x: JavaDStream status = stream.map(new Function() { public String call(String arg0) throws Exception { Configuration conf = new…
asked by User_qwerty
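Spark standalone has no built-in Kerberos support the way YARN does, so a common workaround is an explicit keytab login through the Hadoop API before touching HDFS. A sketch, with a placeholder principal and keytab path:

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.security.UserGroupInformation

    // Log in from a keytab before any HDFS access (placeholder values).
    val hadoopConf = new Configuration()
    hadoopConf.set("hadoop.security.authentication", "kerberos")
    UserGroupInformation.setConfiguration(hadoopConf)
    UserGroupInformation.loginUserFromKeytab(
      "user@EXAMPLE.COM", "/etc/security/keytabs/user.keytab")

Note the login is per JVM: code running inside executors needs the same login, which is exactly why Kerberos on standalone is awkward.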
0 votes, 1 answer

Spark - local standalone mode won't write to history server

I'm trying to enable the Spark history server in standalone mode on my Mac. I have a spark-master service running and am able to run jobs. I also have a history-server service running on localhost. I'm able to view it in my browser but there are…
asked by pac
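The history server only shows applications that wrote event logs, and the writer and reader must agree on the directory. A sketch where file:/tmp/spark-events is a placeholder that must already exist and match the history server's spark.history.fs.logDirectory:

    import org.apache.spark.sql.SparkSession

    // Write event logs where the history server is configured to look.
    val spark = SparkSession.builder()
      .master("spark://localhost:7077")
      .appName("history-demo")
      .config("spark.eventLog.enabled", "true")
      .config("spark.eventLog.dir", "file:/tmp/spark-events")
      .getOrCreate()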
0 votes, 1 answer

Spark Standalone --total-executor-cores

I'm using a Spark 2.1.1 standalone cluster. Although I have 29 free cores in the cluster (Cores in use: 80 Total, 51 Used), when submitting a new Spark job with --total-executor-cores 16 this config does not take effect and the job is submitted with only 6…
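--total-executor-cores corresponds to the spark.cores.max property, and the standalone master can only grant cores in whole executors of spark.executor.cores each, from workers that also have enough free memory, which is one way a 16-core request ends up smaller. A sketch with the flag's conf equivalent:

    import org.apache.spark.SparkConf

    // Equivalent to: spark-submit --total-executor-cores 16
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")  // placeholder
      .setAppName("cores-cap")
      .set("spark.cores.max", "16")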
0 votes, 3 answers

Spark standalone cluster tuning

We have a Spark 2.1.0 standalone cluster running on a single node with 8 cores and 50 GB of memory (a single worker). We run Spark applications in cluster mode with the following memory settings: --driver-memory = 7GB (default - 1 core is…
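On a single 8-core, 50 GB worker, the cluster-mode driver and all executors share that one machine, so the settings have to sum to less than the worker's resources. An illustrative sketch, not a recommendation; note spark.driver.memory only takes effect if set before the driver JVM starts (via spark-submit or spark-defaults.conf):

    import org.apache.spark.SparkConf

    // Budget for one 8-core / 50 GB worker hosting driver + executors.
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")   // placeholder
      .setAppName("single-node-tuning")
      .set("spark.driver.memory", "7g")      // effective only pre-launch
      .set("spark.executor.memory", "10g")   // per executor
      .set("spark.executor.cores", "2")      // cores per executor
      .set("spark.cores.max", "6")           // leave headroom for the driver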
0 votes, 0 answers

Spark Standalone cluster only two workers utilized

In a Spark standalone cluster, only 2 of the 6 worker instances get utilized; the rest are idle. I used two VMs, both having 4 cores. 2 workers were on the local VM (where the master was started) and 4 workers were on the other VM. Only the local two got…
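When only some workers get executors, the usual cause is that the application's core and memory requests are satisfied before the remaining workers are considered. Standalone spreads applications across workers by default (spark.deploy.spreadOut), so smaller executors plus a higher core cap push the master to use all of them. A sketch with illustrative values for 4-core workers:

    import org.apache.spark.SparkConf

    // Small executors that fit on every 4-core worker, and a core cap
    // high enough that the master must place executors on all of them.
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")  // placeholder
      .setAppName("use-all-workers")
      .set("spark.executor.cores", "2")
      .set("spark.executor.memory", "1g")
      .set("spark.cores.max", "24")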
0 votes, 1 answer

How to dynamically change PYTHONPATH in pyspark app

OK, so I am running a script out of pyspark that depends on a complicated project with a bunch of custom submodules. I would like the job to have several different versions of the code running against a Spark standalone…
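Keeping to the Scala sketches used here, the relevant configuration keys are language-independent: spark.submit.pyFiles ships a per-application copy of the code, and spark.executorEnv.PYTHONPATH sets the executors' module search path, so each submitted application can carry its own version. The paths below are placeholders:

    import org.apache.spark.SparkConf

    // Each application instance points at its own code version.
    val conf = new SparkConf()
      .setMaster("spark://my-master:7077")  // placeholder
      .setAppName("versioned-pyspark-app")
      .set("spark.submit.pyFiles", "/path/to/project-v2.zip")  // placeholder
      .set("spark.executorEnv.PYTHONPATH", "/opt/project/v2")  // placeholder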
0 votes, 1 answer

Executors are failing in Spark standalone deployment

I am running Spark on my local Windows machine. It works perfectly fine when I set the master to local, but when I give it a cluster master URI, it throws the following exception for each and every executor it initiates. 17/10/05 17:27:19 INFO…
asked by Rakesh
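A frequent cause of every executor dying right after launch is that the executors start on the cluster but cannot connect back to the driver on the local machine. Making the driver advertise an address the workers can reach is the usual first check; a hedged sketch, with a placeholder LAN IP:

    import org.apache.spark.sql.SparkSession

    // Executors open connections back to the driver, so advertise an
    // address that is reachable from the cluster machines.
    val spark = SparkSession.builder()
      .master("spark://cluster-master:7077")          // placeholder
      .appName("driver-reachability")
      .config("spark.driver.host", "192.168.1.23")    // this machine's LAN IP
      .config("spark.driver.bindAddress", "0.0.0.0")  // listen on all interfaces
      .getOrCreate()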
0 votes, 1 answer

Spark Standalone cluster, memory per executor issue

Hi, I launch my Spark application with the spark-submit script as such: spark-submit --master spark://Maatari-xxxxxxx.local:7077 --class EstimatorApp /Users/sul.maatari/IdeaProjects/Workshit/target/scala-2.11/Workshit-assembly-1.0.jar …
asked by MaatDeamon
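Per-executor memory in standalone mode comes from spark.executor.memory (spark-submit's --executor-memory), and a worker can only host an executor if it still has at least that much unallocated memory. A sketch reusing the question's master URL, with an illustrative value:

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setMaster("spark://Maatari-xxxxxxx.local:7077")
      .setAppName("EstimatorApp")
      .set("spark.executor.memory", "4g")  // illustrative value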
0 votes, 0 answers

Block missing exception while processing the data from hdfs in spark standalone cluster

I was running Spark on Hadoop with 2 workers and 2 datanodes. The first machine contains: Spark master, namenode, worker-1, datanode-1. The second machine contains: worker-2, datanode-2. In the Hadoop cluster there are 2 files under the /usr directory…
0 votes, 1 answer

Why does start-slave.sh require master URL?

I'm wondering why the client, using apache-spark/sbin/start-slave.sh, has to indicate the master's URL, since the master already indicates it in apache-spark/sbin/start-master.sh --master spark://my-master:7077, for example. Is it because…
asked by JarsOfJam-Scheduler
0 votes, 1 answer

Why does standalone master schedule drivers on a worker?

The schedule() method in Master.scala shows that its first scheduling task is placing drivers on workers. Since the Master runs only in standalone mode, drivers run on the client, outside the Spark cluster. Why does the master need to schedule a worker to run the driver?
asked by CCong
0 votes, 1 answer

How to use a Mesos master URL in a self-contained Scala Spark program

I am creating a self-contained Scala program that uses Spark for parallelization in some parts. In my specific situation, the Spark cluster is available through Mesos. I create the Spark context like this: val conf = new…
asked by lambdapilgrim
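The truncated snippet presumably continues along these lines; against Mesos the only change from the standalone case is the scheme of the master URL. A sketch where host and port are placeholders (5050 being the customary Mesos master port):

    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("self-contained-app")
      .setMaster("mesos://mesos-master:5050")  // placeholder host
    val sc = new SparkContext(conf)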