Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to the standalone deploy mode. Questions might cover cluster orchestration in standalone mode, or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It provides a simpler option than the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e., one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.


164 questions
1 vote · 0 answers

RDD partitioning problem while running ALS program on Spark standalone cluster

I am running my ALS program on a two-node Spark cluster in PySpark. It works fine for 20 iterations if I disable checkpointInterval in the ALS params. For more than 20 iterations it requires checkpointInterval to be enabled. I have also given a…
Neha patel
1 vote · 2 answers

How are multiple executors managed on the worker nodes with a Spark standalone cluster?

Until now, I have only used Spark on a Hadoop cluster with YARN as the resource manager. In that type of cluster, I know exactly how many executors to run and how the resource management works. However, now that I am trying to use a Standalone…
1 vote · 1 answer

spark-submit - An existing connection was forcibly closed by the remote host [on master node]

I have set up a Spark cluster locally on my Windows 7 machine. It has a master and a worker node. I have created a simple jar using sbt compile + sbt package and am trying to submit it to the Spark master node using spark-submit. Currently both the…
ankur
1 vote · 3 answers

Get the exit status for failed Spark jobs when submitted through spark-submit

I am submitting Spark jobs using spark-submit in standalone mode. All these jobs are triggered by cron, and I want to monitor them for failures. But with spark-submit, if any exception occurs in the application (e.g. ConnectionException) the…
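
A minimal cron-wrapper sketch of the kind of monitoring being asked about, assuming the job is submitted in client deploy mode (where an uncaught application exception makes spark-submit exit non-zero); the paths, class name, and alerting command are placeholders.

    #!/bin/sh
    # Hypothetical nightly job; adjust master URL, class, and jar path.
    /opt/spark/bin/spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode client \
      --class com.example.NightlyJob \
      /opt/jobs/nightly-job.jar
    status=$?
    if [ "$status" -ne 0 ]; then
      # Placeholder alerting step; replace with whatever monitoring hook is in use.
      echo "Spark job failed with exit code $status" | mail -s "Spark job failure" ops@example.com
    fi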
1 vote · 0 answers

Spark 2.3 submit not working for a one-master, one-slave configuration on Windows 10 64-bit

I followed these steps for Hadoop HDFS and Spark: 1) Environment variables: HADOOP_CONF_DIR - F:\spark\Hadoop2\hadoop-2.7.6\etc\hadoop, HADOOP_HOME - F:\spark\Hadoop2\hadoop-2.7.6, JAVA_HOME - F:\Java\jdk1.8.0_121, SPARK_HOME -…
1 vote · 1 answer

Spark working faster in Standalone than in YARN

I wanted some insights on Spark execution on standalone vs. YARN. We have a 4-node Cloudera cluster, and currently the performance of our application in YARN mode is less than half of what we get when executing in standalone…
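
One thing worth checking in such a comparison (a sketch with placeholder numbers, not a claim about this particular cluster): standalone mode by default lets an application take every available core, while YARN defaults to a small number of executors, so the two submits should pin equivalent resources explicitly.

    # YARN: request resources explicitly.
    ./bin/spark-submit --master yarn \
      --num-executors 8 --executor-cores 4 --executor-memory 8g \
      my_app.py

    # Standalone: there is no --num-executors; cap the application with spark.cores.max instead.
    ./bin/spark-submit --master spark://master-host:7077 \
      --executor-cores 4 --executor-memory 8g \
      --conf spark.cores.max=32 \
      my_app.py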
1 vote · 0 answers

Spark Initial Job Not Accepting Resources on Amazon EC2 Standalone Cluster

I have deployed a standalone cluster to Amazon EC2 using Terraform. It uses passwordless SSH to communicate with the workers. I start the master with the start-master script, setting the public IP of the cluster to the public DNS of the EC2…
1 vote · 1 answer

Spark Master IP configuration in Azure VM

I am setting up a standalone Spark cluster on an Azure VM. I want to run the Spark master with the VM's public IP rather than its hostname, so that I can access it from another VM. Spark version: spark-2.2.0-bin-hadoop2.7. I have created a new file "spark-env.sh"…
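
For context, a sketch of the conf/spark-env.sh settings usually involved here, assuming Spark 2.x (where SPARK_MASTER_HOST replaced the older SPARK_MASTER_IP); the addresses are placeholders. Note that a cloud VM's public IP is typically NATed and not bound to any local interface, so the master generally has to bind to the private address and advertise the public one.

    # conf/spark-env.sh (sketch; addresses are placeholders)
    SPARK_MASTER_HOST=10.0.0.4       # private address the master actually binds to
    SPARK_PUBLIC_DNS=52.170.0.10     # public address advertised to other machines and in UI links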
1 vote · 1 answer

How the SPARK_WORKER_CORES setting impacts concurrency in Spark Standalone

I am using a Spark 2.2.0 cluster configured in standalone mode. The cluster has 2 octa-core machines, is used exclusively for Spark jobs, and no other process uses them. I have around 8 Spark Streaming apps which run on this cluster. I explicitly…
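
A sketch of the knobs that interact here (the numbers are placeholders, not a recommendation): SPARK_WORKER_CORES bounds what a worker offers, while per-application settings such as spark.cores.max and spark.executor.cores decide how much of that a single streaming app may hold.

    # conf/spark-env.sh on each worker: cores this worker offers to applications.
    SPARK_WORKER_CORES=8

    # Per application, cap the share it may take, e.g. at submit time:
    ./bin/spark-submit \
      --master spark://master-host:7077 \
      --conf spark.cores.max=2 \
      --conf spark.executor.cores=1 \
      my_streaming_app.py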
1 vote · 0 answers

Spark Standalone Mode, application runs, but executor is killed with exitStatus 1

I am new to Apache Spark and was trying to run the example Pi calculation application on my local Spark setup (using a standalone cluster). The Master, Slave, and Driver are all running on my local machine. What I am noticing is that the Pi is…
Chandu
1 vote · 1 answer

Spark standalone: connecting the driver to a worker

I'm trying to host a Spark standalone cluster locally. I have two heterogeneous machines connected on a LAN. Each piece of the architecture listed below is running on Docker. I have the following configuration: master on machine 1 (port 7077…
Matthias Beaupère
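
A sketch of the driver-side settings that usually matter in this kind of setup (addresses and ports are placeholders): executors connect back to the driver, so when the driver runs in a container it generally has to bind to a local address while advertising one the workers can reach, on fixed, published ports.

    ./bin/spark-submit \
      --master spark://machine1:7077 \
      --conf spark.driver.host=192.168.1.10 \
      --conf spark.driver.bindAddress=0.0.0.0 \
      --conf spark.driver.port=5001 \
      --conf spark.blockManager.port=5002 \
      my_app.py
    # The two fixed ports above must be published/exposed from the driver's container
    # so that executors on the other machine can connect back.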
1 vote · 0 answers

Spark standalone cluster port related issue

I am deploying a Spark app on a standalone cluster. I have one master and 2 slaves, and I am testing the cluster. The application .jar is copied to the same location everywhere. I have observed the following issue: on the master, bin/spark-submit --class *****…
1 vote · 0 answers

Unstable executor reconnecting again and again in a Spark standalone cluster?

I am getting the stack trace below; the executor is lost and a new executor connection is created. INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(etlspark); groups with view permissions:…
xyz_scala
1 vote · 1 answer

Can I distribute work with Apache Spark Standalone version?

I hear people talking about an "Apache Standalone Cluster", which confuses me because I understand a "cluster" as various machines connected by a potentially fast network and working in parallel, and "standalone" as a machine or program that is…
1 vote · 2 answers

spark-submit on a standalone cluster complains that scala-2.10 jars do not exist

I'm new to Spark and downloaded pre-compiled Spark binaries from Apache (Spark-2.1.0-bin-hadoop2.7). When submitting my Scala (2.11.8) uber jar, the cluster throws an error: java.lang.IllegalStateException: Library directory…
Y. Eliash