Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to the standalone deploy mode. Questions might cover cluster orchestration in standalone mode, or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It provides a simpler option than the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e., one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.


164 questions
1 vote · 0 answers

RDD partitioning problem while running ALS program on Spark standalone cluster

I am running my ALS program on a two-node Spark cluster in PySpark. It works fine for 20 iterations if I disable checkpointInterval in the ALS params. For more than 20 iterations it requires checkpointInterval to be enabled. I have also given a…
Neha patel
1 vote · 2 answers

How are multiple executors managed on the worker nodes with a Spark standalone cluster?

Until now, I have only used Spark on a Hadoop cluster with YARN as the resource manager. In that type of cluster, I know exactly how many executors to run and how the resource management works. However, now that I am trying to use a Standalone…
1 vote · 1 answer

spark-submit - An existing connection was forcibly closed by the remote host [on master node]

I have set up a Spark cluster locally on my Windows 7 machine. It has a master and a worker node. I have created a simple jar using sbt compile + sbt package and am trying to submit it to the Spark master node using spark-submit. Currently both the…
ankur
1 vote · 3 answers

Get the exit status for failed Spark jobs when submitted through spark-submit

I am submitting Spark jobs using spark-submit in standalone mode. All these jobs are triggered by cron, and I want to monitor them for failures. But with spark-submit, if any exception occurs in the application (e.g. ConnectionException) the…
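
A minimal cron-wrapper sketch of the kind of monitoring being asked about, assuming the job is submitted in client deploy mode (where an uncaught application exception makes spark-submit exit non-zero); the paths, class name, and alerting command are placeholders.

    #!/bin/sh
    # Hypothetical nightly job; adjust master URL, class, and jar path.
    /opt/spark/bin/spark-submit \
      --master spark://master-host:7077 \
      --deploy-mode client \
      --class com.example.NightlyJob \
      /opt/jobs/nightly-job.jar
    status=$?
    if [ "$status" -ne 0 ]; then
      # Placeholder alerting step; replace with whatever monitoring hook is in use.
      echo "Spark job failed with exit code $status" | mail -s "Spark job failure" ops@example.com
    fi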
1 vote · 0 answers

Spark 2.3 submit not working for a one-master, one-slave configuration on Windows 10 64-bit

I followed these steps for Hadoop HDFS and Spark: 1) Environment variables: HADOOP_CONF_DIR - F:\spark\Hadoop2\hadoop-2.7.6\etc\hadoop, HADOOP_HOME - F:\spark\Hadoop2\hadoop-2.7.6, JAVA_HOME - F:\Java\jdk1.8.0_121, SPARK_HOME -…
1 vote · 1 answer

Spark working faster in Standalone than in YARN

I wanted some insights on Spark execution on standalone vs. YARN. We have a 4-node Cloudera cluster, and currently the performance of our application in YARN mode is less than half of what we get when executing in standalone…
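
One thing worth checking in such a comparison (a sketch with placeholder numbers, not a claim about this particular cluster): standalone mode by default lets an application take every available core, while YARN defaults to a small number of executors, so the two submits should pin equivalent resources explicitly.

    # YARN: request resources explicitly.
    ./bin/spark-submit --master yarn \
      --num-executors 8 --executor-cores 4 --executor-memory 8g \
      my_app.py

    # Standalone: there is no --num-executors; cap the application with spark.cores.max instead.
    ./bin/spark-submit --master spark://master-host:7077 \
      --executor-cores 4 --executor-memory 8g \
      --conf spark.cores.max=32 \
      my_app.py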
1 vote · 0 answers

Spark Initial Job Not Accepting Resources on Amazon EC2 Standalone Cluster

I have deployed a standalone cluster to Amazon EC2 using Terraform. It uses passwordless SSH to communicate with the workers. I start the master with the start-master script, setting the public IP of the cluster to the public DNS of the EC2…
1 vote · 1 answer

Spark Master IP configuration in Azure VM

I am setting up a standalone Spark cluster on an Azure VM. I want to run the Spark master with the VM's public IP rather than its hostname, so that I can access it from another VM. Spark version: spark-2.2.0-bin-hadoop2.7. I have created a new file "spark-env.sh"…
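
For context, a sketch of the conf/spark-env.sh settings usually involved here, assuming Spark 2.x (where SPARK_MASTER_HOST replaced the older SPARK_MASTER_IP); the addresses are placeholders. Note that a cloud VM's public IP is typically NATed and not bound to any local interface, so the master generally has to bind to the private address and advertise the public one.

    # conf/spark-env.sh (sketch; addresses are placeholders)
    SPARK_MASTER_HOST=10.0.0.4       # private address the master actually binds to
    SPARK_PUBLIC_DNS=52.170.0.10     # public address advertised to other machines and in UI links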
1 vote · 1 answer

How the SPARK_WORKER_CORES setting impacts concurrency in Spark Standalone

I am using a Spark 2.2.0 cluster configured in standalone mode. The cluster has 2 octa-core machines, is used exclusively for Spark jobs, and no other process uses them. I have around 8 Spark Streaming apps which run on this cluster. I explicitly…
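
A sketch of the knobs that interact here (the numbers are placeholders, not a recommendation): SPARK_WORKER_CORES bounds what a worker offers, while per-application settings such as spark.cores.max and spark.executor.cores decide how much of that a single streaming app may hold.

    # conf/spark-env.sh on each worker: cores this worker offers to applications.
    SPARK_WORKER_CORES=8

    # Per application, cap the share it may take, e.g. at submit time:
    ./bin/spark-submit \
      --master spark://master-host:7077 \
      --conf spark.cores.max=2 \
      --conf spark.executor.cores=1 \
      my_streaming_app.py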
1 vote · 0 answers

Spark Standalone Mode, application runs, but executor is killed with exitStatus 1

I am new to Apache Spark and was trying to run the example Pi calculation application on my local Spark setup (using a standalone cluster). The Master, Slave, and Driver are all running on my local machine. What I am noticing is that the Pi is…
Chandu
1 vote · 1 answer

Spark standalone: connecting the driver to a worker

I'm trying to host a Spark standalone cluster locally. I have two heterogeneous machines connected on a LAN. Each piece of the architecture listed below is running on Docker. I have the following configuration: master on machine 1 (port 7077…
Matthias Beaupère
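
A sketch of the driver-side settings that usually matter in this kind of setup (addresses and ports are placeholders): executors connect back to the driver, so when the driver runs in a container it generally has to bind to a local address while advertising one the workers can reach, on fixed, published ports.

    ./bin/spark-submit \
      --master spark://machine1:7077 \
      --conf spark.driver.host=192.168.1.10 \
      --conf spark.driver.bindAddress=0.0.0.0 \
      --conf spark.driver.port=5001 \
      --conf spark.blockManager.port=5002 \
      my_app.py
    # The two fixed ports above must be published/exposed from the driver's container
    # so that executors on the other machine can connect back.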
1 vote · 0 answers

Spark standalone cluster port related issue

I am deploying a Spark app on a standalone cluster. I have one master and 2 slaves, and I am testing the cluster. The application .jar is copied to the same location everywhere. I have observed the following issue: on the master, bin/spark-submit --class *****…
1 vote · 0 answers

Unstable executor reconnecting again and again in a Spark standalone cluster?

I am getting the stack trace below; the executor is lost and a new executor connection is created. INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(etlspark); groups with view permissions:…
xyz_scala
1 vote · 1 answer

Can I distribute work with Apache Spark Standalone version?

I hear people talking about an "Apache Standalone Cluster", which confuses me because I understand a "cluster" as various machines connected by a potentially fast network and working in parallel, and "standalone" as a machine or program that is…
1 vote · 2 answers

spark-submit on a standalone cluster complains that scala-2.10 jars do not exist

I'm new to Spark and downloaded pre-compiled Spark binaries from Apache (Spark-2.1.0-bin-hadoop2.7). When submitting my Scala (2.11.8) uber jar, the cluster throws an error: java.lang.IllegalStateException: Library directory…
Y. Eliash