Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to the standalone deploy mode. Questions might cover cluster orchestration in standalone mode, or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It provides a simpler setup than the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e. one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.

164 questions
6 votes, 3 answers

SparkUI not showing Tab (Jobs, Stages, Storage, Environment,...) when run in standalone mode

I'm running the Spark master with the following command: ./sbin/start-master.sh. After that I went to http://localhost:8080 and saw the following page. I was expecting to see tabs with Jobs, Environment, ... like the following. Could someone…
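
For what it's worth, port 8080 serves the standalone master UI (workers and running applications), while the Jobs/Stages/Storage/Environment tabs belong to the application UI that a live SparkContext serves on port 4040 by default. A small sketch, with a placeholder master URL, that keeps the context alive long enough to browse that UI:

    import org.apache.spark.sql.SparkSession

    object UiCheck {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("ui-check")
          .master("spark://master-host:7077") // placeholder master URL
          .getOrCreate()

        // Run something so the Jobs and Stages tabs have content.
        spark.range(10000000L).selectExpr("sum(id)").show()

        // The application UI (Jobs, Stages, Storage, Environment, ...) lives at
        // http://<driver-host>:4040 and only exists while the SparkContext runs.
        Thread.sleep(10 * 60 * 1000) // keep the app alive for 10 minutes
        spark.stop()
      }
    }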
6 votes, 2 answers

How to add the "--deploy-mode cluster" option to my scala code

Hello, I want to add the option "--deploy-mode cluster" to my Scala code: val sparkConf = new SparkConf().setMaster("spark://192.168.60.80:7077") Without using the shell (the spark-submit command), I want to use the "…
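
One commonly suggested way to request cluster deploy mode from code, rather than from the spark-submit shell command, is the org.apache.spark.launcher.SparkLauncher API. This is only a sketch: the jar path and main class are placeholders, and in standalone cluster mode the application jar has to be reachable from the worker nodes.

    import org.apache.spark.launcher.SparkLauncher

    object ClusterModeSubmit {
      def main(args: Array[String]): Unit = {
        // Roughly equivalent to:
        //   spark-submit --master spark://192.168.60.80:7077 --deploy-mode cluster ...
        val handle = new SparkLauncher()
          .setAppResource("/path/to/my-app.jar")   // placeholder application jar
          .setMainClass("com.example.MyApp")       // placeholder main class
          .setMaster("spark://192.168.60.80:7077")
          .setDeployMode("cluster")
          .startApplication()

        // Wait until the submitted application reaches a final state.
        while (!handle.getState.isFinal) Thread.sleep(1000)
        println(s"Application finished in state ${handle.getState}")
      }
    }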
5 votes, 3 answers

PySpark: Not able to create SparkSession.(Java Gateway Error)

I have installed PySpark on Windows and was having no problem till yesterday. I am using Windows 10, PySpark version 2.3.3 (pre-built version), Java version "1.8.0_201". Yesterday when I tried creating a Spark session, I ran into the below…
5 votes, 0 answers

Spark Streaming - Block replication policy issue in case of multiple executor on the same worker

I am running a Spark Streaming application on a cluster composed of three nodes, each node with one worker and three executors (so a total of 9 executors). I am using Spark version 2.3.2 and the Spark standalone cluster manager. The…
5 votes, 1 answer

Why is Apache Livy session showing Application id NULL?

I've implemented a fully functional Spark 2.1.1 standalone cluster, where I POST job batches via the curl command using Apache Livy 0.4. When consulting the Spark web UI I see my job along with its application ID (something like this:…
5 votes, 1 answer

How to set up cluster environment for Spark applications on Windows machines?

I have been developing in PySpark with Spark standalone in non-cluster mode. These days, I would like to explore Spark's cluster mode further. I searched the internet and found I may need a cluster manager to run clusters on different machines…
5 votes, 2 answers

how to run Spark job on specific nodes

For example, my Spark cluster has 100 nodes (workers); when I run one job I just want it to run on some 10 specific nodes. How should I achieve this? By the way, I'm using Spark standalone mode. Why do I need the above requirement: one of my Spark jobs…
5 votes, 2 answers

winutils spark windows installation env_variable

I am trying to install Spark 1.6.1 on Windows 10 and so far I have done the following... Downloaded Spark 1.6.1, unpacked it to some directory and then set SPARK_HOME. Downloaded Scala 2.11.8, unpacked it to some directory and then set SCALA_HOME. Set the…
5 votes, 1 answer

Access spark-shell from different Spark versions

TL;DR: Is it absolutely necessary that the Spark running a spark-shell (driver) have exactly the same version as the Spark master? I am using Spark 1.5.0 to connect to Spark 1.5.0-cdh5.5.0 via spark-shell: spark-shell --master…
4 votes, 3 answers

Starting multiple workers on a master node in Standalone mode

I have a machine with 80 cores. I'd like to start a Spark server in standalone mode on this machine with 8 executors, each with 10 cores. But, when I try to start my second worker on the master, I get an error. $ ./sbin/start-master.sh Starting…
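
One route that is often suggested for this layout (treat it as an assumption, since behaviour differs between Spark versions) is to keep a single 80-core worker and let the application split it into executors by setting spark.executor.cores; the other common route is running several workers per machine via SPARK_WORKER_INSTANCES in conf/spark-env.sh. A sketch of the first approach, with a placeholder master URL and assumed memory settings:

    import org.apache.spark.sql.SparkSession

    object ManyExecutorsOneWorker {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("many-executors")
          .master("spark://master-host:7077")     // placeholder master URL
          // With 10 cores per executor and an 80-core cap, the standalone
          // scheduler can place up to 8 executors on a single 80-core worker.
          .config("spark.executor.cores", "10")
          .config("spark.cores.max", "80")
          .config("spark.executor.memory", "8g")  // assumed memory per executor
          .getOrCreate()

        println(spark.range(100000000L).count())
        spark.stop()
      }
    }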
4 votes, 0 answers

spark-submit: unable to get driver status

I'm running a job on a test Spark standalone cluster in cluster mode, but I'm finding myself unable to monitor the status of the driver. Here is a minimal example using spark-2.4.3 (master and one worker running on the same node, started running…
4 votes, 1 answer

pyspark got Py4JNetworkError("Answer from Java side is empty") when exit python

Background: Spark standalone cluster mode on k8s; Spark 2.2.1; Hadoop 2.7.6; the code is run from plain Python, not the pyspark shell; client mode, not cluster mode. Everything runs fine and completes. But 'sometimes', when…
4 votes, 0 answers

Spark launcher handle not updating state on Standalone cluster mode

I'm trying to programmatically submit Spark jobs using the Spark Launcher library in a Spring web application. Everything works fine with yarn-client, yarn-cluster and standalone-client modes. However, when using standalone-cluster mode, the…
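
For reference, the launcher reports state updates through SparkAppHandle.Listener callbacks; a minimal sketch of wiring one up follows (jar path, main class and master URL are placeholders). Whether those callbacks actually fire in standalone-cluster mode is exactly what this question is about, so this only shows the expected usage, not a fix.

    import org.apache.spark.launcher.{SparkAppHandle, SparkLauncher}

    object LauncherWithListener {
      def main(args: Array[String]): Unit = {
        val listener = new SparkAppHandle.Listener {
          override def stateChanged(h: SparkAppHandle): Unit =
            println(s"state -> ${h.getState}, appId = ${h.getAppId}")
          override def infoChanged(h: SparkAppHandle): Unit = ()
        }

        val handle = new SparkLauncher()
          .setAppResource("/path/to/my-app.jar")  // placeholder application jar
          .setMainClass("com.example.MyApp")      // placeholder main class
          .setMaster("spark://master-host:7077")  // placeholder master URL
          .setDeployMode("cluster")
          .startApplication(listener)

        while (!handle.getState.isFinal) Thread.sleep(1000)
      }
    }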
4 votes, 2 answers

Why does stopping Standalone Spark master fail with "no org.apache.spark.deploy.master.Master to stop"?

Stopping a standalone Spark master fails with the following message: $ ./sbin/stop-master.sh no org.apache.spark.deploy.master.Master to stop Why? There is one Spark standalone master up and running.
4 votes, 2 answers

How many executor processes run for each worker node in spark?

How many executors will be launched for each worker node in Spark? Can I know the math behind it? For example, I have 6 worker nodes and 1 master; if I submit a job through spark-submit, what is the maximum number of executors that will be launched for…
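
As a rough illustration of the arithmetic (all numbers here are assumptions, not taken from the question): in standalone mode an application keeps acquiring cores until it reaches spark.cores.max, and each executor takes spark.executor.cores of them, so the executor count is bounded by the cap divided by the per-executor cores, and by what each worker can physically host.

    import org.apache.spark.sql.SparkSession

    object ExecutorMath {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("executor-math")
          .master("spark://master-host:7077")   // placeholder master URL
          // Assume 6 workers with 16 cores each (96 cores in total).
          // With 4 cores per executor and a 48-core cap, at most
          // 48 / 4 = 12 executors are launched, i.e. about 2 per worker.
          .config("spark.executor.cores", "4")
          .config("spark.cores.max", "48")
          .getOrCreate()

        println(spark.range(1000000L).count())
        spark.stop()
      }
    }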