Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to the standalone deploy mode. Questions might include cluster orchestration in standalone mode, or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It is simpler than the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e. one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.

164 questions
2
votes
0 answers

Remote Spark standalone executor error

I am running a Spark (2.0.1) standalone cluster on a remote server (Microsoft Azure). I am able to connect my Spark app to this cluster; however, the tasks are getting stuck without any execution (with the following warning: WARN…
2
votes
1 answer

Spark REST API /api/v1 gives Method Not Allowed

I have deployed a Spark standalone cluster, but it fails when I try to access the REST API for some application info. The URL I try to access is http://ip:4040/api/v1. Link for the REST API doc ->…
Raghvendra Singh
  • 952
  • 9
  • 17
2
votes
1 answer

Who loads partitions into RAM in Apache Spark?

I have a question that I have not been able to find the answer to anywhere. I am using the following lines to load data within a PySpark application: loadFile = self.tableName+".csv" dfInput=…
User2130
  • 555
  • 1
  • 6
  • 16
2
votes
0 answers

How many cores can I really assign to an Apache Spark standalone cluster?

I have an Apache Spark 1.6.1 standalone cluster set up on a single machine with the following specifications: CPU: Core i7-4790 (# of cores: 4, # of threads: 8); RAM: 16GB. I am using the following settings in conf/spark-env.sh: export…
User2130
  • 555
  • 1
  • 6
  • 16
1
vote
0 answers

How to change the Spark worker URL shown in the master UI?

I want to change the external URL of a Spark worker shown in the Spark master UI. Currently I am using a Docker server for this. Does anyone have an idea what I can do? Is there any parameter I have to pass in the docker-compose file? I also attach a screenshot of my Spark…
1
vote
0 answers

Spark local mode vs standalone cluster in terms of core and thread usage

I'm comparing PySpark local mode and standalone mode, where local is: findspark.init('C:\spark\spark-3.0.3-bin-hadoop2.7') conf=SparkConf() conf.setMaster("local[*]") conf.setAppName('firstapp') sc = SparkContext(conf=conf) spark =…
1
vote
0 answers

What are the default Spark username and password for a standalone installation?

I am trying to connect Spark to Oracle Analytics Cloud (OAC). I have a standalone Spark (3.1.2) installation with Hadoop (2.7) in my Windows VM. The connection requires a username, password, host and port. Can you please provide the default username…
1
vote
1 answer

Cannot run spark-submit in a standalone Spark cluster

I am working with the following docker-compose image to build a spark standalone cluster: --- # ---------------------------------------------------------------------------------------- # -- Docs:…
J.C Guzman
  • 1,192
  • 3
  • 16
  • 40
1
vote
0 answers

Spark Standalone Security

I am trying to understand how I can restrict a user from submitting a Spark application, apart from the shared-secret method, in standalone mode. Can I use Kerberos-based authentication in a Spark standalone cluster? Considering the daemon processes will…
1
vote
1 answer

Difference between SPARK_WORKER_CORES and SPARK_EXECUTOR_CORES?

How to configure the number of cores for SPARK_WORKER_CORES and SPARK_EXECUTOR_CORES when using the standalone cluster manager?
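For reference, a hedged sketch of how these two settings commonly interact (values are illustrative, hostnames are placeholders): SPARK_WORKER_CORES caps the cores a worker offers to the master, while spark.executor.cores (the env-variable form SPARK_EXECUTOR_CORES is older) sets cores per executor, so a worker can host roughly SPARK_WORKER_CORES / spark.executor.cores executors per application:

```shell
# conf/spark-env.sh on each worker machine (illustrative value)
export SPARK_WORKER_CORES=8     # total cores this worker offers to the master

# at submit time: each executor gets 2 cores, so this worker can
# host up to 8 / 2 = 4 executors for the application
spark-submit \
  --master spark://master-host:7077 \
  --conf spark.executor.cores=2 \
  my_app.py
```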
1
vote
0 answers

How is SPARK_WORKER_MEMORY related to JVM heap size?

I am running Spark in standalone mode, inside a container. I can set SPARK_WORKER_MEMORY and I can set the JVM heap size, but how should I think of them in relation to each other? Does the JVM heap need headroom in addition to the Spark worker memory?…
EMC
  • 1,560
  • 2
  • 17
  • 31
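A hedged sketch of the usual relationship (numbers are illustrative): SPARK_WORKER_MEMORY bounds the total memory the worker daemon may hand out to executors on that machine, each executor JVM's heap is set separately by spark.executor.memory, and the worker daemon's own small JVM is sized via SPARK_DAEMON_MEMORY:

```shell
# conf/spark-env.sh (illustrative values)
export SPARK_WORKER_MEMORY=12g   # total memory this worker can allocate to executors
export SPARK_DAEMON_MEMORY=1g    # heap of the worker daemon JVM itself

# each executor JVM heap; three 4g executors fit within the 12g worker budget
spark-submit --conf spark.executor.memory=4g ...
```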
1
vote
1 answer

Kerberos authentication with a Hadoop cluster from a Spark standalone cluster running on a Kubernetes cluster

I have set up Spark Standalone cluster on Kubernetes, and I am trying to connect to a Kerberized Hadoop cluster which is NOT on Kubernetes. I have placed core-site.xml and hdfs-site.xml in my Spark cluster's container and have set HADOOP_CONF_DIR…
1
vote
1 answer

Output Spark application name in driver log

I need to output the Spark application name (spark.app.name) in each line of the driver log (along with other attributes like message and date). So far I have failed to find the correct log4j configuration or any other hints. How could it be done? I…
Valentina
  • 518
  • 7
  • 18
1
vote
0 answers

How to start the master node of a Spark cluster from R on Windows?

Chapter 6, "Clusters", of the book "Mastering Spark with R" shows how to start the master node of a Spark standalone cluster from R: # Retrieve the Spark installation directory spark_home <- spark_home_dir() # Build paths and classes spark_path <-…
user1767316
  • 3,276
  • 3
  • 37
  • 46
1
vote
0 answers

Using `spark-submit` to start a job in a single-node standalone Spark cluster

I have a single-node Spark cluster (4 CPU cores and 15GB of memory) configured with a single worker. I can access the web UI and see the worker node. However, I am having trouble submitting jobs using spark-submit. I have a couple of questions. I…