Questions tagged [apache-spark-standalone]

Use for questions related to the Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to the standalone deploy mode. Questions might include cluster orchestration in standalone mode, or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. Standalone mode provides a simpler alternative to more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e. one that does not run other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.


164 questions
0 votes, 1 answer

How to set spark standalone master url in play-framework conf/application.conf file?

Plugging the Play application into a Spark standalone cluster: it executes well in dev mode, but when trying to deploy in production mode it gives the following error: Caused by: org.apache.spark.SparkException: A master URL must be set in your…
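A common pattern for this is to keep the master URL in Play's conf/application.conf and hand it to SparkConf at startup. A minimal sketch; the key name and host below are placeholders, not taken from the question:

```
# conf/application.conf (HOCON) — key name and host are placeholders
spark.master.url = "spark://master-host:7077"
```

In the driver, read the value through Play's Configuration (e.g. configuration.getString("spark.master.url")) and pass it to SparkConf.setMaster, so the production deploy no longer fails with "A master URL must be set".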
0 votes, 1 answer

Spark Worker not joining Master after Master dies and comes back

I was wondering how often the Worker pings the Master to check the Master's liveness. Or is it the Master (resource manager) that pings the Workers to check their liveness and, if any Workers are dead, spawns replacements? Or is it both? Some info: Standalone…
K P • 861 • 1 • 8 • 25
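In standalone mode it is the workers that send heartbeats to the master, and the master declares a worker lost after a configurable timeout without a heartbeat. A config fragment showing the relevant property (the value shown is Spark's default):

```
# spark-defaults.conf on the master — 60 seconds is the default
spark.worker.timeout  60
```

After a master restart, surviving workers attempt to re-register with the new master process; if they fail to do so within the timeout they are dropped from the cluster view.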
0 votes, 2 answers

Spark allocating all cores to a task

I have a task that would benefit from more cores but the standalone scheduler launches it when only a subset are available. I’d rather use all cluster cores on this task. Is there a way to tell the scheduler to finish everything before allocating…
pferrel • 5,673 • 5 • 30 • 41
0 votes, 0 answers

Spark dynamic resource allocation in standalone mode

I have a question/problem regarding dynamic resource allocation. I am using Spark 1.6.2 with the standalone cluster manager. I have one worker with 2 cores. I set the following arguments in the spark-defaults.conf file on all my…
Ofer Eliassaf • 2,870 • 1 • 17 • 22
0 votes, 0 answers

Spark 2.0 standalone mode collect() error

I'm working with Spark 2.0 (Scala) and the Play framework. I'm running this standalone-mode application with IntelliJ IDEA. My application works just fine with a local master: object Utilities { val master = "local" //-------machine learning algorithm …
pzq317 • 1 • 1 • 1
0 votes, 0 answers

Error when submitting Spark application

I am trying to submit a very simple application. It creates two RDDs from one large input file (about 500 GB), subtracts the header (first line), zips them with indexes, maps them to key-value pairs with a small modification, then saves them as a…
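The pipeline described above can be sketched on plain Scala collections, since the RDD API mirrors these calls (tail, zipWithIndex, map). The "small modification" here is an uppercasing stand-in, an assumption for illustration only:

```scala
// Sketch of the described pipeline on a plain List; on an RDD the calls
// would be first()/filter (to drop the header), zipWithIndex, map, saveAsTextFile.
object SubmitPipelineSketch {
  def process(lines: List[String]): List[(Long, String)] = {
    val data = lines.tail                      // subtract the header (first line)
    data.zipWithIndex.map { case (line, i) =>  // zip with indexes
      (i.toLong, line.toUpperCase)             // small modification, keyed by index
    }
  }
}
```

For a ~500 GB input, the usual submit-time culprits are driver/executor memory limits rather than the transformation logic itself.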
0 votes, 1 answer

Spark standalone on a cluster

I installed the pre-built version of Spark on each node of my cluster (just downloaded and unzipped it). Question 1: Do I have to copy the files slaves.template and spark-env.sh.template in the conf directory and then edit them to connect my machines to each…
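Yes — the .template files are only samples; the launch scripts read conf/slaves and conf/spark-env.sh, so copy the templates to those names and edit them. A sketch (hostnames are placeholders):

```
# conf/slaves — one worker hostname per line
worker1
worker2

# conf/spark-env.sh
SPARK_MASTER_HOST=master1   # SPARK_MASTER_IP on older releases
```

With these in place on the master, sbin/start-all.sh starts the master and SSHes into each listed host to start the workers.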
0 votes, 0 answers

How to direct spark logs when running in client mode?

I am using pyspark to run an application on a cluster in client mode, with the standalone cluster manager. All I want to do is see the logs. I've tried two things: 1) I went to the config file (spark-defaults.conf) in SPARK_HOME: spark.eventLog.dir …
makansij • 9,303 • 37 • 105 • 183
0 votes, 1 answer

Early Initialization of objects on Worker nodes in Spark Cluster

I am using Drools with Spark in a standalone cluster. I want to load the knowledge session on all the worker nodes at startup, i.e. before the map-reduce tasks. I've tried passing the stateful session from the driver to the slave nodes, but it's not…
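Non-serializable state like a Drools session usually cannot be shipped from the driver; the common pattern is a lazily initialized singleton that each executor JVM builds once, on the first task that touches it. A minimal sketch, with the Drools session replaced by a counter-backed string stand-in (an assumption for illustration):

```scala
// Per-JVM lazy initialization: a `lazy val` in an `object` is built once
// per executor JVM when first accessed — it is never serialized from the driver.
object RuleSession {
  var initCount = 0                 // visible only so the sketch can show once-only init
  lazy val session: String = {
    initCount += 1                  // the expensive one-time setup would go here
    "knowledge-session"             // stand-in for a real Drools session
  }
}
```

Tasks would then reference RuleSession.session inside mapPartitions, so the session is constructed once per executor rather than once per record. This initializes on first use rather than strictly "at startup", but the per-node cost is paid only once.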
0 votes, 0 answers

Why "memory in use" = 1g in Spark Standalone?

I'm running Apache Spark in standalone mode, and when I connect to myip:8080 I always see something like "Memory in use: 120.0 GB Total, 1.0 GB Used". Why is only 1 GB used when much more memory is available? Is it possible (or desirable) to increase the…
Ric • 3 • 2
0 votes, 1 answer

Spark resource scheduling - Standalone cluster manager

I have a pretty low-configuration testing machine for my data pipelines developed in Spark. I will use only one AWS t2.large instance, which has only 2 CPUs and 8 GB of RAM. I need to run 2 Spark streaming jobs, as well as leave some memory and CPU…
Srdjan Nikitovic • 853 • 2 • 9 • 19
-1 votes, 1 answer

How to set standalone Spark configurations to locally run MLlib spark examples?

I want to run the Spark MLlib examples locally on my PC (I think this is called standalone). I want to run JavaWord2VecExample.java. This file's configuration is set for sessions that run Spark on some workers with one master, but I want to run the class…
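Strictly speaking, running on one PC without a cluster manager is local mode, not standalone mode, and it only requires overriding the master at submit time rather than editing the example. A command sketch (the jar path is a placeholder):

```
spark-submit --master "local[*]" \
  --class org.apache.spark.examples.ml.JavaWord2VecExample \
  path/to/spark-examples.jar
```

The bundled bin/run-example script does the same wrapping for the shipped examples.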
-1 votes, 1 answer

which version of gsoap is the most stable under ubuntu 16.04?

I have been using the gSOAP API and get different responses depending on the OS + gSOAP combination. For gsoap_2.8.26, I run a stand-alone gSOAP server I developed, and when I request http://22.22.222.222:8075/?conmony.wsdl, I get: > …
Casey Harrils • 2,793 • 12 • 52 • 93
-2 votes, 1 answer

Different outputs per number of partition in spark

I run Spark code on my local machine and on a cluster. I create the SparkContext object for the local machine with the following code: val sc = new SparkContext("local[*]", "Trial") I create the SparkContext object for the cluster with the following code: val spark =…
ugur • 400 • 6 • 20
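A frequent cause of output differing between local[*] and a cluster is that the number of partitions differs, and some function being aggregated is not associative and commutative. The effect can be shown with plain Scala collections standing in for partitions (this is an illustration of the pitfall, not a diagnosis of the question's code):

```scala
// Reduce each chunk, then reduce the per-chunk results — the shape of what
// Spark does within each partition and then across partitions.
object PartitionOrderSketch {
  def reduceInChunks(xs: List[Int], chunkSize: Int)(f: (Int, Int) => Int): Int =
    xs.grouped(chunkSize).map(_.reduce(f)).reduce(f)
}
```

With subtraction (not associative), one chunk of four gives ((1-2)-3)-4 = -8, while two chunks of two give (1-2) - (3-4) = 0 — the "partition count" changed the answer. With an associative, commutative function like +, the result is partition-independent, which is why Spark requires that property for reduce.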