Questions tagged [apache-spark-standalone]

Use for questions related to Apache Spark standalone deploy mode (not local mode).

This tag should be used for questions specific to the standalone deploy mode, such as cluster orchestration in standalone mode, or standalone-specific features and configuration options.

Spark standalone mode is an alternative to running Spark on Mesos or YARN. It provides a simpler option than the more sophisticated resource managers, which can be useful on a dedicated Spark cluster (i.e. one not running other jobs).

"Standalone" speaks to the nature of running "alone" without an external resource manager.

164 questions
4 votes · 0 answers

Apache Spark: History server (logging) + non super-user access (HDFS)

I have a working HDFS and a running Spark framework on a remote server. I am running SparkR applications and hope to see the logs of completed applications in the UI as well. I followed all the instructions here: Windows: Apache Spark History Server Config and…
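For context, completed applications only appear in the history server if event logging is enabled. A minimal sketch of the relevant settings (the HDFS path is a placeholder; the directory must pre-exist and be writable by the non-super user running the application):

    from pyspark.sql import SparkSession

    # Event logs let the history server replay the UI of finished apps.
    # "hdfs://namenode:8020/spark-logs" is a placeholder path.
    spark = (
        SparkSession.builder
        .appName("history-logging-example")
        .config("spark.eventLog.enabled", "true")
        .config("spark.eventLog.dir", "hdfs://namenode:8020/spark-logs")
        .getOrCreate()
    )
    spark.stop()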
3 votes · 0 answers

How to get the result from Spark after submitting a job via the REST API?

When I submit a Spark job through the API /v1/submissions/create on port 6066 and check its status via /v1/submissions/status/{driver-id}, I only get something like this: { "action" : "SubmissionStatusResponse", "driverState" : "FINISHED", …
MegaOwIer · 98 · 1 · 7
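As the excerpt above shows, the status endpoint only reports the driver's state, not the job's output, so results have to be written to external storage by the job itself. A minimal polling sketch (host and driver id are placeholders):

    import json
    import time
    import urllib.request

    MASTER = "http://master-host:6066"        # placeholder host
    DRIVER_ID = "driver-20240101000000-0000"  # placeholder driver id

    def driver_state(master: str, driver_id: str) -> str:
        # GET /v1/submissions/status/<driver-id> returns JSON with a
        # "driverState" field such as RUNNING, FINISHED or FAILED.
        url = f"{master}/v1/submissions/status/{driver_id}"
        with urllib.request.urlopen(url) as resp:
            return json.load(resp)["driverState"]

    while driver_state(MASTER, DRIVER_ID) == "RUNNING":
        time.sleep(5)
    print(driver_state(MASTER, DRIVER_ID))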
3 votes · 0 answers

Standalone Spark - How to find the driver's final status for an application

I am setting up Spark 2.2.0 in standalone mode (https://spark.apache.org/docs/latest/spark-standalone.html) and submitting Spark jobs programmatically using SparkLauncher sparkAppLauncher = new…
3 votes · 1 answer

Apache Spark method not found sun.nio.ch.DirectBuffer.cleaner()Lsun/misc/Cleaner;

I encounter this problem while running an automated data processing script in spark-shell. The first couple of iterations work fine, but sooner or later it always runs into this error. I googled the issue but haven't found an exact match. Other…
3 votes · 0 answers

Connecting to remote Spark Cluster

I'm having a problem connecting to a Spark cluster remotely from a Jupyter notebook. It works fine locally. Method 1: conf = pyspark.SparkConf().setAppName('Pi').setMaster('spark://my-cluster:7077') sc = pyspark.SparkContext(conf=conf) This returns…
beginner_ · 7,230 · 18 · 70 · 127
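One common cause of this symptom (an assumption here, not necessarily the asker's problem) is that executors cannot connect back to the driver running in the notebook. A sketch of the relevant settings, with all host names as placeholders:

    import pyspark

    conf = (
        pyspark.SparkConf()
        .setAppName("Pi")
        .setMaster("spark://my-cluster:7077")
        # Address of the notebook machine as seen *from the cluster*:
        .set("spark.driver.host", "notebook-host.example.com")
        # Listen on all interfaces inside the notebook machine/container:
        .set("spark.driver.bindAddress", "0.0.0.0")
    )
    sc = pyspark.SparkContext(conf=conf)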
3 votes · 0 answers

Buffer/cache exhaustion with Spark standalone inside a Docker container

I have a very weird memory issue (which is what a lot of people will most likely say ;-)) with Spark running in standalone mode inside a Docker container. Our setup is as follows: we have a Docker container in which we have a Spring Boot…
3 votes · 3 answers

Why is Spark utilizing only one core per executor? What, other than the number of partitions, decides how many cores are utilized?

I am running Spark in an HPC environment on Slurm, using Spark standalone mode, Spark version 1.6.1. The problem is that my Slurm node is not fully used in standalone mode. I am using spark-submit in my Slurm script. There are 16 cores available on…
Laeeq · 357 · 1 · 4 · 15
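In standalone mode the settings below govern how many cores an application and its executors take. A sketch with example values for a 16-core worker (the numbers are illustrative, not a recommendation):

    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster("spark://master-host:7077")  # placeholder host
        .setAppName("core-usage-example")
        .set("spark.executor.cores", "4")       # cores per executor
        .set("spark.cores.max", "16")           # cores for the whole app
    )
    sc = SparkContext(conf=conf)

    # Cores are only kept busy if there are at least as many partitions:
    rdd = sc.parallelize(range(1_000_000), numSlices=16)
    print(rdd.map(lambda x: x * x).count())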
3 votes · 4 answers

Spark master won't show a running application in the UI when I use spark-submit for a Python script

The image shows the 8081 UI. The master shows a running application when I start a Scala shell or a pyspark shell. But when I use spark-submit to run a Python script, the master doesn't show any running application. This is the command I used: spark-submit…
kavya · 759 · 4 · 14 · 31
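A frequent explanation (again an assumption, not a diagnosis of this question) is that spark-submit without --master defaults to local mode, so the application never registers with the standalone master and its UI stays empty. Setting the master explicitly, via --master or in the script, avoids that:

    from pyspark import SparkConf, SparkContext

    # With no master configured anywhere, spark-submit runs the script in
    # local mode; pointing at the standalone master (placeholder host)
    # makes the app appear in the master's web UI.
    conf = SparkConf().setAppName("visible-app").setMaster("spark://master-host:7077")
    sc = SparkContext(conf=conf)
    print(sc.parallelize(range(100)).sum())
    sc.stop()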
3 votes · 1 answer

Forcing the driver to run on a specific slave in a Spark standalone cluster running with "--deploy-mode cluster"

I am running a small Spark cluster with two EC2 instances (m4.xlarge). So far I have been running the Spark master on one node and a single Spark slave (4 cores, 16 GB memory) on the other, then deploying my Spark (streaming) app in client…
Adam Dossa · 228 · 1 · 8
3 votes · 1 answer

Is FAIR available for Spark Standalone cluster mode?

I have a 2-node cluster with the Spark standalone cluster manager. I'm triggering more than one job using the same sc with Scala multithreading. What I found is that my jobs are scheduled one after another because of the FIFO nature, so I tried to use FAIR…
Balaji Reddy · 5,576 · 3 · 36 · 47
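FAIR scheduling of jobs inside one application is a property of the application's own scheduler, so it works under the standalone cluster manager as well. A minimal multi-threaded sketch (master URL and pool names are placeholders):

    import threading
    from pyspark import SparkConf, SparkContext

    conf = (
        SparkConf()
        .setMaster("spark://master-host:7077")
        .setAppName("fair-example")
        .set("spark.scheduler.mode", "FAIR")
    )
    sc = SparkContext(conf=conf)

    def run_job(pool: str):
        # The pool is a thread-local property, so each thread tags its jobs.
        sc.setLocalProperty("spark.scheduler.pool", pool)
        print(pool, sc.parallelize(range(10_000)).sum())

    threads = [threading.Thread(target=run_job, args=(f"pool{i}",)) for i in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()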
3 votes · 1 answer

java.lang.IllegalStateException: Cannot find any build directories

I want to run the Spark master and worker in IntelliJ. I have started the Spark master and worker successfully, and the worker connects to the master without any problem. I can confirm this by looking at the logs and the Spark web UI. But the problem starts…
3 votes · 3 answers

Continuous "INFO JobScheduler:59 - Added jobs for time *** ms" in my Spark standalone cluster

We are working with a Spark standalone cluster with 8 cores and 32 GB RAM per node, a 3-node cluster with the same configuration. Sometimes a streaming batch completes in less than 1 second; sometimes it takes more than 10 seconds, and at those times the log below appears…
3 votes · 1 answer

Role of the Executors on the Spark master machine

In a Spark standalone cluster, does the master node run tasks as well? I wasn't sure whether executor processes are spun up on the master node and do work alongside the worker nodes. Thanks!
Ranjit Iyer · 857 · 1 · 11 · 20
2 votes · 2 answers

Is writing to the database done by the driver or the executors in a Spark cluster?

I have a Spark cluster set up with 1 master node and 2 worker nodes. I am running a pyspark application on this Spark standalone cluster, where I have a job that writes the transformed data into a MySQL database. So, I have a question here about whether writing…
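In short, the driver only plans the write; each partition is written to the database by an executor task. A sketch of a JDBC write (connection details are placeholders, and the MySQL JDBC driver jar must be available to the executors, e.g. via --jars):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-write-example").getOrCreate()

    df = spark.range(1000).withColumnRenamed("id", "value")

    # Each of df's partitions is written by a task running on an executor;
    # the rows are never funnelled through the driver.
    (
        df.write
        .format("jdbc")
        .option("url", "jdbc:mysql://db-host:3306/mydb")  # placeholder
        .option("dbtable", "results")
        .option("user", "user")
        .option("password", "secret")
        .mode("append")
        .save()
    )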
2 votes · 2 answers

Hadoop 3 gcs-connector doesn't work properly with the latest version of Spark 3 in standalone mode

I wrote a simple Scala application which reads a parquet file from a GCS bucket. The application uses: JDK 17, Scala 2.12.17, Spark SQL 3.3.1, and the gcs-connector for hadoop3-2.2.7. The connector is taken from Maven, imported via sbt (the Scala build tool). I'm…
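For reference, a minimal sketch of reading parquet from GCS on a standalone cluster, assuming the connector version named in the question; bucket and keyfile paths are placeholders, and the connector jar must be visible to both the driver and the executors:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("gcs-read-example")
        .config("spark.jars.packages",
                "com.google.cloud.bigdataoss:gcs-connector:hadoop3-2.2.7")
        .config("spark.hadoop.fs.gs.impl",
                "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem")
        .config("spark.hadoop.google.cloud.auth.service.account.json.keyfile",
                "/path/to/key.json")  # placeholder credentials file
        .getOrCreate()
    )

    df = spark.read.parquet("gs://my-bucket/path/to/data.parquet")  # placeholder
    df.show()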