Questions tagged [spark-submit]

spark-submit is the script shipped with apache-spark for launching Spark applications written in, for example, Java, Scala, Python or R.

More information about spark-submit can be found in the "Submitting Applications" guide of the Spark documentation: https://spark.apache.org/docs/latest/submitting-applications.html
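
For orientation, here is a minimal PySpark application together with an illustrative spark-submit invocation; the file name, master URL and data are made up for the example, not taken from any question below.

    # wordcount_example.py -- launch with, for example:
    #   spark-submit --master local[2] wordcount_example.py
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("wordcount-example").getOrCreate()

    # A tiny in-memory dataset so the script runs without external files.
    lines = spark.sparkContext.parallelize(["hello spark", "hello submit"])
    counts = (lines.flatMap(lambda line: line.split())
                   .map(lambda word: (word, 1))
                   .reduceByKey(lambda a, b: a + b))

    print(counts.collect())
    spark.stop()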

611 questions
1 vote, 0 answers

Spark-submit issue - looking for non-existent path

I am trying to run spark-submit: /usr/local/Cellar/apache-spark/2.3.0/libexec/bin/spark-submit sdp-consumer.py It gives an error: /usr/local/Cellar/apache-spark/2.3.0/libexec/bin/spark-submit: line 27:…
Joe • 11,983 • 31 • 109 • 183
1 vote, 0 answers

Cannot pickle PySpark DataFrame

I want to create a decision tree model using spark-submit. from pyspark.mllib.regression import LabeledPoint from pyspark.mllib.tree import DecisionTree from pyspark import SparkConf, SparkContext from numpy import array from pyspark.sql import…
betty bth • 33 • 7
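
A minimal sketch of the kind of pyspark.mllib decision-tree training the question above starts from, using a tiny hand-made dataset; it persists the fitted model with model.save rather than pickling any Spark object, since SparkContext-backed objects such as DataFrames generally cannot be pickled. The data and output path are hypothetical.

    # decision_tree_sketch.py -- illustrative only; data and paths are made up.
    from pyspark.sql import SparkSession
    from pyspark.mllib.regression import LabeledPoint
    from pyspark.mllib.tree import DecisionTree

    spark = SparkSession.builder.appName("decision-tree-sketch").getOrCreate()
    sc = spark.sparkContext

    # Toy labelled data: LabeledPoint(label, [features]).
    data = sc.parallelize([
        LabeledPoint(0.0, [0.0, 1.0]),
        LabeledPoint(1.0, [1.0, 0.0]),
        LabeledPoint(1.0, [1.0, 1.0]),
        LabeledPoint(0.0, [0.0, 0.0]),
    ])

    model = DecisionTree.trainClassifier(
        data, numClasses=2, categoricalFeaturesInfo={}, maxDepth=3)

    print(model.toDebugString())
    # Persist the model itself instead of pickling a DataFrame or RDD.
    model.save(sc, "file:///tmp/decision_tree_model")  # path is hypothetical
    spark.stop()
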
1 vote, 1 answer

How to set spark.driver.extraClassPath through Apache Livy on Azure Spark cluster?

I would like to add some configuration when a Spark job is submitted via Apache Livy to an Azure cluster. Currently, to launch a Spark job via Apache Livy on the cluster, I use the following command: curl -X POST --data '{"file":…
moun • 69 • 1 • 6
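
A sketch of passing extra Spark configuration in the JSON body of a Livy batch submission, using Python's requests instead of curl; the endpoint, file path, class name and class-path value are placeholders, and whether the cluster honours a driver setting supplied this way is the open question above.

    # Illustrative Livy batch submission; URL and paths are placeholders.
    import json
    import requests

    livy_url = "http://<livy-host>:8998/batches"            # hypothetical endpoint
    payload = {
        "file": "wasb:///example/jars/my-spark-job.jar",     # placeholder
        "className": "com.example.MyJob",                    # placeholder
        "conf": {
            # Extra Spark configuration travels in the "conf" map of the request.
            "spark.driver.extraClassPath": "/path/to/extra/libs/*",
        },
    }

    response = requests.post(
        livy_url,
        data=json.dumps(payload),
        headers={"Content-Type": "application/json"},
    )
    print(response.status_code, response.text)
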
1 vote, 0 answers

PySpark fails with exit code 52

I have an Amazon EMR cluster running, to which I submit jobs using the spark-submit shell command. The way I call it: spark-submit --master yarn --driver-memory 10g convert.py The convert.py script runs under PySpark with Python 3.4. After…
1 vote, 1 answer

Submit Python Script into Spark Cluster

I'm trying to submit the following Python script to a Spark cluster. I have 2 slaves running. from sklearn import grid_search, datasets from sklearn.ensemble import RandomForestClassifier # Use spark_sklearn's grid search instead: from…
syv • 3,528 • 7 • 35 • 50
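
A sketch of the spark-sklearn pattern the excerpt hints at, assuming the spark_sklearn package is installed on the driver and executors: its GridSearchCV takes the SparkContext as its first argument and otherwise mirrors scikit-learn's API. Dataset and parameter grid are illustrative.

    # grid_search_sketch.py -- assumes spark-sklearn is available on the cluster.
    from pyspark.sql import SparkSession
    from sklearn import datasets
    from sklearn.ensemble import RandomForestClassifier
    from spark_sklearn import GridSearchCV  # distributed drop-in for sklearn's GridSearchCV

    spark = SparkSession.builder.appName("spark-sklearn-grid-search").getOrCreate()
    sc = spark.sparkContext

    digits = datasets.load_digits()
    param_grid = {"n_estimators": [10, 50], "max_depth": [4, 8]}

    # The SparkContext goes first; the individual fits are farmed out to executors.
    search = GridSearchCV(sc, RandomForestClassifier(), param_grid)
    search.fit(digits.data, digits.target)
    print(search.best_params_)

    spark.stop()
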
1 vote, 0 answers

Create fat runnable jar for spark-submit

I'm writing a Spark application using Java (not Scala). Something like: SparkConf conf = new SparkConf().setAppName("TEST"); JavaSparkContext sc = new JavaSparkContext(conf); sc.setLogLevel("WARN"); I started with a simple Java project and when to…
DXC • 21 • 1
1 vote, 1 answer

PySpark failing in Jupyter after setting PYSPARK_SUBMIT_ARGS

I'm trying to load a Spark (2.2.1) package in a Jupyter notebook that can otherwise run Spark fine. Once I add %env PYSPARK_SUBMIT_ARGS='--packages com.databricks:spark-redshift_2.10:2.0.1 pyspark-shell' I get this error when trying to create a…
lfk • 2,423 • 6 • 29 • 46
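
If the quotes on the %env line are being kept as part of the value (a possible cause, not confirmed here), one workaround is to set the variable from plain Python before any Spark session is created; the package coordinates below are copied from the excerpt, the rest is a generic sketch.

    # In a notebook cell, before pyspark is started:
    import os

    os.environ["PYSPARK_SUBMIT_ARGS"] = (
        "--packages com.databricks:spark-redshift_2.10:2.0.1 pyspark-shell"
    )

    # Only now create the session, so the launcher picks up the arguments.
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("jupyter-redshift-sketch").getOrCreate()
    print(spark.version)
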
1 vote, 1 answer

Spark standalone connection driver to worker

I'm trying to host a Spark standalone cluster locally. I have two heterogeneous machines connected on a LAN. Each piece of the architecture listed below is running on Docker. I have the following configuration: master on machine 1 (port 7077…
Matthias Beaupère • 1,731 • 2 • 17 • 44
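
For driver-to-worker connectivity problems like this one, the driver's network settings are the usual knobs. A sketch, with placeholder addresses, of pinning the driver's advertised host and bind address when connecting to a standalone master from inside a container:

    # driver_network_sketch.py -- addresses and ports are placeholders.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("standalone-connectivity-sketch")
             .master("spark://192.168.1.10:7077")           # standalone master (placeholder)
             .config("spark.driver.host", "192.168.1.20")    # address the workers can reach
             .config("spark.driver.bindAddress", "0.0.0.0")  # useful when the driver runs in Docker
             .getOrCreate())

    # A trivial job to confirm executors can call back to the driver.
    print(spark.sparkContext.parallelize(range(100)).sum())
    spark.stop()
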
1 vote, 1 answer

Spark Application Not reading log4j.properties present in Jar

I am using MapR 5.2 with Spark 2.1.0, and I am running my Spark app JAR in YARN cluster mode. I have tried all the available options that I found but have been unable to succeed. This is our production environment, but I need this for my particular Spark…
AJm • 993 • 2 • 20 • 39
1 vote, 0 answers

'Cannot allocate memory' when submitting a Spark job

I got an error when trying to submit a Spark job to YARN, but I can't understand which JVM threw this error. How can I avoid it? Java HotSpot(TM) 64-Bit Server VM warning: INFO: os::commit_memory(0x00000006bff80000, 3579314176, 0) failed;…
lulijun • 415 • 3 • 22
1 vote, 1 answer

spark-submit to a docker container

I created a Spark cluster using this repository and the related documentation. Now I'm trying to execute a job through spark-submit inside the Docker container of the Spark master, so the command that I use is something…
Vzzarr • 4,600 • 2 • 43 • 80
1 vote, 1 answer

Spark-submitted application not shown in YARN web UI

I have a node where I have installed Spark in YARN mode. When I run an application with sudo ./usr/bin/spark-submit --master yarn --deploy-mode client MySparkCode.py it runs fine. When I connect to the Spark history server at http://localhost:18089/ I…
Michail N • 3,647 • 2 • 32 • 51
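
One quick diagnostic is to have the application report, from inside the job, which master it actually ran against and what application ID it was given; if the master turns out not to be yarn, the submission never reached the ResourceManager, which would explain its absence from the YARN UI. A generic sketch:

    # yarn_check_sketch.py -- submit exactly as in the question and inspect the output.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("yarn-ui-check").getOrCreate()
    sc = spark.sparkContext

    # On YARN this prints "yarn" and an id of the form application_<timestamp>_<n>.
    print("master:        ", sc.master)
    print("applicationId: ", sc.applicationId)
    spark.stop()
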
1 vote, 0 answers

Spark fails on task with SparkException: Can only zip RDDs with same... with no direct zip() call

I'm using Spark with a spark-submit call and a Python script for cluster analysis that I send with it. From the script: spark = SparkSession.builder.appName(results.taskName).getOrCreate() dataset =…
Ran P • 332 • 2 • 4 • 11
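
For background on the exception itself: zip() requires both RDDs to have the same number of partitions and the same number of elements in each partition, and it is also called by some library code internally, which is why it can surface without an explicit zip() in user code. The sketch below reproduces the constraint and shows the usual index-and-join workaround; it is generic, not a reconstruction of the failing script.

    # zip_constraint_sketch.py -- illustrates why the exception appears.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("zip-constraint-sketch").getOrCreate()
    sc = spark.sparkContext

    a = sc.parallelize(range(10), 2)
    b = sc.parallelize(range(10, 20), 2)
    c = sc.parallelize(range(5), 2)           # different element count

    print(a.zip(b).take(3))                    # fine: same partitioning and counts

    # a.zip(c).collect() would raise:
    #   "Can only zip RDDs with same number of elements in each partition"

    # Workaround that tolerates mismatched sizes: index both sides and join.
    joined = (a.zipWithIndex().map(lambda kv: (kv[1], kv[0]))
               .join(c.zipWithIndex().map(lambda kv: (kv[1], kv[0]))))
    print(joined.take(3))
    spark.stop()
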
1 vote, 0 answers

Copy files (config) from HDFS to local working directory of every spark executor

I am looking for a way to copy a folder of resource-dependency files from HDFS to the local working directory of each Spark executor using Java. At first I was thinking of using the --files FILES option of spark-submit, but it seems it does not support…
YuGagarin • 341 • 7 • 20
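
In PySpark, the closest equivalent to what the question describes (the question itself uses Java) is SparkContext.addFile with recursive=True, after which SparkFiles.get resolves the executor-local copy. The HDFS path and file names below are placeholders.

    # distribute_config_sketch.py -- HDFS path and file names are placeholders.
    from pyspark import SparkFiles
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("distribute-config-sketch").getOrCreate()
    sc = spark.sparkContext

    # Ship a whole directory from HDFS to every executor's working area.
    sc.addFile("hdfs:///config/my_job_resources", recursive=True)

    def read_local_copy(_):
        # Resolve the executor-local path of the distributed directory.
        local_dir = SparkFiles.get("my_job_resources")
        with open(local_dir + "/settings.conf") as fh:   # placeholder file name
            return fh.read()[:80]

    print(sc.parallelize([0], 1).map(read_local_copy).collect())
    spark.stop()
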
1 vote, 0 answers

Trying to run a spark-submit job on a YARN cluster but I keep getting the following warning. How do I fix the issue?

WARN YarnClusterScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources. I have looked through similar questions and tried everything else that was…
Sonia S • 15 • 5
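
That warning generally means the application is asking for more executor memory or cores than any node can currently offer, or that no healthy workers are registered at all. One common mitigation, sketched below with arbitrary example values, is to shrink the per-executor request until it fits the cluster:

    # small_footprint_sketch.py -- example values only; size them to the cluster.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("resource-fit-sketch")
             .config("spark.executor.memory", "1g")
             .config("spark.executor.cores", "1")
             .config("spark.executor.instances", "2")
             .getOrCreate())

    # A trivial action: if resources were granted, this returns promptly.
    print(spark.sparkContext.parallelize(range(1000)).count())
    spark.stop()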