Questions tagged [spark-submit]

spark-submit is a script used to submit and run apache-spark applications written in, for example, Java, Scala, or Python.

More information about spark-submit can be found in the official Spark documentation.
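For orientation, a typical invocation names the master, the deploy mode, any `--conf` settings, and finally the application with its arguments. A minimal sketch that assembles such a command programmatically (the application name, config values, and HDFS path below are hypothetical, not taken from any question here):

```python
import shlex

def build_spark_submit_cmd(app, master="local[*]", deploy_mode=None,
                           conf=None, app_args=()):
    """Assemble a spark-submit command as an argument list."""
    cmd = ["spark-submit", "--master", master]
    if deploy_mode:
        cmd += ["--deploy-mode", deploy_mode]
    for key, value in (conf or {}).items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)          # the application jar or .py file comes last...
    cmd += list(app_args)    # ...followed by its own arguments
    return cmd

cmd = build_spark_submit_cmd(
    "wordcount.py",
    master="yarn",
    deploy_mode="cluster",
    conf={"spark.executor.memory": "2g"},
    app_args=["hdfs:///input/data.txt"],
)
print(shlex.join(cmd))
```

Building the command as a list avoids shell-quoting bugs when handing it to `subprocess`; `shlex.join` only renders it for display.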

611 questions
-1
votes
1 answer

PySpark 2.4: issue when passing a properties file with spark-submit

I have a pyspark program which connects to a MySQL db successfully and reads a table. Now I am trying to pass the database credentials from a properties file instead of embedding them in the code, but I am not able to make it work. from pyspark.sql…
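One common approach (a sketch under assumptions, not the asker's actual setup: the `db.*` keys and their values are hypothetical) is to parse a Java-style `.properties` file with `configparser` and feed the values into the JDBC reader options; in cluster mode the file would additionally need to be shipped with `--files` so the driver can read it:

```python
import configparser

PROPS = """\
db.url=jdbc:mysql://dbhost:3306/sales
db.user=report_user
db.password=s3cret
"""

def load_properties(text):
    """Parse a Java-style .properties file (which has no section headers)."""
    parser = configparser.ConfigParser()
    parser.read_string("[dummy]\n" + text)  # configparser requires a section
    return dict(parser.items("dummy"))

props = load_properties(PROPS)

# These options would be passed to the JDBC reader in the real job,
# e.g. spark.read.format("jdbc").options(**jdbc_options).load()
jdbc_options = {
    "url": props["db.url"],
    "user": props["db.user"],
    "password": props["db.password"],
}
```

The point is only that the credentials live outside the code; the job itself stays unchanged apart from where the option values come from.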
-1
votes
1 answer

java.lang.NoClassDefFoundError with spark-submit in YARN cluster mode, cluster set up using Ambari

I'm using the spark-submit command as below: spark-submit --class com.example.hdfs.spark.RawDataAdapter --master yarn --deploy-mode cluster --jars /home/hadoop/emr/deployment/server/emr-core-1.0-SNAPSHOT.jar home/hadoop/emr-spark-1.0-SNAPSHOT.jar…
Satya
-1
votes
2 answers

spark-submit with python entry points

I have a script, wordcount.py. I used setuptools to create an entry point named wordcount, so now I can call the command from anywhere on the system. I am trying to execute it via spark-submit (command: spark-submit wordcount) but it is failing with…
bill
-1
votes
1 answer

Spark asynchronous job fails with error

I'm writing Spark code in Java. When I use foreachAsync, Spark fails with java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext. In this code: JavaSparkContext sparkContext = new…
fena coder
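foreachAsync returns a future-like JavaFutureAction, and this exception typically means the SparkContext was stopped (for example, the driver's main method returned, or sc.stop() ran in a finally block) before the asynchronous action finished. The general fix is to block on the future before tearing the context down. A Python analogy with concurrent.futures, standing in for the Spark objects (this is not Spark code itself):

```python
from concurrent.futures import ThreadPoolExecutor

results = []

def async_action(item):
    # Stand-in for the per-element work foreachAsync would do.
    results.append(item * 2)

executor = ThreadPoolExecutor(max_workers=2)  # stand-in for the SparkContext
futures = [executor.submit(async_action, i) for i in range(4)]

# Wait for every async action to finish BEFORE tearing down the executor.
# The Spark equivalent is calling get() on the JavaFutureAction returned
# by foreachAsync before invoking sparkContext.stop().
for f in futures:
    f.result()

executor.shutdown()  # safe now; nothing is still running
```

In the Java API the analogous fix is `rdd.foreachAsync(fn).get()` (or otherwise awaiting completion) before `sparkContext.stop()`.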
-1
votes
2 answers

nohup: ignoring input and appending output to 'nohup.out'

I'm receiving the following message when I try to run a spark-submit job in Cloudera: "nohup: ignoring input and appending output to 'nohup.out'". My spark-submit job doesn't seem to run. What could be causing this issue?
-2
votes
1 answer

How to run spark-submit from Apache Airflow

Can anyone help me schedule a Spark job in Apache Airflow? I am looking for an example script, please help me
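One common pattern is the SparkSubmitOperator from the Spark provider package (apache-airflow-providers-apache-spark). The DAG below is a hedged configuration sketch, not a drop-in script: the DAG id, application path, and schedule are hypothetical, and older Airflow versions spell the schedule argument `schedule_interval` rather than `schedule`:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import (
    SparkSubmitOperator,
)

with DAG(
    dag_id="daily_spark_job",           # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    submit = SparkSubmitOperator(
        task_id="run_wordcount",
        application="/opt/jobs/wordcount.py",  # hypothetical path
        conn_id="spark_default",  # Spark connection defined in Airflow
        conf={"spark.executor.memory": "2g"},
    )
```

The `conn_id` must point at a Spark connection configured in Airflow (master host, deploy mode, and so on); alternatively, a plain BashOperator can shell out to spark-submit directly.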
-2
votes
1 answer

How to check whether a Spark job has completed, through Unix

I have to run multiple Spark jobs one by one in sequence, so I am writing a shell script. One way is to check for a success file in the output folder to get the job status, but I want to know whether there is any other way to check the status of a spark-submit…
Kumar Harsh
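Besides looking for a _SUCCESS file, spark-submit itself exits with a nonzero status when the driver fails (in YARN cluster mode this holds as long as spark.yarn.submit.waitAppCompletion is left at its default of true), so a wrapper can chain jobs on the return code. A sketch in Python, with `sys.executable` used as a harmless stand-in for the real spark-submit commands:

```python
import subprocess
import sys

def run_job(cmd):
    """Run one job and return True if it exited with status 0."""
    completed = subprocess.run(cmd)
    return completed.returncode == 0

# Stand-ins for sequential spark-submit invocations; in a real script each
# entry would be e.g. ["spark-submit", "--master", "yarn", "job1.py"].
jobs = [
    [sys.executable, "-c", "print('job 1 ok')"],
    [sys.executable, "-c", "print('job 2 ok')"],
]

statuses = []
for cmd in jobs:
    ok = run_job(cmd)
    statuses.append(ok)
    if not ok:  # stop the chain as soon as one job fails
        break
```

In a plain shell script the same idea is simply `spark-submit job1.py && spark-submit job2.py`, relying on the exit codes the same way.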
-2
votes
1 answer

Spark Program running very slow on cluster

I am trying to run my PySpark job on a cluster with 2 worker nodes and 1 master (all with 16 GB RAM). I ran my job with the command below. spark-submit --master yarn --deploy-mode cluster --name "Pyspark" --num-executors 40 --executor-memory 2g…
Ironman
-3
votes
1 answer

Apache Beam job hangs up when submitted via spark-submit

I am just trying to execute the Apache Beam example code on a local Spark setup. I generated the source and built the package as described on this page, and submitted the jar using spark-submit as below: $ ~/spark/bin/spark-submit --class…
Sathish Jayaram
-4
votes
1 answer

How to cache spark streaming Dataset

I have a Spark Streaming Dataset which streams a directory of CSV files, so I have these questions: how do I cache the streaming Dataset, and how do I submit my Spark Streaming job to YARN so that it runs forever until manual…
-4
votes
1 answer

Why can't I put 'val' in function definition arguments? The error "':' expected but '}' found" pops up

I was trying to submit this code using spark-submit and got the following errors. I would also like to know how to call a function in Scala and how to start a function definition. I am calling the file_reading function and I got the below…
Fasty