21

When submitting a Spark Streaming program using spark-submit (YARN mode), it keeps polling the status and never exits.

Is there any option in spark-submit to exit after the submission?

===Why this troubles me===

The streaming program will run forever, and I don't need the status updates.

I can Ctrl+C to stop it if I start it manually, but I have lots of streaming contexts to start and I need to start them using a script.

I can put the spark-submit process in the background, but after lots of background Java processes are created, the corresponding user will not be able to run any other Java process because the JVM cannot create GC threads.

Peter Chan

3 Answers

103

I know this is an old question, but there's a way to do this now by setting --conf spark.yarn.submit.waitAppCompletion=false when you're using spark-submit. With this, the client will exit after successfully submitting the application.

In YARN cluster mode, controls whether the client waits to exit until the application completes. If set to true, the client process will stay alive reporting the application's status. Otherwise, the client process will exit after submission.

Also, you may need to set --deploy-mode to cluster

In cluster mode, the Spark driver runs inside an application master process which is managed by YARN on the cluster, and the client can go away after initiating the application.

More at https://spark.apache.org/docs/latest/running-on-yarn.html
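A minimal invocation combining both settings might look like this (the class name and JAR path are placeholders to replace with your own):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --conf spark.yarn.submit.waitAppCompletion=false \
      --class com.example.MyStreamingJob \
      /path/to/my-streaming-job.jar

With this, spark-submit returns as soon as YARN accepts the application, so a launcher script can submit many streaming jobs in a loop without leaving one polling JVM per job behind.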

Mateusz Dymczyk
    You deserve a medal. – Navarro Feb 02 '17 at 15:54
  • @Peter Chan, please update this as the accepted answer. – TayTay Mar 28 '18 at 15:05
  • The accepted answer requires the user to manually manage the submission process, and it still does not make the application master aware of the choice not to watch the application report. This answer uses a supported option. – xor007 May 14 '18 at 11:23
  • Documented here: https://spark.apache.org/docs/latest/running-on-yarn.html#spark-properties – flow2k Jul 06 '19 at 00:42
  • This is a good answer, but there is one issue: after configuring --conf spark.yarn.submit.waitAppCompletion=false, the client will exit as soon as the application is accepted. Is there any way to hold on until the job changes from the ACCEPTED state to RUNNING? – agaonsindhe Mar 15 '22 at 12:42
1

Interesting, I never thought about this issue. I'm not sure there is a clean way to do this, but I simply kill the submit process on the machine, and the YARN job continues to run until you stop it explicitly. So you can create a script that executes the spark-submit and then kills it. When you actually want to stop the job, use yarn application -kill. Dirty, but it works.
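A rough sketch of such a wrapper script (the class name, JAR path, and the 30-second grace period are placeholders to adapt; cluster deploy mode matters here, since in client mode killing the submitter would also kill the driver):

    #!/usr/bin/env bash
    # Submit in the background and remember the client's PID.
    spark-submit --master yarn --deploy-mode cluster \
      --class com.example.MyStreamingJob /path/to/my-streaming-job.jar \
      > submit.log 2>&1 &
    SUBMIT_PID=$!

    # Give YARN time to accept the application, then kill the polling client.
    sleep 30
    kill "$SUBMIT_PID"

    # To stop the job itself later, look up the application ID
    # (e.g. in submit.log or via `yarn application -list`) and run:
    #   yarn application -kill <applicationId>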

z-star
  • Thank you for answering. I guess this is the way to go for now. It would be a lot better, and I think the right way, if spark-submit provided an option to exit after submitting. – Peter Chan May 16 '16 at 00:32
-4

The shell command timeout TIME CMD will terminate CMD after TIME has elapsed.
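For example, with GNU coreutils timeout (60 seconds is an arbitrary grace period, and the class name and JAR path are placeholders; in cluster deploy mode the YARN application keeps running after the client dies):

    # Kill the spark-submit client after 60 seconds.
    timeout 60s spark-submit --master yarn --deploy-mode cluster \
      --class com.example.MyStreamingJob /path/to/my-streaming-job.jar

Note that timeout exits with status 124 when it kills the command, so a launcher script should not treat that as a submission failure.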

hustljian