
The code below successfully creates a Spark context when I submit it with spark-submit, and it runs fine.

When I kill the application under Running Applications in the Apache Spark web UI, the application state shows as killed, but "Test application" keeps printing on the screen even after the kill:

Application running in the Apache Spark web UI (screenshot).

Application killed using the "kill" button in the Spark web UI (screenshot).

Message still printing on screen after killing the application (screenshot).

I need a way to automatically kill the Python job when I kill its Spark context.

from pyspark import SparkConf
from pyspark import SparkContext

if __name__ == "__main__":
    conf = SparkConf().setAppName("TEST")
    conf.set("spark.scheduler.mode", "FAIR")
    sc = SparkContext(conf=conf)

    while True:
        print("Test application")
Siddeshwar
  • Are you running the Spark application on YARN? If yes, kill it from YARN using the command "yarn application -kill <app_id>". – Nikhil Suthar Jun 20 '19 at 12:15
  • I am submitting it manually from the terminal with **spark-submit test_spark.py**, and I can see the print output in the same terminal. – Siddeshwar Jun 20 '19 at 12:19

3 Answers


You can do it the old-fashioned way.

Run ps -ef and find the Java job's process id, then run kill -9 on it.

//Find all the java jobs
[stack_overflow@stack_overflow ~]$ ps -ef | grep SparkSubmit
stack_overflow  96747  96736 99 11:19 pts/15   00:01:55 /usr/bin/java -cp /opt/spark/conf/:/opt/spark/jars/* -Dscala.usejavacp=true -Xmx1g -Dderby.system.home=/home/stack_overflow/Spark/ org.apache.spark.deploy.SparkSubmit --conf spark.local.dir=/opt/spark/temp_land/spark-temp --conf spark.driver.extraJavaOptions=-Dderby.system.home=/home/stack_overflow/ --class org.apache.spark.repl.Main --name Spark shell spark-shell
stack_overflow  97410  14952  0 11:20 pts/15   00:00:00 grep --color=auto SparkSubmit
//96747 is the Spark job I forced to become unresponsive
//97410 is the Base Spark Account don't delete
//Run the kill command on the job; only works if you have permissions on that job
[stack_overflow@stack_overflow ~]$ kill -9 96747
//The job is now dead and gone
[stack_overflow@stack_overflow ~]$ ps -ef | grep SparkSubmit
stack_overflow  96190  14952  0 11:17 pts/15   00:00:00 grep --color=auto SparkSubmit
afeldman
  • Thanks for the response. Can I handle this in the Python program? Reason: in production I may submit multiple Spark jobs, each with its own SparkContext. I expect that killing a SparkContext session in the Spark UI should kill the whole application's execution. – Siddeshwar Jun 25 '19 at 04:40
  • Let's say I spin up 10 Spark jobs, each with its own SparkContext; then each should have its own PID or Job_ID to kill. For your question: no. You can kill directly from Spark, but every once in a while the job goes aberrant, as you have seen, and doesn't die. However, if you are interested, you can set up a watcher to kill on command: (1) at a high level, have a program watch a folder, checking every 2 seconds and sleeping in between; (2) when you create your SparkContext, give it a unique App_Name; (3) when you want it killed, place a file named after the App_Name in the watched folder. – afeldman Jun 25 '19 at 15:41
  • The watcher parses the App_Name, uses it in the grep above, identifies the PID, and kills it. Sorry, there really isn't an easy way to kill these jobs dynamically in Spark. – afeldman Jun 25 '19 at 15:42
  • Thanks for the details. I posted the code that worked for me below. – Siddeshwar Jun 26 '19 at 05:06
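A minimal sketch of the watcher idea described in the comments above, assuming the driver is launched through spark-submit (so it shows up under SparkSubmit in ps -ef), that the application name is also visible on that command line (for example, passed via --name), and that a kill request is signalled by dropping a file named after the App_Name into a watch folder. The folder path and helper names here are illustrative, not part of the original answer:

import os
import signal
import subprocess
import time

# Hypothetical folder the watcher polls for kill requests (file name = App_Name).
WATCH_DIR = "/tmp/spark_kill_requests"

def find_spark_submit_pid(app_name):
    """Return the PID of the SparkSubmit process whose command line mentions app_name, if any."""
    out = subprocess.check_output(["ps", "-ef"], text=True)
    for line in out.splitlines():
        if "SparkSubmit" in line and app_name in line:
            return int(line.split()[1])  # second column of `ps -ef` is the PID
    return None

if __name__ == "__main__":
    os.makedirs(WATCH_DIR, exist_ok=True)
    while True:
        for fname in os.listdir(WATCH_DIR):
            pid = find_spark_submit_pid(fname)  # file name is assumed to be the App_Name
            if pid is not None:
                os.kill(pid, signal.SIGKILL)    # same effect as `kill -9 <pid>`
            os.remove(os.path.join(WATCH_DIR, fname))
        time.sleep(2)  # check every 2 seconds, as suggested in the comment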

You can open another session and list the running YARN applications to see whether your Spark application is still alive:

yarn application -list

Find your application's id in the output, then kill the application if it is still running:

yarn application -kill <app_id>
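If you would rather drive this from Python than type the commands by hand, here is a rough sketch along the same lines. It assumes the yarn CLI is on the PATH and that the application name ("TEST" in the question's code) is distinctive enough to match a single row of the yarn application -list output; the helper name and parsing are illustrative:

import subprocess

def kill_yarn_app_by_name(app_name):
    """Find a running YARN application whose listing mentions app_name and kill it."""
    listing = subprocess.check_output(["yarn", "application", "-list"], text=True)
    for line in listing.splitlines():
        # Data rows start with the application id (application_<timestamp>_<sequence>).
        if line.startswith("application_") and app_name in line:
            app_id = line.split()[0]
            subprocess.run(["yarn", "application", "-kill", app_id], check=True)
            return app_id
    return None

# e.g. kill_yarn_app_by_name("TEST")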
Ajay Ahuja

I found a way to solve my issue with the code below. Thanks for all your responses.

from pyspark import SparkConf
from pyspark import SparkContext

if __name__ == "__main__":
    conf = SparkConf().setAppName("TEST")
    conf.set("spark.scheduler.mode", "FAIR")
    sc = SparkContext(conf=conf)

    while True:
        # Exit once the underlying SparkContext has been stopped
        # (e.g. after the application is killed from the Spark web UI).
        if sc._jsc.sc().isStopped():
            break
        print("Test application")
Siddeshwar