
In Hadoop YARN, containers exit when they catch a SIGTERM signal. How can I detect that a YARN container is about to exit and run some custom code at that point? How do I inject such a hook into the YARN framework?

I am looking for a solution especially for Spark on YARN, but ideally also a general solution applicable to other services that run on YARN (Hive on Tez, MapReduce).
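For background: the NodeManager delivers SIGTERM to the container process and, after a short grace period (controlled by `yarn.nodemanager.sleep-delay-before-sigkill.ms`), follows up with SIGKILL. In principle, any code running inside the container can trap SIGTERM itself. A minimal, framework-agnostic sketch (`flush_logs` is a hypothetical placeholder for whatever cleanup you need):

```python
import signal
import sys

def flush_logs():
    # Hypothetical placeholder: aggregate buffered log lines and
    # push them to an external store before the process dies.
    print("flushing logs")

def on_sigterm(signum, frame):
    # YARN's NodeManager sends SIGTERM first, then SIGKILL after a
    # short grace period, so this handler must finish quickly.
    flush_logs()
    sys.exit(143)  # conventional exit code for SIGTERM (128 + 15)

signal.signal(signal.SIGTERM, on_sigterm)
```

Note this only helps if your code owns the process's signal handling; JVM-based executors (Spark, Tez) typically install their own shutdown machinery, which is why the framework-level hooks discussed below are usually the better route.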

FRG96
  • What version of hadoop? 3.x? – Matt Andruff Oct 20 '21 at 17:34
  • Could you give a little more description of what you're trying to clean? It might help to give context, as there are lots of ways to clean... is it files? processes? the count of something? – Matt Andruff Oct 20 '21 at 17:52
  • There are lots of internals to Spark that help you put things in places so they'll get cleaned up... that's why I'm asking if that's what you need help with. You can catch exceptions... For sure it's frustrating when an executor gets killed, but there's usually a reason. – Matt Andruff Oct 20 '21 at 17:58
  • I have a use-case where a custom Spark SQL UDF accumulates logs in memory as it is called across Spark tasks running inside the Executor (YARN container) process. Once all tasks are complete and the Executor container is about to exit, I want to aggregate and flush the accumulated logs to an external store. Since a Spark SQL UDF doesn't provide lifecycle methods like close() or cleanup(), I am checking whether I can hook into the YARN container's lifecycle to run my flush call. – FRG96 Oct 20 '21 at 18:52
  • I'd suggest just flushing the log to disk every 'x' calls. There aren't good tools for what you want to do. – Matt Andruff Oct 20 '21 at 20:00
  • Just an update: I added my log aggregation and flush call in a custom Spark ExecutorPlugin.shutdown() method, which is invoked when the Executor YARN container is shutting down. It seems to work fine with the Spark service on a CDP Private Cloud Base 7.1 cluster. The shutdown method is present in both Spark 2 and 3, so it works for my case. https://spark.apache.org/docs/2.4.4/api/java/org/apache/spark/ExecutorPlugin.html https://spark.apache.org/docs/3.1.2/api/java/org/apache/spark/api/plugin/ExecutorPlugin.html This solution is specific to Spark, though, not general YARN applications. – FRG96 Oct 22 '21 at 13:22
  • Thanks for posting, this is a great answer! I appreciate it. – Matt Andruff Oct 22 '21 at 21:07
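The ExecutorPlugin route mentioned in the comments is JVM-only, but its shape boils down to a shutdown() callback that Spark invokes as the executor process exits. A self-contained sketch of that shape (the class below is a simplified stand-in for illustration, not the real org.apache.spark.api.plugin.ExecutorPlugin interface, and the buffering logic is hypothetical):

```python
class LogFlushPlugin:
    """Illustrative stand-in for Spark's ExecutorPlugin lifecycle:
    init() runs when the executor starts, shutdown() when it exits."""

    def __init__(self):
        self.buffer = []
        self.flushed = []

    def init(self):
        # The real interface receives a plugin context and config here.
        self.buffer.clear()

    def log(self, line):
        # Called by the UDF during task execution; only accumulates.
        self.buffer.append(line)

    def shutdown(self):
        # Invoked once as the executor (YARN container) shuts down:
        # aggregate and flush everything buffered in this process.
        self.flushed = list(self.buffer)
        self.buffer.clear()
```

The key design point is that the UDF itself stays lifecycle-free; all the flush logic lives in the plugin, which the framework guarantees to call at shutdown.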

3 Answers


If we are talking about cleaning up the node, consider these NodeManager properties:

yarn.nodemanager.localizer.cache.target-size-mb
yarn.nodemanager.localizer.cache.cleanup.interval-ms

Good explanation of those properties here.
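For example, in yarn-site.xml (the values here are illustrative, not recommendations):

```xml
<property>
  <name>yarn.nodemanager.localizer.cache.target-size-mb</name>
  <!-- keep the localized-resource cache under ~10 GB -->
  <value>10240</value>
</property>
<property>
  <name>yarn.nodemanager.localizer.cache.cleanup.interval-ms</name>
  <!-- run the cleanup scan every 10 minutes -->
  <value>600000</value>
</property>
```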

Matt Andruff
  • Actually, I meant any custom code that should run when the YARN container exits. I have edited my question. – FRG96 Oct 20 '21 at 17:26

For true freedom from SIGTERM, you may want to dig into the code of YARN itself to find out how you could hijack or extend the YARN container executor and bend it to your will. This would mean compiling and deploying your own build to the cluster, but the Apache Bigtop project helps with that sort of thing.

Matt Andruff

If... you aren't going to log a lot and only want to log a little... you can abuse accumulators to do your bidding and ship information to the driver. Here's a great explanation/example. It's not made for logging, but if you use it really sparingly, say for a handful of items, it will do the job. Accumulators are most useful for counting things, and they will report the count at least once. (If an executor dies and re-runs, it could count something twice, so be wary.) (They're a holdover from mappers/reducers.)

A better abuse of string accumulators: you could use one to post the location of your log file, so you can retrieve the file later.
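Both ideas can be sketched together. The Accumulator class below is a simplified stand-in for the real thing (in PySpark you would create one with sc.accumulator(...)), and the task and log path are hypothetical:

```python
class Accumulator:
    """Simplified stand-in for a Spark accumulator: tasks may only
    add(); the merged value is read back on the driver side."""

    def __init__(self, initial):
        self._value = initial

    def add(self, term):
        self._value += term

    @property
    def value(self):
        return self._value

def run_task(rows, counter, log_paths):
    # Simulated executor task: counts rows, then "posts" the location
    # of its local log file back to the driver via a string accumulator.
    # Note: if a real task is retried, both adds happen again, so the
    # driver may see duplicates.
    for _ in rows:
        counter.add(1)
    log_paths.add("/tmp/executor-42.log\n")  # hypothetical path

counter = Accumulator(0)
log_paths = Accumulator("")
run_task(range(5), counter, log_paths)
```

After the job finishes, the driver reads counter.value and splits log_paths.value to find which files to fetch; the sparing-use caveat above applies, since every add() is shipped back to the driver.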

Matt Andruff