
We have a Spark cluster of 50 nodes, with YARN as the resource manager.

The cluster runs HDP version 2.6.4 and is managed by the Ambari platform.

On the YARN NodeManager host yarn-machine34, we put the machine into the decommissioned state, and did the same for its DataNode.

The decommission completed successfully, and we can also see that the machine is listed in the files:

/etc/hadoop/conf/dfs.exclude
/etc/hadoop/conf/yarn.exclude
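For reference, both exclude files are plain text with one hostname per line. A minimal sketch of what ours contain (note the entry must match the exact name the daemon registered with, so FQDN vs. short hostname matters):

```text
# /etc/hadoop/conf/dfs.exclude and /etc/hadoop/conf/yarn.exclude
# one hostname per line, matching the name the DataNode/NodeManager registered with
yarn-machine34
```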

But in spite of that, we still see Spark executors running on yarn-machine34:


So how can that be?

As I understand it, the decommissioned state should prevent any Spark application containers/executors from running on the node.

So what else can we do about it?
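The usual checks (a sketch using the Hadoop 2.x CLI; Ambari runs the equivalent operations under the hood, so the commands below assume shell access to the ResourceManager/NameNode hosts):

```shell
# Tell the ResourceManager and NameNode to re-read the exclude files
yarn rmadmin -refreshNodes
hdfs dfsadmin -refreshNodes

# Verify the node's state as YARN sees it; a node that is still
# DECOMMISSIONING keeps its already-running containers until they
# finish (or until a decommission timeout is reached)
yarn node -list -all | grep yarn-machine34

# Check whether the Spark executors predate the decommission:
# YARN only stops scheduling NEW containers on an excluded node,
# it does not necessarily kill containers that were already running
yarn application -list
```

If the executors were launched before the node went into the exclude list, seeing them still running is expected behavior rather than a failed decommission.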

Judy
  • _"Graceful Decommission of YARN Nodes is the mechanism to decommission NMs while minimize the impact to running applications. Once a node is in DECOMMISSIONING state, **RM won’t schedule new containers on it and will wait for running containers and applications to complete** (or until decommissioning timeout exceeded) before transition the node into DECOMMISSIONED."_ What timeout did you set? – mazaneicha Jan 25 '22 at 15:58
  • Sorry for the question, but what timeout did you set for what? – Judy Jan 25 '22 at 16:25
  • https://hadoop.apache.org/docs/stable/hadoop-yarn/hadoop-yarn-site/GracefulDecommission.html#per-node-decommission-timeout-support – mazaneicha Jan 25 '22 at 16:28
  • From the doc I see `yarn rmadmin -refreshNodes [-g [timeout in seconds] -client|server]`, but we are using Ambari, not the CLI, and in Ambari I can't find this value – Judy Jan 25 '22 at 16:39
  • What evidence are you using to show they are running on that currently? – Matt Andruff Jan 26 '22 at 18:55
  • I don't think you need to worry about graceful decommission because 2.6.4 Ambari should be running hadoop 2.6 and the doc you pointed to is hadoop 3.3. So it might be possible graceful decommission is present but I couldn't find it in the documents for hadoop 2.6 – Matt Andruff Jan 26 '22 at 19:02

0 Answers