When running Spark on YARN, I understand that jobs may exceed their resource quota during quiet times but may be preempted when other users claim their own quota.
This seems fair. However, I occasionally see that an AM container has been preempted, causing my entire app to restart and losing hours' worth of work.
This seems unfair! Could anyone explain the conditions under which AM container preemption occurs and how to prevent it? I would rather my app sit idle for a while than restart.