
I manage a cluster of several machines that is shared with colleagues; some use Spark and some use MapReduce.

Spark users usually open a context and keep it open for days or weeks, while MR jobs start and finish.

The problem is that MR jobs often get stuck because:

  • After X% of the map phase completes, the job starts launching reducers (a sketch of the setting that appears to control this follows the list).
  • Eventually a lot of reducers are running and only 5-15 maps are left waiting to execute.
  • At that point there is not enough memory to start a new map, and the reducers cannot get past 33% because the maps have not finished producing their output, so the job deadlocks.
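For context, the early-reducer behavior in the first bullet seems to be governed by mapreduce.job.reduce.slowstart.completedmaps (default 0.05). Below is a minimal sketch of raising it per job so no reducer launches until every map has finished; the class and job names are just placeholders:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class SlowstartExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Default 0.05 lets reducers start after only 5% of maps finish.
            // 1.0 keeps reducers from occupying containers while maps
            // are still waiting for resources.
            conf.setFloat("mapreduce.job.reduce.slowstart.completedmaps", 1.0f);
            Job job = Job.getInstance(conf, "my-mr-job");
            // ... set mapper/reducer classes and input/output paths as usual ...
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

If I understand the docs correctly, the same property can also be set cluster-wide in mapred-site.xml, though that only delays the deadlock rather than reclaiming memory from long-lived Spark contexts.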

The only way I have found to solve this is to kill one of the Spark contexts, freeing enough memory for the maps to finish.

Is there a way to configure YARN to avoid this problem?

Thanks.

