
This is an odd one. We recently started migrating from an older CDH 4.2.1 cluster running MRv1 to a CM5-managed CDH 5.2.0 cluster running MRv2 (YARN) and have run into some rather unusual problems. The workflow processes roughly 1.2 TB of data; on the CDH 4.2.1 cluster the processing queries use no reducers and each individual map output is stored as a single file (the job takes about 35 minutes).

On the CDH 5.2.0 cluster the workflow fails most of the time (after 3+ times the length of time it normally takes) and always attempts to combine the output of all the mappers into a single file. We think this is where it is falling over.

All the error logs point to the shuffle-and-sort phase failing with an out-of-heap-space error.

We have tried using both parameters that specify the number of reducers (mapred.reduce.tasks = 0 and mapreduce.job.reduces = 0), but this has no effect.
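For reference, what we ran in the Hive session looks roughly like this (a minimal sketch; the two parameter names are the real MRv1/MRv2 ones, the comments are just annotations):

    SET mapred.reduce.tasks=0;       -- legacy MRv1 parameter name
    SET mapreduce.job.reduces=0;     -- MRv2 (YARN) equivalent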

This is a HiveQL query using a Python transform to process data fields, and the exact code, queries, tables, and workflow have been migrated.
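For illustration, the query shape is roughly the following (a simplified sketch, not our real code: the table names, columns, script name, and the dynamic-partition column are all placeholders):

    ADD FILE process_fields.py;

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    INSERT OVERWRITE TABLE output_table PARTITION (dt)
    SELECT TRANSFORM (col_a, col_b, col_c)
           USING 'python process_fields.py'
           AS (col_a, col_b, col_c, dt)
    FROM input_table;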

Has anyone else run into this problem or could anyone shed some light on it?

Thanks,

Anthony

  • Managed to eventually fix it by setting hive.optimize.sort.dynamic.partition to false... Seems this parameter is set to true by default in CDH5. – Anthony Apr 13 '15 at 12:35
  • Actually CDH4 doesn't support this parameter at all... – Anthony Apr 13 '15 at 15:28
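
For anyone hitting the same issue, the workaround from the comments above boils down to a single session setting (a minimal sketch; the annotation reflects my understanding of what the optimization does):

    -- On by default in CDH 5.2 (Hive 0.13). When enabled, it routes
    -- dynamic-partition inserts through a reduce stage that sorts rows by
    -- partition key, which is where our shuffle/sort heap errors occurred.
    SET hive.optimize.sort.dynamic.partition=false;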

0 Answers