I was running a Hadoop MapReduce job on Amazon EMR 5.5.2, which uses Hadoop 2.7.3.
I recently upgraded EMR to 5.12.1, which uses Hadoop 2.8.0.
For the same input load, the job runs much slower on the new cluster.
I have not been able to find the reason. I may need to tune some performance parameters.
The MapReduce job counters are below. Looking at them, does anybody have any insight into which performance parameters might be misconfigured?
Job Counters
File System Counters
    FILE: Number of bytes read=1087
    FILE: Number of bytes written=24787084
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=15840
    HDFS: Number of bytes written=0
    HDFS: Number of read operations=132
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=0
    S3N: Number of bytes read=0
    S3N: Number of bytes written=4315
    S3N: Number of read operations=0
    S3N: Number of large read operations=0
    S3N: Number of write operations=0
Job Counters
    Launched map tasks=132
    Launched reduce tasks=7
    Other local map tasks=132
    Total time spent by all maps in occupied slots (ms)=1576936320
    Total time spent by all reduces in occupied slots (ms)=26894720
    Total time spent by all map tasks (ms)=2463963
    Total time spent by all reduce tasks (ms)=42023
    Total vcore-milliseconds taken by all map tasks=2463963
    Total vcore-milliseconds taken by all reduce tasks=42023
    Total megabyte-milliseconds taken by all map tasks=50461962240
    Total megabyte-milliseconds taken by all reduce tasks=860631040
Map-Reduce Framework
    Map input records=12523
    Map output records=2
    Map output bytes=3236
    Map output materialized bytes=15935
    Input split bytes=15840
    Combine input records=0
    Combine output records=0
    Reduce input groups=1
    Reduce shuffle bytes=15935
    Reduce input records=2
    Reduce output records=8
    Spilled Records=4
    Shuffled Maps =924
    Failed Shuffles=0
    Merged Map outputs=924
    GC time elapsed (ms)=64327
    CPU time spent (ms)=2737480
    Physical memory (bytes) snapshot=166237839360
    Virtual memory (bytes) snapshot=2760473792512
    Total committed heap usage (bytes)=187218526208
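
In case it helps anyone reading the counters, here is a rough Python sketch that derives a few per-task figures from them. The dictionary keys and the `derive` function are names I made up for illustration, and the values are copied by hand from the listing above. It assumes that megabyte-milliseconds divided by task milliseconds approximates the memory allocated per container, which is my reading of the counters rather than something I have confirmed against the cluster configuration.

```python
# Values copied by hand from the counter listing above.
# The key names are illustrative, not Hadoop counter identifiers.
counters = {
    "launched_maps": 132,
    "launched_reduces": 7,
    "map_task_ms": 2463963,
    "reduce_task_ms": 42023,
    "map_mb_ms": 50461962240,
    "reduce_mb_ms": 860631040,
    "map_input_records": 12523,
    "shuffled_maps": 924,
    "gc_ms": 64327,
}


def derive(c):
    """Compute simple per-task ratios from the raw counter totals."""
    total_task_ms = c["map_task_ms"] + c["reduce_task_ms"]
    return {
        # Average wall-clock time per task.
        "avg_map_ms": c["map_task_ms"] / c["launched_maps"],
        "avg_reduce_ms": c["reduce_task_ms"] / c["launched_reduces"],
        # Assumption: MB-ms / task-ms approximates container memory in MB.
        "map_container_mb": c["map_mb_ms"] / c["map_task_ms"],
        "reduce_container_mb": c["reduce_mb_ms"] / c["reduce_task_ms"],
        # How little input each map task actually processes.
        "records_per_map": c["map_input_records"] / c["launched_maps"],
        # Map outputs fetched by each reducer.
        "shuffles_per_reduce": c["shuffled_maps"] / c["launched_reduces"],
        # Share of total task time reported as garbage collection.
        "gc_fraction": c["gc_ms"] / total_task_ms,
    }


derived = derive(counters)
for name, value in derived.items():
    print(f"{name}: {value:.2f}")
```

With these inputs, the container estimate comes out to 20480 MB for both map and reduce tasks, each map task handles fewer than 100 input records on average, and each of the 7 reducers fetches output from all 132 maps. Whether those figures point to the slowdown is exactly what I am unsure about.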