After my MapReduce job finishes, I get a whole lot of Counter information:

        File System Counters
                FILE: Number of bytes read=4386096368
                FILE: Number of bytes written=8805370803
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=54583718086
                HDFS: Number of bytes written=4382090874
                HDFS: Number of read operations=1479
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=369
                Launched reduce tasks=1
                Data-local map tasks=369
                Total time spent by all maps in occupied slots (ms)=34288552
                Total time spent by all reduces in occupied slots (ms)=232084
                Total time spent by all map tasks (ms)=8572138
                Total time spent by all reduce tasks (ms)=58021
                Total vcore-seconds taken by all map tasks=8572138
                Total vcore-seconds taken by all reduce tasks=58021
                Total megabyte-seconds taken by all map tasks=35111477248
                Total megabyte-seconds taken by all reduce tasks=237654016
        Map-Reduce Framework
                Map input records=14753874
                Map output records=666776
                Map output bytes=4383426830
                Map output materialized bytes=4386098552
                Input split bytes=47970
                Combine input records=0
                Combine output records=0
                Reduce input groups=1
                Reduce shuffle bytes=4386098552
                Reduce input records=666776
                Reduce output records=666776
                Spilled Records=1333552
                Shuffled Maps =369
                Failed Shuffles=0
                Merged Map outputs=369
                GC time elapsed (ms)=1121584
                CPU time spent (ms)=23707900
                Physical memory (bytes) snapshot=152915259392
                Virtual memory (bytes) snapshot=2370755190784
                Total committed heap usage (bytes)=126644912128
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=49449743227
        File Output Format Counters
                Bytes Written=4382090874

Where can I find an explanation of what each of these fields means? Some of them are fairly obvious (Number of bytes read), but others are more ambiguous (Total time spent by all maps in occupied slots vs. Total time spent by all map tasks).

I found a list of all the default counters, but I can't seem to find an explanation or description of them.

I'm fairly surprised that I can't seem to easily find documentation about this output. Can anyone provide a link or an explanation?

Mogsdad
Ryan Marcus
  • See this link for some info on this: http://stackoverflow.com/questions/25482426/explanation-for-hadoop-mapreduce-console-output – AST Oct 20 '15 at 18:32
  • The explanation for these counters is available in Chapter 8 (MapReduce Features) of the book Hadoop: The Definitive Guide, 3rd Edition, by Tom White. Hope this helps. Raj – Raju Oct 25 '15 at 12:30

1 Answer

Chapter 8 of Hadoop: The Definitive Guide (full PDF in link from Washington State University) covers the MapReduce counters in detail. The discussion starts on page 225, and the counters are listed in Table 8-1. A more up-to-date (4th) edition of this resource is available at Safari Books Online (you will need to log in first).
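
As a rough sanity check on the two counters the question calls ambiguous, the numbers in the posted output can be compared directly. The sketch below is plain Python arithmetic on values copied from the question; the dictionary keys and the `ratio` helper are made-up names for illustration. The interpretation in the comments (occupied-slot time being task time scaled by how many minimum-size allocations a container takes up, and the megabyte counter being task time multiplied by container memory) is an assumption that happens to fit these ratios, not something confirmed by the question or by the book.

```python
# Values copied verbatim from the counter dump in the question.
counters = {
    "slots_ms_maps": 34288552,       # Total time spent by all maps in occupied slots (ms)
    "ms_maps": 8572138,              # Total time spent by all map tasks (ms)
    "mb_maps": 35111477248,          # Total megabyte-seconds taken by all map tasks
    "slots_ms_reduces": 232084,      # Total time spent by all reduces in occupied slots (ms)
    "ms_reduces": 58021,             # Total time spent by all reduce tasks (ms)
    "mb_reduces": 237654016,         # Total megabyte-seconds taken by all reduce tasks
    "map_output_records": 666776,    # Map output records
    "reduce_input_records": 666776,  # Reduce input records
    "spilled_records": 1333552,      # Spilled Records
}


def ratio(numerator, denominator):
    """Return the integer ratio of two counters, or None if it is not exact."""
    quotient, remainder = divmod(counters[numerator], counters[denominator])
    return quotient if remainder == 0 else None


# Assumption: "occupied slots" time = task time * (slots occupied per container).
slot_factor_maps = ratio("slots_ms_maps", "ms_maps")
slot_factor_reduces = ratio("slots_ms_reduces", "ms_reduces")

# Assumption: the megabyte counter = task time * container memory in MB.
mb_per_map = ratio("mb_maps", "ms_maps")
mb_per_reduce = ratio("mb_reduces", "ms_reduces")

# With no combiner, every map output record reaches the reducer unchanged.
records_match = counters["map_output_records"] == counters["reduce_input_records"]

# Spilled records per map output record (map side and reduce side together).
spill_factor = ratio("spilled_records", "map_output_records")

print("slot factor (maps, reduces):", slot_factor_maps, slot_factor_reduces)
print("MB per task (maps, reduces):", mb_per_map, mb_per_reduce)
print("map output == reduce input:", records_match)
print("spills per record:", spill_factor)
```

For this job the occupied-slot counters are exactly 4 times the task-time counters, and the megabyte counters are exactly 4096 times the task-time counters. That would fit containers of 4096 MB measured against a 1024 MB minimum allocation, though the actual cluster settings are not given in the question.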