
For a while now I've been going through the log4j logs trying to determine why my Hadoop job is crashing.

Essentially, the job issues a command on the underlying machine and collects the output of that command. At the moment all of these steps take place in a map task (later I'll try to reduce over the sum of those individual outputs).

The behaviour I'm experiencing is this: up to a certain number of outputs read into the BufferedReader (say, 28 of them) everything works fine and the job finishes almost immediately. However, when I increase that number to 29, the map task hangs at 67% completion; Hadoop retries the attempt, each time stopping at 67%, until the job finally terminates itself for lack of progress.

From the NameNode where the job is issued we can see the following output:

17/10/09 15:19:29 WARN mapreduce.JobResourceUploader: Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
17/10/09 15:19:29 INFO input.FileInputFormat: Total input files to process : 1
17/10/09 15:19:30 INFO mapreduce.JobSubmitter: number of splits:1
17/10/09 15:19:30 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1507562353923_0001
17/10/09 15:19:30 INFO impl.YarnClientImpl: Submitted application application_1507562353923_0001
17/10/09 15:19:30 INFO mapreduce.Job: The url to track the job: http://master:8088/proxy/application_1507562353923_0001/
17/10/09 15:19:30 INFO mapreduce.Job: Running job: job_1507562353923_0001
17/10/09 15:19:36 INFO mapreduce.Job: Job job_1507562353923_0001 running in uber mode : false
17/10/09 15:19:36 INFO mapreduce.Job:  map 0% reduce 0%
17/10/09 15:19:53 INFO mapreduce.Job:  map 67% reduce 0%
17/10/09 15:30:05 INFO mapreduce.Job: Task Id : attempt_1507562353923_0001_m_000000_0, Status : FAILED
AttemptID:attempt_1507562353923_0001_m_000000_0 Timed out after 600 secs
Sent signal OUTPUT_THREAD_DUMP (SIGQUIT) to pid 6230 as user ubuntu for container container_1507562353923_0001_01_000002, result=success
Container killed by the ApplicationMaster.
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143

17/10/09 15:30:06 INFO mapreduce.Job:  map 0% reduce 0%
17/10/09 15:30:26 INFO mapreduce.Job:  map 67% reduce 0%
17/10/09 15:40:36 INFO mapreduce.Job: Task Id : attempt_1507562353923_0001_m_000000_1, Status : FAILED
AttemptID:attempt_1507562353923_0001_m_000000_1 Timed out after 600 secs
17/10/09 15:40:37 INFO mapreduce.Job:  map 0% reduce 0%
17/10/09 15:40:52 INFO mapreduce.Job:  map 67% reduce 0%
17/10/09 15:51:05 INFO mapreduce.Job: Task Id : attempt_1507562353923_0001_m_000000_2, Status : FAILED
AttemptID:attempt_1507562353923_0001_m_000000_2 Timed out after 600 secs
17/10/09 15:51:06 INFO mapreduce.Job:  map 0% reduce 0%
17/10/09 15:51:24 INFO mapreduce.Job:  map 67% reduce 0%
17/10/09 16:01:37 INFO mapreduce.Job:  map 100% reduce 100%
17/10/09 16:01:37 INFO mapreduce.Job: Job job_1507562353923_0001 failed with state FAILED due to: Task failed task_1507562353923_0001_m_000000
Job failed as tasks failed. failedMaps:1 failedReduces:0

17/10/09 16:01:37 INFO mapreduce.Job: Counters: 13
    Job Counters 
        Failed map tasks=4
        Killed reduce tasks=1
        Launched map tasks=4
        Other local map tasks=3
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=5025782
        Total time spent by all reduces in occupied slots (ms)=0
        Total time spent by all map tasks (ms)=2512891
        Total time spent by all reduce tasks (ms)=0
        Total vcore-milliseconds taken by all map tasks=2512891
        Total vcore-milliseconds taken by all reduce tasks=0
        Total megabyte-milliseconds taken by all map tasks=10292801536
        Total megabyte-milliseconds taken by all reduce tasks=0

To determine the root cause of this problem I've spent many hours meticulously poring over the log4j output logs, but they don't turn up anything conclusive; for instance, they look almost identical for successful and unsuccessful jobs, with the minor caveat that unsuccessful jobs terminate abruptly, like so:

[screenshot: the log4j output of an unsuccessful attempt, cutting off abruptly]

The most interesting part seems to be here (taken from one of the previous attempts; Hadoop always makes three tries at completing the job):

2017-10-09 15:40:35,821 WARN [main] OpcodeCount$TokenizerMapper: Code: opAdd, Time: 214
2017-10-09 15:40:35,821 WARN [main] OpcodeCount$TokenizerMapper: Code: opAdd, Time: 450
2017-10-09 15:40:35,821 WARN [main] OpcodeCount$TokenizerMapper: Code: opAdd, Time: 217
2017-10-09 15:40:35,821 WARN [main] OpcodeCount$TokenizerMapper: Code: opStop, Time: 165
2017-10-09 15:40:35,851 INFO [main] org.apache.hadoop.mapred.MapTask: Starting flush of map output
2017-10-09 15:40:35,851 INFO [main] org.apache.hadoop.mapred.MapTask: Spilling map output
2017-10-09 15:40:35,851 INFO [main] org.apache.hadoop.mapred.MapTask: bufstart = 0; bufend = 1977; bufvoid = 104857600
2017-10-09 15:40:35,851 INFO [main] org.apache.hadoop.mapred.MapTask: kvstart = 26214396(104857584); kvend = 26214396(104857584); length = 1/6553600
(&container_1507562353923_0001_01_000005???stderr0stdout105912017-10-09 16:01:35
Full thread dump OpenJDK 64-Bit Server VM (25.131-b11 mixed mode):

The word "Full thread" lead me to believe that my MapOutputBuffer was filling up, as described in this article, I tried the recommended to solution of increasing the size in my configuration files, i.e. increasing mapreduce.task.io.sort.mb, to 1000, and mapreduce.map.sort.spill.percent to 0.99, but it had no positive effect.

The next thing I tried was to change the program so that, instead of emitting the debug/sanity-check output through log4j, it writes that output to a file, as described here, but that also didn't work.
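
For concreteness, writing that sanity-check output straight to HDFS from inside the mapper can look something like the sketch below (the debug path and filename are arbitrary choices; it reuses the context and pairList of the mapper shown further down):

    import java.io.PrintWriter;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // inside the mapper, after pairList has been populated:
    FileSystem fs = FileSystem.get(context.getConfiguration());
    // one debug file per task attempt so retries don't overwrite each other
    Path debugPath = new Path("/tmp/opcode-debug-" + context.getTaskAttemptID() + ".txt");
    try (PrintWriter out = new PrintWriter(fs.create(debugPath, true))) {
        for (Map.Entry<String, Integer> entry : pairList) {
            out.println("Code: " + entry.getKey() + ", Time: " + entry.getValue());
        }
    }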


At the moment my ideas of what could be wrong with this program are as follows:

1) The MapOutputBuffer is filling up, causing the program to stall.

Something I should mention: it's important for one file to execute as an atomic map task. If the command contained in the file gets broken up into smaller component pieces, the operation it issues to the underlying OS won't make sense. Therefore:

2) The input file is being split up inside HDFS and not read as one sequential unit, in the correct order (see the input-format sketch after this list).

3) The command is being broken up between containers, so that part of the command is issued in one container while another container handles the rest, causing the issue described above.
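
One way to rule possibilities 2) and 3) in or out would be to force the whole input file into a single split, so it can only ever be handled by one mapper; a minimal sketch using the standard isSplitable override (the class name is mine, not an existing Hadoop class):

    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.JobContext;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

    // Input format that refuses to split files: each input file goes to exactly one mapper.
    public class WholeFileTextInputFormat extends TextInputFormat {
        @Override
        protected boolean isSplitable(JobContext context, Path file) {
            return false;
        }
    }

This would be wired in with job.setInputFormatClass(WholeFileTextInputFormat.class) in the driver. The submission log above already reports number of splits:1, which suggests splitting isn't happening, but this would make it explicit.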


Since the logs have been inconclusive, my next goal is to change my code in some way that lets me isolate, test for, and eliminate each of these possibilities. Any ideas on how to do that efficiently, or any comments, insights, or remarks about anything else I've written here, would be much appreciated.
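
For example, one cheap check I can think of is to bump custom counters at each stage of the mapper and read them off the task attempt in the ResourceManager web UI, to see how far the single map task actually gets before it stalls; a sketch (the group and counter names are arbitrary):

    // purely instrumentation, dropped into TokenizerMapper.map():
    context.getCounter("DEBUG", "TOKENS_SEEN").increment(1);       // once per token handled
    context.getCounter("DEBUG", "COMMANDS_LAUNCHED").increment(1); // right after exec()
    context.getCounter("DEBUG", "STDOUT_LINES_READ").increment(1); // per line read from stdInput
    context.getCounter("DEBUG", "RECORDS_WRITTEN").increment(1);   // after context.write()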

Here's how the map job is written at the moment:

  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable>{

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    // declare logger
    private final Logger LOG = org.apache.log4j.Logger.getLogger(this.getClass());

    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {

      // output data struct
      List<Map.Entry<String,Integer>> pairList = new ArrayList<>();

      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
          try {
              // command execution
              Runtime rt = Runtime.getRuntime();
              String evmDir = "/home/ubuntu/go/src/github.com/ethereum/go-ethereum/build/bin/evm";
              String command = evmDir + " --debug --code " + value.toString() + " run";
              Process proc = Runtime.getRuntime().exec(command);

              LOG.warn(command);

              BufferedReader stdInput = new BufferedReader(new InputStreamReader(proc.getInputStream()));
              BufferedReader stdError = new BufferedReader(new InputStreamReader(proc.getErrorStream()));

              // define and initialise representation to hold 'evm' command output
              ArrayList<String> consoleOutput = new ArrayList<String>();
              ArrayList<String> debugOutput   = new ArrayList<String>();

              String s = null;
              while ((s = stdInput.readLine()) != null) {
                  consoleOutput.add(s);
              }
              while ((s = stdError.readLine()) != null) {
                  debugOutput.add(s);
              }

              for (String p : consoleOutput) {
                  Pattern pattern = Pattern.compile("([A-Za-z]+)([ \t]+)(\\d+)");
                  Matcher matcher = pattern.matcher(p);
                  while (matcher.find()) {
                      String opcodeName = matcher.group(1);
                      Integer executionStepTime = Integer.valueOf(matcher.group(3));
                      // add component pieces of line to output data structure
                      pairList.add(new AbstractMap.SimpleEntry<>(opcodeName, executionStepTime));
                  }
              }

          } catch (IOException e) {
              //LOG.warn(e);
              LOG.warn("Exception Encountered! " + e);
          }

          // log output for debugging
          for (Map.Entry<String, Integer> entry : pairList) {
              String opcodeRep = entry.getKey().toString();
              Integer stepTime = entry.getValue();
              LOG.warn("Code: " + opcodeRep + ", Time: " + stepTime);
          }

          word.set(itr.nextToken());
          context.write(word, one);
      }
    }
  }

Maybe this additional information from the logs is pertinent to correctly diagnosing the issue:

"VM Thread" os_prio=0 tid=0x00007f0a1007a000 nid=0x7328 runnable 

"GC task thread#0 (ParallelGC)" os_prio=0 tid=0x00007f0a10026000 nid=0x7326 runnable 

"GC task thread#1 (ParallelGC)" os_prio=0 tid=0x00007f0a10027800 nid=0x7327 runnable 

"VM Periodic Task Thread" os_prio=0 tid=0x00007f0a100bc800 nid=0x732f waiting on condition 

JNI global references: 277

Heap
 PSYoungGen      total 100352K, used 7502K [0x0000000780000000, 0x0000000788a00000, 0x00000007c0000000)
  eden space 94720K, 2% used [0x0000000780000000,0x0000000780216d78,0x0000000785c80000)
  from space 5632K, 95% used [0x0000000788480000,0x00000007889bcbc8,0x0000000788a00000)
  to   space 8192K, 0% used [0x0000000787a00000,0x0000000787a00000,0x0000000788200000)
 ParOldGen       total 143360K, used 113628K [0x0000000700000000, 0x0000000708c00000, 0x0000000780000000)
  object space 143360K, 79% used [0x0000000700000000,0x0000000706ef71d0,0x0000000708c00000)
 Metaspace       used 25981K, capacity 26224K, committed 26496K, reserved 1073152K
  class space    used 3019K, capacity 3083K, committed 3200K, reserved 1048576K

1 Answer


To understand exactly where the mapper is stuck, jstack can be used to get a thread dump.

jstack ships with the JDK and you can use it on the stuck mapper process as follows.

Step 0: Find the host on which your map task was running and make a note of the task_id.

Step 1: Log in to that node and run

ps aux | grep task_id

Identify the process ID and the username of the process whose command line starts with /usr/java/jdk/bin/java.

Step 2: su to that process owner's username.

Step 3: Export JAVA_HOME and add its bin directory to PATH [example: export JAVA_HOME=/usr/java/jdk1.7.0_67 && export PATH=$JAVA_HOME/bin:$PATH].

Step 4: Run the following, replacing PID with the process ID you obtained in Step 1:

export PID=PID
for i in $(seq 1 10); do
    echo "Jstack Iteration $i"
    jstack $PID > /tmp/hungtask-hostname-${PID}.jstack.$i
    sleep 5s
done
tar zcvf hungtask.tar.gz /tmp/hungtask-hostname-${PID}.jstack.*

The resulting hungtask.tar.gz will contain ten thread dumps of the process, taken at 5-second intervals. You may need to run the script at the point when the task goes into the hung state.

After this, if you can upload hungtask.tar.gz to this thread, I can take a look and share my observations.

Also, to understand whether the process is undergoing frequent GC, you can try the command below:

jstat -gc -t PID STEP

where PID is the process ID of the Java process to monitor and STEP is the sampling interval.

You can paste the output into http://nix-on.blogspot.in/2015/01/java-jstat-how-to-visualize-garbage.html to see whether the process is undergoing excessive GC.

  • so, I'm pretty confused about how to do this... on the slave node that was executing the job I tried to find the `pid` first with `jps`, and I got the following output: `6432 DataNode 7986 Jps 7826 YarnChild 7876 YarnChild 6573 NodeManager`, then I tried to execute your command on both of those `pid`'s, in this way `export PID=7826 for i in $(seq 1 10); do echo "Jstack Iteration $i"; jstack $PID > /tmp/hungtask-hostname-${PID}.jstack.$i; sleep 5s; done tar zcvf hungtask.tar.gz /tmp/hungtask-hostname-${PID}.jstack.*`, but I got the error `-bash: syntax error near unexpected token `do'` – smatthewenglish Oct 12 '17 at 19:38
  • man, what do you think of this one https://stackoverflow.com/questions/46721969/g%C3%B6del-escher-bach-the-epiphenomenological-hadoop-job-fail-even-in-uber-mod – smatthewenglish Oct 13 '17 at 13:21