I'm running Hadoop in pseudodistributed mode for testing on my local machine. I'd like to monitor my mappers' and reducers' memory and CPU usage in JVisualVM. However, in JVisualVM's list of local applications, I only see org.apache.hadoop.util.RunJar.
- Are the mappers and reducers running as separate processes? (In top, it looks like they are: two processes named "java" are using 100% CPU while my two mappers run.) If they are separate processes, why doesn't JVisualVM list them as applications that I can monitor?
- Are the mappers and reducers contained within the org.apache.hadoop.util.RunJar process? If so, (a) why do I only see Tool and ToolRunner in the JVisualVM Sampler, not any mapper/reducer code, and (b) why does JVisualVM report nearly 0% CPU when top reports 100%?
Is there some way I can modify my mappers/reducers so that JVisualVM can see them, at least while debugging in pseudodistributed mode?
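To narrow down the first question, here is a minimal diagnostic sketch I could call from a mapper's setup method. It only uses the standard java.lang.management API. The class name JvmIdentity and the helper describeJvm are hypothetical names made up for this illustration, and the "pid@hostname" shape of the runtime name is typical of HotSpot rather than guaranteed. It also reports the user the JVM runs as, since that may differ from my login user.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper (not part of Hadoop): reports which JVM the calling code runs in.
class JvmIdentity {

    // Builds a one-line description of the current JVM.
    // RuntimeMXBean.getName() is implementation-dependent; on HotSpot it is
    // usually "pid@hostname", which can be compared against the PIDs in top.
    public static String describeJvm() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        return "jvm=" + runtime.getName()
                + " user=" + System.getProperty("user.name");
    }

    public static void main(String[] args) {
        // Inside a task this line would end up in the task attempt's stderr log.
        System.err.println(describeJvm());
    }
}
```

If the value logged from a mapper differs from the one logged in the driver's main method, the tasks are running in separate JVMs, and the logged user shows who owns those processes.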
For completeness, I should say that I'm running Hadoop 0.20 from Cloudera. (It was installed on Ubuntu using apt-get install hadoop-0.20-conf-pseudo from the http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh precise-cdh4 contrib repository. Even though Cloudera puts 2.x in the version number, it's not YARN, it's the original Hadoop.)
% hadoop version
Hadoop 2.0.0-cdh4.4.0
Subversion file:///var/lib/jenkins/workspace/generic-package-ubuntu64-12-04/CDH4.4.0-Packaging-Hadoop-2013-09-03_18-48-35/hadoop-2.0.0+1475-1.cdh4.4.0.p0.23~precise/src/hadoop-common-project/hadoop-common -r c0eba6cd38c984557e96a16ccd7356b7de835e79
Compiled by jenkins on Tue Sep 3 19:33:54 PDT 2013
From source with checksum ac7e170aa709b3ace13dc5f775487180
This command was run using /usr/lib/hadoop/hadoop-common-2.0.0-cdh4.4.0.jar