I am following this tutorial: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster
I want to use YourKit to see which method or Java process uses all the CPU as the WordCount mapper goes through the corpus. As I understand it, Hadoop's `Text` type stores data as UTF-8 while a Java `String` is UTF-16, so I am trying to confirm that the WordCount mapper spends most of its time converting from UTF-8 to UTF-16.
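For context, here is a minimal sketch (plain JDK, no Hadoop dependency) of the conversion I mean: `Text.toString()` essentially decodes the stored UTF-8 bytes into a UTF-16 `String`, similar to this:

```java
import java.nio.charset.StandardCharsets;

public class Utf8DecodeDemo {
    public static void main(String[] args) {
        // Hadoop's Text keeps its payload as raw UTF-8 bytes.
        byte[] utf8 = "hello wordcount".getBytes(StandardCharsets.UTF_8);

        // Text.toString() performs a decode like this one, producing a
        // UTF-16 Java String -- the conversion I want to see in the profiler.
        String line = new String(utf8, StandardCharsets.UTF_8);

        System.out.println(line);
    }
}
```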
However, YourKit doesn't show much detail about this process. It only shows CPU time as follows:

org.apache.hadoop.mapred.Child.main(String[]) ~ 96%
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run() ~ 4%
I am not very familiar with YourKit. Could someone please point out how I should approach this?