I want to use HPROF to profile my Hadoop job. The problem is that I get TRACES
but there is no CPU SAMPLES
in the profile.out
file. The code that I am using inside my run method is:
/** Get configuration */
Configuration conf = getConf();
conf.set("textinputformat.record.delimiter","\n\n");
conf.setStrings("args", args);
/** JVM PROFILING */
conf.setBoolean("mapreduce.task.profile", true);
conf.set("mapreduce.task.profile.params", "-agentlib:hprof=cpu=samples," +
"heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");
conf.set("mapreduce.task.profile.maps", "0-2");
conf.set("mapreduce.task.profile.reduces", "");
/** Job configuration */
Job job = new Job(conf, "HadoopSearch");
job.setJarByClass(Search.class);
job.setOutputKeyClass(Text.class);
job.setOutputValueClass(NullWritable.class);
/** Set Mapper and Reducer, use identity reducer*/
job.setMapperClass(Map.class);
job.setReducerClass(Reducer.class);
/** Set input and output formats */
job.setInputFormatClass(TextInputFormat.class);
job.setOutputFormatClass(TextOutputFormat.class);
/** Set input and output path */
FileInputFormat.addInputPath(job, new Path("/user/niko/16M"));
FileOutputFormat.setOutputPath(job, new Path(cmd.getOptionValue("output")));
job.waitForCompletion(true);
return 0;
How do I get the CPU SAMPLES
to be written in the output?
I also have s trange error message on the stderr
but I think it is not related, since it is present also when the profiling is set to false or the code for enabling profiling is commented out. The error is
log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.impl.MetricsSystemImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.