2

I know I can control the max memory for a map (or reduce) task by setting JVM parameters. But I am wondering if there is a way to see current memory usage of a task?

kee
  • 10,969
  • 24
  • 107
  • 168

1 Answers1

6

enable remote HPROF profiling. HPROF is a profiling tool that comes with the JDK that, although basic, can give valuable information about a program’s CPU and heap usage. To use it, you can try this in your code:

conf.setBoolean("mapred.task.profile", true);
conf.set("mapred.task.profile.params", "-agentlib:hprof=cpu=samples," +
    "heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s");
conf.set("mapred.task.profile.maps", "0-2");
conf.set("mapred.task.profile.reduces", ""); // no reduces

See "Hadoop The Definitve Guide", Chapter 5 -> "Tuning a Job" -> "Profiling Tasks" for more details.

Tejas Patil
  • 6,149
  • 1
  • 23
  • 38