I need to compare two Hadoop scheduling algorithms by job execution time. What could I use to get the duration of the execution for all tasks?

Roman Nikitchenko
MihaelaO

2 Answers

You can see detailed information about the jobs and their tasks in the JobTracker web UI at this address:

http://hostnameofmachinerunningtheJobTracker:50030/jobtracker.jsp

You can find other information from the link.

Alper
  • It should actually be the hostname of the machine running the JobTracker service. If that is not the same machine as the NameNode, you have to keep that in mind. Also, there is no such thing as a hostname of HDFS: HDFS is spread across all the machines, so there is either a NameNode hostname or a DataNode hostname. – Tariq Jun 05 '13 at 16:44
  • I am running the jobs on a remote grid from a terminal, and I don't have access to an interface to view this information. Does Hadoop also store this information in a log file? – MihaelaO Jun 06 '13 at 11:52

The JobTracker web UI gives you very useful reports that let you compare everything, down to the available logs for every mapper and reducer.

Also look at the mrbench class inside the hadoop-test.jar archive. There is plenty of information on the net about using it for Hadoop cluster benchmarking, like this article.
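
If the web UI is not reachable (for example, when working only from a terminal), the same timings can usually be recovered from the job history. In Hadoop 1.x, `hadoop job -history <jobOutputDir>` prints a summary, and the raw history file is typically kept under `_logs/history` in the job output directory. The sketch below is only an illustration: it assumes the Hadoop 1.x history format, where each record is a line of `KEY="value"` pairs with `START_TIME` and `FINISH_TIME` in milliseconds. The function name `durations` and its parameters are hypothetical, and the record layout should be checked against your own history files.

```python
import re

# Matches KEY="value" pairs; assumes the Hadoop 1.x job history layout
# (an assumption, verify against a real history file from your cluster).
PAIR = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def durations(lines, record="Task", id_key="TASKID", start_key="START_TIME"):
    """Return {record_id: duration_ms} for records that have both a start
    and a finish timestamp. Hypothetical helper, not part of Hadoop."""
    start, finish = {}, {}
    for line in lines:
        # Only look at the requested record type, e.g. "Task ..." or "Job ...".
        if not line.startswith(record + " "):
            continue
        fields = dict(PAIR.findall(line))
        rec_id = fields.get(id_key)
        if rec_id is None:
            continue
        if start_key in fields:
            start[rec_id] = int(fields[start_key])
        if "FINISH_TIME" in fields:
            finish[rec_id] = int(fields["FINISH_TIME"])
    # Keep only records for which both timestamps were seen.
    return {r: finish[r] - start[r] for r in start if r in finish}
```

To use it, read a history file copied from the cluster (the path is yours to supply) and call `durations(open(path))` for per-task times, or `durations(open(path), "Job", "JOBID", "SUBMIT_TIME")` for the whole job. Running both schedulers on the same input and comparing the resulting numbers gives the execution-time comparison asked about in the question.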

Roman Nikitchenko