
I am trying to capture the disk I/O and network I/O of Hadoop tasks (mappers and reducers), namely instantaneous bandwidth, accumulated traffic, source address, and destination address. I found two popular monitoring tools for Hadoop: Ganglia (usually combined with Nagios) and X-Trace. Ganglia was introduced in 2004 by UC Berkeley, and X-Trace was developed in 2007, also at UC Berkeley.

Any suggestions on the pros and cons of these two tools would be appreciated.

user1687035
  • Why did anybody rate my question as not useful without explaining anything? One can answer and still tell me it's a bad question. – user1687035 Jan 08 '13 at 02:27

1 Answer


I'd start with Ganglia or Munin; they will tell you about the resource utilization on the different machines in your cluster.
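To give an idea of what Ganglia exposes, here is a minimal sketch that pulls the network counters out of a gmond XML dump. It assumes gmond is running with its default TCP accept channel on port 8649 and that the stock `bytes_in` / `bytes_out` metrics are enabled; the function names and the sample XML are made up for illustration. Note that these figures are per machine, not per Hadoop task, so they will not give you source and destination addresses on their own.

```python
import socket
import xml.etree.ElementTree as ET


def read_gmond_xml(host, port=8649, timeout=5.0):
    """Read the XML dump gmond writes to a TCP client before closing.

    Assumes the default tcp_accept_channel port (8649); adjust to your gmond.conf.
    Returns raw bytes so the XML parser can honor the declared encoding.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def network_metrics(xml_data, names=("bytes_in", "bytes_out")):
    """Return {host_name: {metric_name: float_value}} for the chosen metrics."""
    root = ET.fromstring(xml_data)
    result = {}
    for host in root.iter("HOST"):
        values = {}
        for metric in host.iter("METRIC"):
            if metric.get("NAME") in names:
                values[metric.get("NAME")] = float(metric.get("VAL"))
        result[host.get("NAME")] = values
    return result


# Hypothetical sample in the shape of a gmond dump (host and values are invented).
SAMPLE = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmond">
<CLUSTER NAME="hadoop" LOCALTIME="0" OWNER="unspecified" LATLONG="unspecified" URL="unspecified">
<HOST NAME="worker1" IP="10.0.0.1" REPORTED="0">
<METRIC NAME="bytes_in" VAL="1024.5" TYPE="float" UNITS="bytes/sec"/>
<METRIC NAME="bytes_out" VAL="2048.0" TYPE="float" UNITS="bytes/sec"/>
<METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=" "/>
</HOST>
</CLUSTER>
</GANGLIA_XML>"""

if __name__ == "__main__":
    print(network_metrics(SAMPLE))
```

Against a live cluster you would call `network_metrics(read_gmond_xml("your-gmond-host"))` instead of using the sample string.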

X-Trace is a fairly academic project that generates data about distributed transactions, latency and bottlenecks, and the flow of control in distributed systems. Unfortunately, it is not well supported at the moment.

dkuebric
  • Thanks. You're right, X-Trace is not well supported right now. I'm using Ganglia because Hadoop has built-in support for it and it outputs lots of metrics. – user1687035 Jan 12 '13 at 00:09