I'm using Ganglia 3.7.2 to monitor a Hadoop (2.6.0-cdh5.4.0) cluster of 7 servers, with metrics2 enabled on Hadoop and HBase. I installed gmetad on one server and gmond on the other servers with yum. At first the monitoring ran very well, and I could see normal monitoring data on the Ganglia web page. The problem is that after several hours there were so many RRD files that I had to make /var/lib/ganglia/rrds a symbolic link. After a couple of days the RRD files occupied almost 1 TB of disk space, and the web page could no longer display the monitoring data. Does anybody know how to fix this?
gmond config (using single channel):
globals {
  daemonize = yes
  setuid = yes
  user = ganglia
  debug_level = 0
  max_udp_msg_len = 1472
  mute = no
  deaf = no
  allow_extra_data = yes
  host_dmax = 86400 /*secs. Expires (removes from web interface) hosts in 1 day */
  host_tmax = 20 /*secs */
  cleanup_threshold = 300 /*secs */
  gexec = no
  send_metadata_interval = 60 /*secs */
}
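
To put numbers on the growth, something like the following could be run on the gmetad server. This is only a rough sketch and not part of my setup: the function name rrd_usage is made up for illustration, and it assumes the RRD files end in ".rrd" and live under /var/lib/ganglia/rrds (the path mentioned above). It counts the RRD files, sums their sizes, and reports which directories hold the most files.

```python
import os
import sys
from collections import Counter


def rrd_usage(root):
    """Return (file_count, total_bytes, per_dir_counts) for *.rrd files under root.

    rrd_usage is a hypothetical helper name, not part of Ganglia.
    """
    count = 0
    total = 0
    per_dir = Counter()
    # os.walk lists root even if root itself is a symlink, but it does not
    # descend into symlinked subdirectories unless followlinks=True.
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".rrd"):
                continue
            path = os.path.join(dirpath, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
            count += 1
            per_dir[os.path.relpath(dirpath, root)] += 1
    return count, total, per_dir


if __name__ == "__main__":
    # Assumed default path, taken from the question above.
    root = sys.argv[1] if len(sys.argv) > 1 else "/var/lib/ganglia/rrds"
    count, total, per_dir = rrd_usage(root)
    print(f"{count} RRD files, {total / 1024 ** 3:.2f} GiB under {root}")
    # Show the ten directories with the most RRD files.
    for directory, n in per_dir.most_common(10):
        print(f"{n:8d}  {directory}")
```

Running it a few hours apart should show whether the file count keeps climbing and which directories are responsible.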