I use Ganglia to monitor performance-related metrics of cluster nodes.
I installed the gmond Python modules for richer functionality.
However, some metrics are missing from some nodes (e.g. disk_*_read_bytes_per_sec).
A few nodes work as expected and report all the metrics, but others are missing disk_*_read_bytes_per_sec, disk_*_write_bytes_per_sec, or both.
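To list exactly which nodes are affected, something like the following sketch could be used. It assumes gmond serves its XML dump on the default tcp_accept_channel port 8649 and that metrics appear as METRIC elements nested under HOST elements. The function names and the suffix matching are only an illustration, not part of Ganglia.

```python
import socket
import xml.etree.ElementTree as ET

# Suffixes of the metrics in question; the full names depend on the device.
SUFFIXES = ("read_bytes_per_sec", "write_bytes_per_sec")


def fetch_gmond_xml(host="localhost", port=8649, timeout=5.0):
    """Read the XML dump gmond sends on connect (port 8649 is an assumption)."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(65536)
            if not data:  # gmond closes the connection after the dump
                break
            chunks.append(data)
    return b"".join(chunks)


def find_missing(xml_data, suffixes=SUFFIXES):
    """Return {host_name: [missing suffixes]} for hosts lacking any suffix."""
    root = ET.fromstring(xml_data)
    missing = {}
    for host in root.iter("HOST"):
        names = [m.get("NAME", "") for m in host.iter("METRIC")]
        absent = [s for s in suffixes if not any(n.endswith(s) for n in names)]
        if absent:
            missing[host.get("NAME")] = absent
    return missing


if __name__ == "__main__":
    for node, absent in sorted(find_missing(fetch_gmond_xml()).items()):
        print(node, "is missing:", ", ".join(absent))
```

Running this before and after a gmond restart would show whether the set of affected nodes actually changes.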
If I restart the gmond daemon, some nodes report the metrics correctly again, while others are again missing them.
I checked the /etc/ganglia/gmond.conf and /etc/ganglia/conf.d/* configuration files. All the compute nodes in the cluster have exactly the same configuration settings. How can they behave so differently? Where should I look first to resolve the problem?
Thanks