1

For some reason, collectd works fine on all my virtual servers, but on the dom0 of my Xen setup it keeps returning NaN. I have checked the RRD files, and it seems that all plugins are returning NaN values.

Since NaN means "Not a Number", I wonder why it's not getting numbers from its plugins. Any idea what might cause this? I can only think it's Xen-related, as it only happens on the host and not on the VMs.

Edit:

Worth noting: I have also tested making collectd write to local RRD files instead of sending to a central server, and those RRD files also contained only NaN values.
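
For anyone wanting to repeat that check, here is a minimal sketch of how the NaN rows could be counted. It assumes the text comes from `rrdtool fetch <file>.rrd AVERAGE`, where each data row looks like `timestamp: value value ...`; the function name `count_nan_rows` is hypothetical and not part of collectd or RRDtool.

```python
import math


def count_nan_rows(fetch_output: str) -> tuple[int, int]:
    """Return (rows_that_are_all_nan, total_data_rows) for `rrdtool fetch` text."""
    nan_rows = 0
    total_rows = 0
    for line in fetch_output.splitlines():
        # Data rows start with a numeric timestamp followed by a colon;
        # the header line with the data source names and blank lines do not.
        timestamp, sep, rest = line.partition(":")
        if not sep or not timestamp.strip().isdigit():
            continue
        values = rest.split()
        if not values:
            continue
        total_rows += 1
        # float() parses both "nan" and "-nan", which rrdtool may print.
        if all(math.isnan(float(v)) for v in values):
            nan_rows += 1
    return nan_rows, total_rows
```

If every data row comes back as NaN, as it did here, the problem is not limited to a single plugin or data source.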

Not Available
  • Have you run snmpwalk on the host/OIDs to verify that whatever you're trying to get info from exists at that OID? – Kendall Jun 02 '11 at 22:34
  • What plugins are you using? Are you sending the data from collectd on dom0 to the same collectd server that the VMs are sending the data to? – sciurus Jun 02 '11 at 23:12
  • @sciurus Yes, it's sending the data to the same host as all the VMs. As for plugins, it just uses some of the default ones: cpu, disk, interface, load, etc. – Not Available Jun 03 '11 at 01:16
  • NaN is the result of a division by zero. Not sure that helps, but FYI :) – Halfgaar Jun 03 '11 at 10:30
  • @Halfgaar: That is true, but not the only cause of NaN. For example, if the plugin is requesting an OID that does not exist on that host, you'll get something along the lines of "No such service(?) exists at this OID". I don't remember the specific message. But, that will also give you a NaN value. – Kendall Jun 03 '11 at 17:01
  • Ehm, I'm not entirely sure how snmpwalk is of use here, as collectd (as far as I know) doesn't use SNMP. – Not Available Jun 04 '11 at 00:41
  • [Have you enabled the logs as described here?](http://collectd.org/faq.shtml) What do you get? Further question: Is this an AMD-multi-CPU-server? – Nils Jun 05 '11 at 20:37
  • Logging is enabled, and the debug log only shows that initialization is complete. And yes, it's a dual-core AMD CPU. – Not Available Jun 08 '11 at 18:55
  • Do you have the same version of collectd running on the dom0 host, and probably more importantly, do you have the same "types" files on the domU and dom0 hosts? Briefly try turning on a "CSV" logger as well as the RRD logger and look at the output of the dom0's csv files. Are you just using the collection.cgi thing for looking at the RRDs? – chris Jun 10 '11 at 20:53
  • @chris It also happens when not using a remote server to send the data to, and I looked at the RRD files using RRDtool. – Not Available Jun 11 '11 at 10:28

2 Answers

0

Check the time on the dom0 host.

Chances are the clocks on your collectd collector and the dom0 system are out of sync.
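
As a rough illustration of why clock skew can matter, here is a minimal sketch of the RRD timestamp rules as I understand them: an update must be newer than the file's last update, and a gap longer than the data source's heartbeat is stored as unknown (NaN). The function name `classify_sample` is hypothetical, and the heartbeat value is an assumption you would need to read from your own RRD files.

```python
def classify_sample(last_update: int, sample_time: int, heartbeat: int) -> str:
    """Classify how an RRD file would likely treat a sample (all times in epoch seconds)."""
    # RRDtool refuses updates that are not strictly newer than the last one.
    if sample_time <= last_update:
        return "rejected"
    # A gap longer than the heartbeat makes the interval unknown (NaN).
    if sample_time - last_update > heartbeat:
        return "unknown"
    return "ok"
```

The last update time of a file can be read with `rrdtool last <file>.rrd` and compared against `date +%s` on the dom0.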

chris
  • It also happens when using local RRD files instead of a remote server. I have updated my question with this info; I should've noted it. – Not Available Jun 11 '11 at 10:29
0

Okay, so the issue has apparently disappeared and the statistics are now being reported correctly. I have recently done two things (if I recall correctly):

  • Updated the kernel to a newer version
  • Updated collectd

I think a kernel bug might have been the culprit. Anyway, it's working again now. Thank you all for your time.

Not Available