I want to use CollectD to gather some statistics (about storage) and have Graphite display them nicely. Apparently this can be done either by
- having CollectD store the data as RRD files and pointing Graphite at those, or
- using a CollectD plugin to push the data to Graphite's Carbon API, which will store the data in a Whisper database (which is similar to RRD but not compatible).
I think I want to go with RRDs, but I found this statement in the Whisper docs that concerns me:
In many cases (depending on configuration) if an update is made to an RRD series but is not followed up by another update soon, the original update will be lost.
Hmmm. That's a bit scary, but the accusation is so vague that I don't know what to make of it. What is the configuration they are talking about, and the situation in which it causes data loss?
My situation is that the metrics data I am gathering will be available in chunks -- periodically I will go get the latest data and make as many entries into the database as there are new samples available. So, for example, I might grab some data and update the database with the values from 3 minutes ago, 2 minutes ago, and 1 minute ago, one right after the other. In fact, I might have dozens of new samples to put in the database at once. Does using RRD this way have anything to do with the Whisper accusation?
NOTE: I do not need to back-fill data; I will always be adding newer data than what has already been stored.