I work for a company that receives data from smart meters. Data in the live stream can be up to 2 days old, and may be back-filled later to correct errors (gaps etc.). We currently store it for 5 years, typically. The data is then pulled into an SSAS cube and aggregated into 1-minute, 5-minute, 30-minute, 1-hour, 1-day, 1-week and 1-month aggregations. For each of these aggregations the Min, Max and Avg are also stored. Building this cube is slow and does not currently scale, since it pulls its data from a single source.

I think that an RRD-style database per data point, driven by the data push, would be a better fit. However, I have a couple of questions about RRD:

  1. Can RRD retain data granularity whilst also performing roll up over time?
  2. Can data be fed into RRD to correct gaps?

Examples would be welcome.

Nakilon
Mark

1 Answer

  1. Yes - you need to configure your RRAs appropriately.

An RRA is a round-robin archive; it defines the number of data points kept and their resolution. So, assuming a 5-minute sample rate, you could define:

RRA:AVERAGE:0.5:1:2000
RRA:AVERAGE:0.5:12:2400

These will hold about a week at 5m resolution and 100 days at 1h resolution. You could quite easily extend the 5m RRA, although it will make your RRD bigger. The question is: do you actually need to? The whole point of RRDs is automatic archiving matched to graphing resolution. If you're looking at a year's worth of stats, you can't render 5m resolution anyway: with 5m samples, a 1600px-wide graph covers only about 6 days.
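The retention figures follow from rows × steps per row × base step. Here is a minimal Python sketch of that arithmetic, assuming the 5-minute (300 s) step mentioned above; `rra_coverage` is a hypothetical helper name, not part of rrdtool:

```python
from datetime import timedelta

STEP_SECONDS = 300  # assumption: 5-minute base step, as in the answer


def rra_coverage(rra: str, step_seconds: int = STEP_SECONDS):
    """Return (cf, resolution, retention) for an RRA definition string.

    Expects the classic form RRA:CF:xff:steps:rows.
    """
    _, cf, _xff, steps, rows = rra.split(":")
    resolution = timedelta(seconds=step_seconds * int(steps))
    retention = resolution * int(rows)
    return cf, resolution, retention


for definition in ("RRA:AVERAGE:0.5:1:2000", "RRA:AVERAGE:0.5:12:2400"):
    cf, resolution, retention = rra_coverage(definition)
    print(f"{cf}: one row per {resolution}, kept for {retention}")
```

Raising the `rows` value of the first RRA is all it takes to keep 5m data for longer; the cost is file size.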

  2. Yes, but because of the way RRD works, it's somewhat annoying. Effectively you have to extract and replay the data to backfill the gaps. This doesn't work well if you're replaying data from archives that have already lost resolution, because you won't have enough samples. You can use rrdtool dump to extract the contents of the RRD as XML, edit that directly, and then rrdtool restore it. If you need to do this with any real frequency, I'd suggest using something other than rrdtool.
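The replay approach could look roughly like the Python sketch below. It assumes the raw samples are still available outside the RRD, and that the merged series is fed into a freshly created RRD, since `rrdtool update` only accepts timestamps newer than the last update. The helper names and sample values are made up for illustration:

```python
def merge_for_replay(existing, late):
    """Merge stored and late-arriving (timestamp, value) samples.

    Late samples win when both sources carry the same timestamp.
    Returns the samples sorted by increasing timestamp.
    """
    merged = dict(existing)
    merged.update(late)
    return sorted(merged.items())


def update_args(samples):
    """Format samples as `timestamp:value` arguments for `rrdtool update`."""
    return [f"{ts}:{value}" for ts, value in samples]


existing = [(1440752400, 1.2), (1440753000, 1.5)]  # gap at 1440752700
late = [(1440752700, 1.3)]                         # back-filled reading
print(update_args(merge_for_replay(existing, late)))
```

Each resulting string would be passed, in order, to `rrdtool update <file>` against the new RRD.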
Sobrique
  • Unfortunately, I have asked many times whether this is a real requirement, and I get told that customers wish to compare this week with the same week (aligned Mon-Fri) -1y, -2y, -3y etc. I'm thinking RRD style rather than RRDtool itself. – Mark Aug 28 '15 at 09:04
  • RRDtool will do it, but you're losing a lot of the archiving benefit. There's no way to get around the fact that if you're wanting high res for a long timeframe, you need to store a lot of data points. – Sobrique Aug 28 '15 at 09:08
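To put a rough number on "a lot of data points", here is a back-of-the-envelope Python sketch. It assumes 8 bytes per stored value, one data source per RRD, three full-resolution archives (MIN, MAX, AVERAGE) and 365-day years, and it ignores file header overhead; `rows_needed` is a hypothetical helper:

```python
MINUTES_PER_DAY = 24 * 60
BYTES_PER_VALUE = 8  # assumption: one double-precision float per stored value


def rows_needed(years: int, resolution_minutes: int) -> int:
    """Rows required to keep `years` of data at the given resolution."""
    return years * 365 * MINUTES_PER_DAY // resolution_minutes


rows = rows_needed(years=5, resolution_minutes=1)
size_mb = rows * BYTES_PER_VALUE * 3 / 1e6  # three archives: MIN, MAX, AVERAGE
print(rows, round(size_mb, 1))
```

Under those assumptions, 5 years at 1-minute resolution is about 2.6 million rows per archive, or roughly 63 MB per data point.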