
I'm trying to graph data using statsd and graphite. I have a simple counter that I increment by 1, but when I graph the counter's values over a day, I see strange values like 0.09 as the peak in my graph (see https://i.stack.imgur.com/o4gmz.png)

This graph should be showing 2 logins, but instead it's showing 0.09. If I change the time scale from 1 day to the last 15 minutes, then it correctly shows the two logins (see https://i.stack.imgur.com/23vDJ.png)

I've set up my finest retention to be in 10s increments in storage-schemas.conf:

retentions = 10s:7d,1m:21d,24h:5y

I've set up my storage-aggregation.conf file to sum counts:

[sum]
pattern = \.count$
xFilesFactor = 0
aggregationMethod = sum

(And, before you ask, yes; this is a .count).

If I try my URL with &rawData=true then in either case I see some Nones, some 0.0s, and a pair of 1.0s separated by some 0.0s. I never see these fractional values that somehow show up on the graph. So... Is this a bug? Am I doing something wrong?

gturri
Jason Walton
  • Aha! There's a bug open for this: https://bugs.launchpad.net/graphite/+bug/850475 It seems that Graphite will aggregate stats together when there would be more data points than there are pixels in the width of your graph. You can (sort of) fix this with the summarize function: &target=summarize(counter.login, "5 min", "sum") – Jason Walton Oct 22 '12 at 20:30
  • Jason, what does whisper-fetch say for the same metric? Log in to the graphite host and run: whisper-fetch --pretty yourfile.wsp – Valor Oct 23 '12 at 19:01
  • Has anybody found a solution to this issue with graphite? I am having the same problem. – duckhunt Sep 10 '14 at 12:51
  • @JasonWalton, I think you can post your own answer below and mark it as valid. :) Summarize is already out there. – dukebody Nov 10 '14 at 09:14

1 Answer


There's also the consolidateBy function, which tells Graphite what to do when there aren't enough pixels to draw every datapoint accurately. By default it uses the "average" function, which is why you get strange results over longer time ranges. Here's an excerpt from the documentation:

When a graph is drawn where width of the graph size in pixels is smaller than the number of datapoints to be graphed, Graphite consolidates the values to prevent line overlap. The consolidateBy() function changes the consolidation function from the default of ‘average’ to one of ‘sum’, ‘max’, or ‘min’. This is especially useful in sales graphs, where fractional values make no sense and a ‘sum’ of consolidated values is appropriate.
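To make the effect concrete, here is a minimal sketch of what that consolidation does to a sparse counter. It is a simplified model, not Graphite's actual code: the `consolidate` helper and the `points_per_pixel = 11` ratio are assumptions for illustration (the real ratio depends on the graph width and time range), picked so the averaged peak lands near the 0.09 from the question.

```python
def consolidate(values, points_per_pixel, func):
    """Group consecutive datapoints into buckets and reduce each bucket with func."""
    out = []
    for i in range(0, len(values), points_per_pixel):
        bucket = values[i:i + points_per_pixel]
        known = [v for v in bucket if v is not None]  # skip missing datapoints
        out.append(func(known) if known else None)
    return out


def average(xs):
    return sum(xs) / len(xs)


# One day of 10-second datapoints: 8640 values, two of which are logins.
series = [0.0] * 8640
series[100] = 1.0
series[4000] = 1.0

points_per_pixel = 11  # hypothetical ratio; depends on graph width

averaged = consolidate(series, points_per_pixel, average)
summed = consolidate(series, points_per_pixel, sum)

print(max(averaged))  # a single login averaged over 11 datapoints
print(max(summed))    # the same bucket, summed instead
print(sum(summed))    # total logins survive when consolidating with sum
```

With averaging, each login is diluted by the zeros that share its bucket, so the peak becomes a fraction; with sum, the two logins are still there.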

Another function that could be useful is hitcount. Here's a short excerpt explaining why it's useful:

This function is like summarize(), except that it compensates automatically for different time scales (so that a similar graph results from using either fine-grained or coarse-grained records) and handles rarely-occurring events gracefully.
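For reference, here is a rough sketch of how the two functions could be wired into a render URL. The host `graphite.example.com` and the `render_url` helper are hypothetical, and `counter.login` is just the metric name used in the comments above; adjust the target and the interval to your own setup.

```python
from urllib.parse import urlencode

BASE = "http://graphite.example.com/render"  # hypothetical Graphite host


def render_url(target, frm="-1d"):
    """Build a render URL for one target over the given relative time range."""
    return BASE + "?" + urlencode({"target": target, "from": frm})


# Consolidate with sum instead of the default average when pixels run out.
consolidated = render_url('consolidateBy(counter.login, "sum")')

# Count hits per hour regardless of the underlying storage resolution.
hourly = render_url('hitcount(counter.login, "1hour")')

print(consolidated)
print(hourly)
```

urlencode takes care of escaping the parentheses, commas and quotes in the target expression.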

I spent some time scratching my head over why I got fractions for my counter with time ranges longer than a couple of hours when my aggregation rule is max. It's pretty confusing, especially at the beginning, when you play with single counters to see if everything works. Checking rawData is quite a good sanity check when debugging ;)
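If you want to script that sanity check, something like the sketch below could work. It assumes the raw output is one line per series in the form `name,start,end,step|v1,v2,...` with `None` for missing points; the `parse_raw_line` helper and the sample line (metric name, timestamps, values) are made up for illustration, not real output.

```python
def parse_raw_line(line):
    """Split one raw-format line into its header fields and datapoints."""
    header, _, body = line.strip().rpartition("|")
    # The series name may itself contain commas, so split from the right.
    name, start, end, step = header.rsplit(",", 3)
    values = [None if v == "None" else float(v) for v in body.split(",")]
    return name, int(start), int(end), int(step), values


# Hypothetical sample line, not real output.
sample = "stats.counters.login.count,1350937800,1350937860,10|None,0.0,1.0,0.0,1.0,0.0"

name, start, end, step, values = parse_raw_line(sample)
total = sum(v for v in values if v is not None)

print(name, step, total)
```

If the total of the raw values matches what you expect but the graph shows fractions, the stored data is fine and the difference comes from consolidation at render time.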

slawek