I have one year of 15-minute interval data in KairosDB. I need to do the following in sequence:
- Filter the data using a tag.
- Group the filtered data by a few tags. I am not specifying tag values because I want the data grouped automatically by tag value at runtime.
- Once grouped on those tags, sum the 15-minute interval data into monthly totals.

I wrote this query, to be run from a Python script, based on information from the KairosDB Google Code forum, but the aggregated values seem incorrect and the output looks skewed. I want to understand where I am going wrong. Here is my JSON query:

agg_query = {
             "start_absolute": 1412136000000,
             "end_absolute": 1446264000000,
             "metrics":[
               {
                "tags": {
                    "insert_date": ["11/17/2015"]
                },
                "name": "gb_demo",
                "group_by": [
                   {
                       "name": "time",
                       "range_size": {
                            "value": "1",
                            "unit": "months"
                       },
                       "group_count": "12"
                   },
                   {
                       "name": "tag",
                       "tags": ["usage_kind","building_snapshot_id","usage_point_id","interval"]
                   }
                ],
                "aggregators": [
                    {
                        "name": "sum",
                        "sampling": {
                           "value": 1,
                           "unit": "months"
                        }
                    }
                 ]
                }
              ]
           }

For reference: Data is something like this: [[1441065600000,53488],[1441066500000,43400],[1441067400000,44936],[1441068300000,48736],[1441069200000,51472],[1441070100000,43904],[1441071000000,42368],[1441071900000,41400],[1441072800000,28936],[1441073700000,34896],[1441074600000,29216],[1441075500000,26040],[1441076400000,24224],[1441077300000,27296],[1441078200000,37288],[1441079100000,30184],[1441080000000,27824],[1441080900000,27960],[1441081800000,28056],[1441082700000,29264],[1441083600000,33272],[1441084500000,33312],[1441085400000,29360],[1441086300000,28400],[1441087200000,28168],[1441088100000,28944],[1443657600000,42112],[1443658500000,36712],[1443659400000,38440],[1443660300000,38824],[1443661200000,43440],[1443662100000,42632],[1443663000000,42984],[1443663900000,42952],[1443664800000,36112],[1443665700000,33680],[1443666600000,33376],[1443667500000,28616],[1443668400000,31688],[1443669300000,30872],[1443670200000,28200],[1443671100000,27792],[1443672000000,27464],[1443672900000,27240],[1443673800000,27760],[1443674700000,27232],[1443675600000,27824],[1443676500000,27264],[1443677400000,27328],[1443678300000,27576],[1443679200000,27136],[1443680100000,26856]]

This is a snapshot of some data from September and October 2015. When I run the query with a start timestamp in September, it sums the September data correctly, but not the October data.
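One way to check the server-side totals is to sum the raw points per calendar month on the client. The sketch below is only a cross-check, not part of the query: `monthly_sums` is a hypothetical helper, and the UTC-4 offset is an assumed example of a local zone, not something stated in the question. It shows that the same points can land in different months depending on the time zone used to define the month boundary.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def monthly_sums(points, tz=timezone.utc):
    """Sum [timestamp_ms, value] pairs per calendar month in the given time zone."""
    totals = defaultdict(int)
    for ts_ms, value in points:
        local = datetime.fromtimestamp(ts_ms / 1000.0, tz=tz)
        totals[(local.year, local.month)] += value
    return dict(totals)


# Four points taken from the sample data above.
sample = [
    [1441065600000, 53488],  # 2015-09-01 00:00 UTC
    [1441066500000, 43400],  # 2015-09-01 00:15 UTC
    [1443657600000, 42112],  # 2015-10-01 00:00 UTC
    [1443658500000, 36712],  # 2015-10-01 00:15 UTC
]

# Month boundaries in UTC.
print(monthly_sums(sample))

# Month boundaries in an assumed UTC-4 local zone: the same points fall
# into the previous month because they are before local midnight.
print(monthly_sums(sample, tz=timezone(timedelta(hours=-4))))
```

Comparing such client-side totals with the query result for one or two months should reveal whether the bucket boundaries or the time zone are off.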

Shilpi

1 Answer

I believe your group by time will create groups by calendar month (January to December), but your sum aggregator will sum values over a running month starting from your start date, which seems a bit weird. Could that be the cause of what you see?

What is the data like? What is the aggregated result like?

Loic
  • Yes, the results are a little bit weird. Aggregation is done correctly only for the first month; for the rest, the values are not accurate. Can you elaborate on what you meant by the sum aggregator summing by running month? Any suggestion on how I can fix this? I have 15-minute interval data for one year, and I need to aggregate it into monthly data. – Shilpi Dec 05 '15 at 07:47
  • The group by time feature will create clusters by calendar month. I don't advise doing that, because if you query more than one year of data you will sum up the same month from different years. Simply set up a monthly aggregator and use align_sampling=true in the options of your aggregator. – Loic Dec 08 '15 at 11:45
  • Thanks a lot Loic. It kind of worked. Different months are being summed up now. But the results are slightly off. – Shilpi Dec 11 '15 at 04:57
  • It worked. The second issue was due to timezone difference. Thanks a ton. – Shilpi Dec 11 '15 at 15:15
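Putting the comments together, the adjusted query might look like the sketch below: the time group-by is dropped, the tag group-by is kept, and the monthly sum aggregator sets `align_sampling`. The helper name `build_monthly_sum_query` is hypothetical. The top-level `time_zone` property is an assumption about the KairosDB version in use, and `"America/New_York"` is only a placeholder, since the thread does not say which zone was involved.

```python
import json


def build_monthly_sum_query(start_ms, end_ms, time_zone=None):
    """Build a KairosDB query dict that sums gb_demo into aligned monthly buckets."""
    metric = {
        "name": "gb_demo",
        "tags": {"insert_date": ["11/17/2015"]},
        # Group by tag only; the time group-by from the question is removed.
        "group_by": [
            {
                "name": "tag",
                "tags": ["usage_kind", "building_snapshot_id",
                         "usage_point_id", "interval"],
            }
        ],
        "aggregators": [
            {
                "name": "sum",
                # Align buckets to calendar month boundaries instead of
                # a running month that starts at the query start time.
                "align_sampling": True,
                "sampling": {"value": 1, "unit": "months"},
            }
        ],
    }
    query = {
        "start_absolute": start_ms,
        "end_absolute": end_ms,
        "metrics": [metric],
    }
    if time_zone is not None:
        # Assumption: the server supports a top-level time_zone property.
        query["time_zone"] = time_zone
    return query


agg_query = build_monthly_sum_query(
    1412136000000, 1446264000000, time_zone="America/New_York"  # placeholder zone
)
print(json.dumps(agg_query, indent=2))
```

If the server does not accept `time_zone`, an alternative is to shift `start_absolute` and `end_absolute` so they fall on month boundaries in the zone the data was recorded in.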