1

I'm trying to learn the flux query language for InfluxDB. I'm using InfluxDB OSS 2.7.

I have a time-series with power usage from my power meter. It reports an ever increasing number in KWh, and I want to show how many Wh I have used per day, by using a custom function with aggregateWindow. Here is what I have tried:

myFunc = (tables=<-, column) => {
  a = tables
    |> first(column: column)
    |> findRecord(fn: (key) => true, idx: 0)

  b = tables
    |> last(column: column)
    |> findRecord(fn: (key) => true, idx: 0)

  d = b._value - a._value

  return tables
    |> first()
    |> map(fn: (r) => ({ r with _value: d}))
}

from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  |> aggregateWindow(every: 1d, fn: myFunc, createEmpty: false)
  |> yield(name: "Wh") 

But this returns a new table, where all _value have the same number (in my case 134).

I was hoping that the variables a and b would have the first and the last value of each window, and that d would represent the usage in each window - but this does not seem to be the case.

Allan
  • 4,562
  • 8
  • 38
  • 59

1 Answers1

0

If you want to find for each window the difference between the first and last values you can use the spread function. To be precise spread calculates the difference between the minimum and maximum values (not first and last) but in an always-increasing series the two are the same.

This though is not precise. If you consider the following data:

timestamp value
2023-04-12T00:00:00Z 100
2023-04-12T01:00:00Z 101
2023-04-12T02:00:00Z 102
2023-04-12T03:00:00Z 103
2023-04-12T04:00:00Z 104
2023-04-12T05:00:00Z 105
2023-04-12T06:00:00Z 106
2023-04-12T07:00:00Z 107
2023-04-12T08:00:00Z 108
2023-04-12T09:00:00Z 109
2023-04-12T10:00:00Z 110
2023-04-12T11:00:00Z 111
2023-04-12T12:00:00Z 112
2023-04-12T13:00:00Z 113
2023-04-12T14:00:00Z 114
2023-04-12T15:00:00Z 115
2023-04-12T16:00:00Z 116
2023-04-12T17:00:00Z 117
2023-04-12T18:00:00Z 118
2023-04-12T19:00:00Z 119
2023-04-12T20:00:00Z 120
2023-04-12T21:00:00Z 121
2023-04-12T22:00:00Z 122
2023-04-12T23:00:00Z 123
2023-04-13T00:00:00Z 124
2023-04-13T01:00:00Z 125
2023-04-13T02:00:00Z 126
2023-04-13T03:00:00Z 127
2023-04-13T04:00:00Z 128
2023-04-13T05:00:00Z 129
2023-04-13T06:00:00Z 130
2023-04-13T07:00:00Z 131
2023-04-13T08:00:00Z 132
2023-04-13T09:00:00Z 133
2023-04-13T10:00:00Z 134
2023-04-13T11:00:00Z 135
2023-04-13T12:00:00Z 136
2023-04-13T13:00:00Z 137
2023-04-13T14:00:00Z 138
2023-04-13T15:00:00Z 139
2023-04-13T16:00:00Z 140
2023-04-13T17:00:00Z 141
2023-04-13T18:00:00Z 142
2023-04-13T19:00:00Z 143
2023-04-13T20:00:00Z 144
2023-04-13T21:00:00Z 145
2023-04-13T22:00:00Z 146
2023-04-13T23:00:00Z 147

Then if you take the spread in each day you will get:

from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  |> aggregateWindow(every: 1d, fn: spread, createEmpty: false)
  |> yield(name: "Wh") 
day first last spread
2023-04-13T00:00:00Z 100 123 23
2023-04-13T00:00:00Z 124 147 23

The actual consumption for each day, instead is 24. By using spread you are missing one interval for each day (the consumption from 123 to 124 is never accounted). Of course if your data has higher granulatiry (e.g.: every minute or every second) the missing value will be a lot less significant.

To solve this I would suggest just getting a single value for each day (the last) and then using the difference function. This will then do a "rolling difference" so subtract to each value the previous one and will give you a better result (all hours will be accounted for).

from(bucket: "a")
  |> range(start: v.timeRangeStart, stop: v.timeRangeStop)
  |> filter(fn: (r) => r["_measurement"] == "el")
  |> filter(fn: (r) => r["_field"] == "ACTIVE_IMPORT")
  |> aggregateWindow(every: 1d, fn: last, createEmpty: false)
  |> difference()
  |> yield(name: "Wh")

Aggregating with last will give:

day last
2023-04-13T00:00:00Z 123
2023-04-13T00:00:00Z 147

Then applying difference will result in:

day difference(last)
2023-04-13T00:00:00Z
2023-04-13T00:00:00Z 24

NOTE: the first value will be null since it does not have anything before it to di the difference with

Matteo Zanoni
  • 3,429
  • 9
  • 27