InfluxDB: Combine data and validation series of same measurement

Question

In my InfluxDB (V 1.8) I have one measurement with one field ("value") and one tag ("id") and a lot of series. There are two kinds of series in this measurement, one type contains raw data (float values) the other one contains validation data (float values = 0.0 or 1.0). There are always two series which belong together (id = "xyz_data" and id = "xyz_validation") and they do have the same time stamps.

I would like to query the data so that I only receive the data where the validation series has a value of 1.

Example:

xyz_data contains: 2.3, 5.4, 0.0, 2.1, 6.4

xyz_validation contains: 1.0, 0.0, 1.0, 1.0, 0.0

I expect the query to return: 2.3, 0.0, 2.1

Is there a way to do that using InfluxQL?

score 0 · Answer 1 · answered Dec 10 '21 at 07:19

In the meantime I was able to find some kind of solution which helps in my case but is not very nice. It also requires that the duration between data points is constant or there is at least a minimum duration between points:

SELECT "val" FROM (SELECT sum("value")-1 AS "val", count("value") AS "cnt" FROM "channel" WHERE "id" = 'xyz_data' OR  ("id" = 'xyz_validation' AND value > 0)  AND $timeFilter GROUP BY time(1s)) where "cnt" > 1

In the inner select statement it takes all data points from the data series and only the valid points from the validation series. It also groups by time where the group time interval needs to equal to or smaller than the data frequency. Then there are three cases: One group contains no data point, one data point (then it is invalid because the validation data point is missing) or two data points (then it is a valid data point). To be able to filter only the valid points I have used a count of the points here ("cnt") and a filter in the outer select statement to return only those points where the counter is > 1. In order to get the data point value only I take the sum of the data point and validation point and subtract 1 because I know that the validation data point always has a value of 1.

score 0 · Answer 2 · answered Jun 28 '22 at 09:24

As I try to get into flux, I managed to solve this by using pivot in the flux query language:

from(bucket: "default")
  |> range(start: 2016-06-12, stop: 2016-06-14)
  |> filter(fn: (r) => r["_measurement"] == "stack")
  |> pivot(rowKey: ["_time"], columnKey: ["id"], valueColumn: "_value")
  |> filter(fn: (r) => r["xyz_validation"] == 1.0)
  |> keep(columns: ["_time", "xyz_data"])

The first three lines select the data, depending on the selected bucket, time and measurement, then pivot makes the id into columns with the values from the value column. Like this you can filter on xyz_validation and then get the xyz_data column.

You can try this in the Influx DB UI using the script builder in the Explore tab, it works in Influx DB 1.8 and 2.x.

InfluxDB: Combine data and validation series of same measurement

2 Answers2