I need to calculate weekly and monthly moving averages per sensor per day for a large set of sample data based on some quality criteria. I have a working solution based on correlated sub-queries (or self joins), but I was wondering if using analytic functions is possible and would result in better performance?
Here is what I have right now (simplified):
CREATE TABLE Samples(
SensorId int,
SampleTime datetime,
Value float,
Quality float
)
WITH DailyAvg (SensorId, SampleDate, ValueSum, ValueCount)
AS
(
SELECT
SensorId,
CAST(SampleTime AS DATE) AS SampleDate,
SUM(Value) AS ValueSum,
COUNT_BIG(Value) AS ValueCount
FROM Samples
WHERE Quality > 0.95
GROUP BY SensorId, CAST(SampleTime AS DATE)
)
SELECT
SensorId,
SampleDate,
( SELECT SUM(d2.ValueSum) / SUM(d2.ValueCount) FROM DailyAvg AS d2 WHERE d2.SensorId = d1.SensorId AND d2.SampleDate BETWEEN DATEADD(DAY, -7, d1.SampleDate) AND d1.SampleDate) AS AverageLastWeek,
( SELECT SUM(d2.ValueSum) / SUM(d2.ValueCount) FROM DailyAvg AS d2 WHERE d2.SensorId = d1.SensorId AND d2.SampleDate BETWEEN DATEADD(DAY, -14, d1.SampleDate) AND d1.SampleDate) AS AverageLast2Weeks,
( SELECT SUM(d2.ValueSum) / SUM(d2.ValueCount) FROM DailyAvg AS d2 WHERE d2.SensorId = d1.SensorId AND d2.SampleDate BETWEEN DATEADD(MONTH, -1, d1.SampleDate) AND d1.SampleDate) AS AverageLastMonth
FROM DailyAvg d1
ORDER BY SensorId, SampleDate
I've tried replacing the sub-query for weekly average with the snippet below, but it obviously cannot handle days without any samples correctly. I thought of using RANGE or PARTITION BY expressions, but I cannot figure out how to specify the window frame to select the samples from e.g. "last week".
SUM(ValueSum) OVER(PARTITION BY SensorId ORDER BY SampleTime ROWS 7 PRECEDING) / SUM(ValueCount) OVER(PARTITION BY SensorId ORDER BY SampleTime ROWS 7 PRECEDING) AS AverageLastWeek
I even considered "Quirky Update", but besides being messy I don't think it makes sense with this many days being averaged over.