11

In our database we do have a table that keeps track of the power consumption of a device. The rate at which new values get inserted is not fixed, they only get written when there really is a change, so the temporal distance between the values is varying and may reach from 1 second to several minutes. The entries consist of a timestamp and a value. The value always increases with every new row as it counts the kWh.

What I want to achieve is the following: I want to specify a start and an end datetime, let's say a month. I also want to specify an interval like 15 minutes, 1 hour, 1 day or similar. The outcome I need to get is in the form of [Beginning of interval as datetime], [power consumption in that interval], e.g. like this (where interval would be set to 1 hour):

2015-01.01 08:00:00   -   65
2015-01.01 09:00:00   -   43
2015-01.01 10:00:00   -   56

This is what the table looks like:

TimeStamp            Value
-------------------------
2015-01-08 08:29:47, 5246
2015-01-08 08:36:15, 5247
2015-01-08 08:37:10, 5248
2015-01-08 08:38:01, 5249
2015-01-08 08:38:38, 5250
2015-01-08 08:38:51, 5251
2015-01-08 08:39:33, 5252
2015-01-08 08:40:20, 5253
2015-01-08 08:41:10, 5254
2015-01-09 08:56:25, 5255
2015-01-09 08:56:43, 5256
2015-01-09 08:57:31, 5257
2015-01-09 08:57:36, 5258
2015-01-09 08:58:02, 5259
2015-01-09 08:58:57, 5260
2015-01-09 08:59:27, 5261
2015-01-09 09:00:06, 5262
2015-01-09 09:00:59, 5263
2015-01-09 09:01:54, 5265
2015-01-09 09:02:44, 5266
2015-01-09 09:03:39, 5267
2015-01-09 09:04:22, 5268
2015-01-09 09:05:11, 5269
2015-01-09 09:06:08, 5270

I have the feeling that I would have to combine the SUM() function with a GROUP BY, but I have no clue how to do that, because as far as I can see I would also have to consider only the growth per interval and not the sum of the absolute values within that interval. It would be great if someone could bring me onto the right track.

Johan
  • 74,508
  • 24
  • 191
  • 319
Rob
  • 11,492
  • 14
  • 59
  • 94
  • growth per interval ? Compared to highest value of last interval or first value in same interval ? Logic would dictate the first. Which version of sqlserver are you using ? – t-clausen.dk Apr 22 '15 at 08:38
  • The value column is like an ever increasing counter that saves the total consumption up to that point, just like a water meter in your house. The growth per interval thereby is the maximum (or last) value minus the minimum (=first) value within a given interval. – Rob Apr 23 '15 at 09:11

3 Answers3

6

I think the best way to deal with this is to first generate your intervals, and then left join your data, since this firstly makes the grouping much less complicated for variable intervals, and also means you still get results for intervals with no data. To do this you will need a numbers table, since many people don't have one below is a quick way of generating one on the fly:

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N1.N) FROM N2 AS N1 CROSS JOIN N2 AS N2)
SELECT *
FROM Numbers;

This simply generates a sequence from 1 to 10,000. For more reading on this see the following series:

You can then define a start time, an interval and the number of records to show, and along With your numbers table you can generate your data:

DECLARE @Start DATETIME2 = '2015-01-09 08:00',
        @Interval INT = 60,     -- INTERVAL IN MINUTES
        @IntervalCount INT = 3; -- NUMBER OF INTERVALS TO SHOW

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N1.N) FROM N2 AS N1 CROSS JOIN N2 AS N2)

SELECT  TOP (@IntervalCount)
        Interval = DATEADD(MINUTE, (N - 1) * @Interval, @Start)
FROM    Numbers;

Finally you can LEFT JOIN this to your data to get the minimum and the maximum values for each interval

DECLARE @Start DATETIME2 = '2015-01-09 08:00',
        @Interval INT = 60,     -- INTERVAL IN MINUTES
        @IntervalCount INT = 3; -- NUMBER OF INTERVALS TO SHOW

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N1.N) FROM N2 AS N1 CROSS JOIN N2 AS N2),
Intervals AS
(   SELECT  TOP (@IntervalCount)
            IntervalStart = DATEADD(MINUTE, (N - 1) * @Interval, @Start),
            IntervalEnd = DATEADD(MINUTE, N  * @Interval, @Start)
    FROM    Numbers AS n
)
SELECT  i.IntervalStart,
        MinVal = MIN(t.Value),
        MaxVal = MAX(t.Value),
        Difference = ISNULL(MAX(t.Value)  - MIN(t.Value), 0)
FROM    Intervals AS i
        LEFT JOIN T AS t
            ON t.timestamp >= i.IntervalStart
            AND t.timestamp < i.IntervalEnd
GROUP BY i.IntervalStart;

If your values can go up and down within the inverval, then you will need to use a ranking function to get the first and last record for each hour, rather than min and max:

DECLARE @Start DATETIME2 = '2015-01-09 08:00',
        @Interval INT = 60,     -- INTERVAL IN MINUTES
        @IntervalCount INT = 3; -- NUMBER OF INTERVALS TO SHOW

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N1.N) FROM N2 AS N1 CROSS JOIN N2 AS N2),
Intervals AS
(   SELECT  TOP (@IntervalCount)
            IntervalStart = DATEADD(MINUTE, (N - 1) * @Interval, @Start),
            IntervalEnd = DATEADD(MINUTE, N  * @Interval, @Start)
    FROM    Numbers AS n
), RankedData AS
(   SELECT  i.IntervalStart,
            t.Value,
            t.timestamp,
            RowNum = ROW_NUMBER() OVER(PARTITION BY i.IntervalStart ORDER BY t.timestamp),
            TotalRows = COUNT(*) OVER(PARTITION BY i.IntervalStart)
    FROM    Intervals AS i
            LEFT JOIN T AS t
                ON t.timestamp >= i.IntervalStart
                AND t.timestamp < i.IntervalEnd
)
SELECT  r.IntervalStart,
        Difference = ISNULL(MAX(CASE WHEN RowNum = TotalRows THEN r.Value END) - 
                            MAX(CASE WHEN RowNum = 1 THEN r.Value END), 0)
FROM    RankedData AS r
WHERE   RowNum = 1
OR      TotalRows = RowNum
GROUP BY r.IntervalStart;

Example on SQL Fiddle with 1 Hour intervals

Example on SQL Fiddle with 15 minute intervals

Example on SQL Fiddle with 1 Day intervals


EDIT

As pointed out in comments neither of the above solutions account for the advance over period boundaries, the below will account for this:

DECLARE @Start DATETIME2 = '2015-01-09 08:25',
        @Interval INT = 5,     -- INTERVAL IN MINUTES
        @IntervalCount INT = 18; -- NUMBER OF INTERVALS TO SHOW

WITH N1 AS (SELECT N FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) t (N)),
N2 (N) AS (SELECT 1 FROM N1 AS N1 CROSS JOIN N1 AS N2),
Numbers (N) AS (SELECT ROW_NUMBER() OVER(ORDER BY N1.N) FROM N2 AS N1 CROSS JOIN N2 AS N2),
Intervals AS
(   SELECT  TOP (@IntervalCount)
            IntervalStart = DATEADD(MINUTE, (N - 1) * @Interval, @Start),
            IntervalEnd = DATEADD(MINUTE, (N - 0) * @Interval, @Start)
    FROM    Numbers AS n
), LeadData AS
(   SELECT  T.timestamp,
            T.Value,
            NextValue = nxt.value,
            AdvanceRate = ISNULL(1.0 * (nxt.Value - T.Value) / DATEDIFF(SECOND, T.timestamp, nxt.timestamp), 0),
            NextTimestamp = nxt.timestamp
    FROM    T AS T
            OUTER APPLY 
            (   SELECT  TOP 1 T2.timestamp, T2.value
                FROM    T AS T2
                WHERE   T2.timestamp > T.timestamp
                ORDER BY T2.timestamp
            ) AS nxt
)
SELECT  i.IntervalStart,
        Advance = CAST(ISNULL(SUM(DATEDIFF(SECOND, d.StartTime, d.EndTime) * t.AdvanceRate), 0) AS DECIMAL(10, 4))
FROM    Intervals AS i
        LEFT JOIN LeadData AS t
            ON t.NextTimestamp >= i.IntervalStart 
            AND t.timestamp < i.IntervalEnd
        OUTER APPLY
        (   SELECT  CASE WHEN t.timestamp > i.IntervalStart THEN t.timestamp ELSE i.IntervalStart END,
                    CASE WHEN t.NextTimestamp < i.IntervalEnd THEN t.NextTimestamp ELSE i.IntervalEnd END
        ) AS d (StartTime, EndTime)
GROUP BY i.IntervalStart;
GarethD
  • 68,045
  • 10
  • 83
  • 123
  • 1
    One point to mention: The interval-growths don't sum up to the total difference between the first and the last data row. This is because there usually is a increase between the last data row of interval 1 and the first data row within the next interval 2. This growth is neither considered in i1 nor i2. – flo Apr 23 '15 at 08:31
  • I have a little remark, too: in order to get a continuous list of intervals (without gaps), I had to change the LEFT JOIN to a LEFT OUTER JOIN and add an ON condition that filters by a device id column in my consumption values table, which I didn't mention in my question. – Rob Apr 23 '15 at 08:54
  • @flo: You're right, the values are incorrect. In a quick test that lead to a cumulated consumption of 11 kWh for a period of 48 hours, which was in fact a correct consumption of 19 kWh (total consumption at the period end datetime minus value at period start datetime). – Rob Apr 23 '15 at 09:37
  • 1
    I have added a version that will account for this advance, it is fairly similar to the accepted answer, but I saw that one of your questions was tagged with SQL-Server-2008 so posted a solution that will work in that version of SQL Server. FWIW a `LEFT JOIN` and a `LEFT OUTER JOIN` are exactly the same thing. – GarethD Apr 23 '15 at 10:05
  • @Gareth: Thanks for your improved answer, it works now. And you're right about the LEFT OUTER JOIN of course, I think I mixed up things as I'm not really used to this kind of DB "magic" ;) – Rob Apr 23 '15 at 10:38
4

Your sample data does not match the result intervals, so you may miss increases within an interval at the end or the beginning. Therefore, I assumed a linear increase between sample data rows and matched them to the result interval.

declare @start datetime2 = '2015-01-09 09:00:00'
declare @end datetime2 = '2015-01-09 09:30:00'
declare @intervalMinutes int = 5

;with intervals as (
      select @start iStart, dateadd(minute, @intervalMinutes, @start) iEnd

      union all

      select iEnd, dateadd(minute, @intervalMinutes, iEnd) from intervals
      where iEnd < @end

), increases as (
        select 
             T.Timestamp sStart, 
             lead(T.Timestamp, 1, null ) over (order by T.Timestamp) sEnd, -- the start of the next period if there is one, null else
             lead(T.value, 1, null ) over (order by T.Timestamp) - T.value increase  -- the increase within this period
        from @T T
), rates as (
        select 
           sStart rStart, 
           sEnd rEnd, 
           (cast(increase as float))/datediff(second, sStart, sEnd) rate -- increase/second
        from increases where increase is not null
), samples as (
        select *, 
            case when iStart > rStart then iStart else rStart end sStart, -- debug
            case when rEnd>iEnd then iEnd else rEnd end sEnd,  -- debug                
            datediff(second, case when iStart > rStart then iStart else rStart end, case when rEnd>iEnd then iEnd else rEnd end)*rate x -- increase within the period within the interval
        from intervals i
        left join rates r on rStart between iStart and iEnd or rEnd between iStart and iEnd or iStart between rStart and rEnd -- overlaps
)
select iStart, iEnd, isnull(sum(x), 0) from samples 
group by iStart, iEnd

The CTEs:

  • intervals holds the intervales you want data for
  • increaese calculates the increases within the sample data periods
  • rates calculates the increase per second in the sample data periods
  • samples matches the result intervals to the sample intervals by respecting the overlaps between the bounds

Finally the select sums up the sample periods matched to a single interval.

NOTES:

  • For an interval amount > [your max recursion depth] you have to use another solution to crate the intervals CTE (see @GarethD solution)
  • Debug hint: By simply using select * from samples you can see the sample periods matched to your result intervals
flo
  • 9,713
  • 6
  • 25
  • 41
1

A fast way to do this is to get the date+hour from your TimeStamp, than GROUP BY on it and the Value for the Power Consumption would be the MAX(Value) - MIN(Value). You can manipulate that TimeStamp on other ways to get different intervals, this example is just for the hourly consumption.

SELECT 
    CONVERT(datetime, CONVERT(varchar(10), TimeStamp, 120) + ' ' + CONVERT(varchar(2), DATEPART(hour, TimeStamp)) + ':00:00'),
    MAX(Value) - MIN(Value) AS Value
FROM [Table]
GROUP BY CONVERT(datetime, CONVERT(varchar(10), TimeStamp, 120) + ' ' + CONVERT(varchar(2), DATEPART(hour, TimeStamp)) + ':00:00')
George T
  • 859
  • 8
  • 16