
I have an application that logs data roughly every second, but only if the packet is healthy, which means logging at a fixed interval is not guaranteed. Now the client wants to see the log data only with a gap of at least 1 minute between entries. For example, here are the sample schema and data:

declare @mytable as table(ID int, tm DATETIME, Val int)

insert into @mytable values
(1,     convert(DATETIME, N'2018-12-24 17:45:25.000', 21), 1),
(2,     convert(DATETIME, N'2018-12-24 17:45:35.000', 21), 2),
(3,     convert(DATETIME, N'2018-12-24 17:45:55.000', 21), 1),
(4,     convert(DATETIME, N'2018-12-24 17:46:05.000', 21), 5),
(5,     convert(DATETIME, N'2018-12-24 17:47:25.000', 21), 2),
(6,     convert(DATETIME, N'2018-12-24 17:47:55.000', 21), 7),
(7,     convert(DATETIME, N'2018-12-24 17:48:25.000', 21), 9),
(8,     convert(DATETIME, N'2018-12-24 17:48:50.000', 21), 1),
(9,     convert(DATETIME, N'2018-12-24 17:49:25.000', 21), 1),
(10,    convert(DATETIME, N'2018-12-24 17:50:05.000', 21), 5),
(11,    convert(DATETIME, N'2018-12-24 17:50:50.000', 21), 4),
(12,    convert(DATETIME, N'2018-12-24 17:51:25.000', 21), 8),
(13,    convert(DATETIME, N'2018-12-24 17:55:25.000', 21), 4),
(14,    convert(DATETIME, N'2018-12-24 17:56:58.000', 21), 4),
(15,    convert(DATETIME, N'2018-12-24 17:58:15.000', 21), 5),
(16,    convert(DATETIME, N'2018-12-24 18:10:25.000', 21), 8);

The output should be

ID          tm                      Val
----------- ----------------------- -----------
1           2018-12-24 17:45:25.000 1
5           2018-12-24 17:47:25.000 2
7           2018-12-24 17:48:25.000 9
9           2018-12-24 17:49:25.000 1
11          2018-12-24 17:50:50.000 4
13          2018-12-24 17:55:25.000 4
14          2018-12-24 17:56:58.000 4
15          2018-12-24 17:58:15.000 5
16          2018-12-24 18:10:25.000 8

Here, after a qualified entry, the next entry qualifies only if it comes at least 1 minute later. However, I still have to log all the healthy entries, because another procedure has to return all records, so skipping rows at logging time is not an option. I also know I could achieve this with a cursor, but given the amount of data, that is not an acceptable solution.

Harsh Shankar
  • The best performing solution is likely an additional bit field that is calculated on insert, which indicates healthy or not. – Pablo Henkowski Dec 25 '18 at 08:12
  • @PabloHenkowski Only healthy packets are inserted. However, when displaying, I have to show records with a gap of 1 minute. The user has the option to select the start date/time, so a flag in the log table would not be very helpful, since it would only be valid relative to a fixed start at the first record. – Harsh Shankar Dec 25 '18 at 08:21
  • Maybe the needed solution is to group by time rounded to minutes and show the average/minimum/maximum/sum value for each grouped period. – Andrey Nikolov Dec 25 '18 at 08:24
  • @AndreyNikolov Rounding to the minute will not work. Suppose the entries are at 10:00:50, 10:01:10 and 10:01:55: the query has to pick the entries at 10:00:50 and 10:01:55, since the latter is the first with a gap of 1 minute after the first qualified record (10:00:50). – Harsh Shankar Dec 25 '18 at 08:35
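To make the comment discussion concrete, here is a sketch (against the sample `@mytable` above) of why per-minute bucketing differs from the required rule: grouping keys rows to fixed calendar-minute boundaries, not to the previously qualified row.

```sql
-- Group the sample rows by calendar minute and keep the first row of each
-- bucket; this is NOT the required result, only an illustration.
SELECT
    MIN(ID) AS first_id_in_minute,
    DATEADD(MINUTE, DATEDIFF(MINUTE, 0, tm), 0) AS minute_bucket
FROM @mytable
GROUP BY DATEADD(MINUTE, DATEDIFF(MINUTE, 0, tm), 0);
-- This keeps ID 4 (17:46:05), which is only 40 seconds after ID 1
-- (17:45:25) but falls in a new minute bucket, while the greedy 1-minute
-- rule skips it. GROUP BY cannot express "at least 1 minute after the
-- previous *qualified* row".
```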

1 Answer

Using a recursive common table expression:

;WITH CTE (Id, tm, val)
AS
(
    -- anchor: the earliest logged row
    SELECT TOP (1) ID, tm, Val
    FROM @mytable
    ORDER BY tm, ID

    UNION ALL

    SELECT
        p.Id,
        p.tm,
        p.val
    FROM
    (
        SELECT
            m.ID AS Id,
            m.tm,
            m.Val AS val,
            -- order by time (not c.Id, which is constant within a recursion
            -- level) so row_num = 1 is the earliest qualifying next row
            ROW_NUMBER() OVER(ORDER BY m.tm, m.ID) AS row_num
        FROM CTE c
        INNER JOIN @mytable m
            ON c.Id < m.ID
           AND DATEADD(MINUTE, 1, c.tm) <= m.tm
    ) AS p
    WHERE p.row_num = 1
)
SELECT *
FROM CTE c
OPTION (MAXRECURSION 0);  -- the default limit of 100 recursion levels
                          -- would fail once more than 100 rows qualify

Id          tm                      val
----------- ----------------------- -----------
1           2018-12-24 17:45:25.000 1
5           2018-12-24 17:47:25.000 2
7           2018-12-24 17:48:25.000 9
9           2018-12-24 17:49:25.000 1
11          2018-12-24 17:50:50.000 4
13          2018-12-24 17:55:25.000 4
14          2018-12-24 17:56:58.000 4
15          2018-12-24 17:58:15.000 5
16          2018-12-24 18:10:25.000 8

But I'm not sure this solution will perform well on big tables with many records. Maybe you need some kind of post-processing after inserting into the log table.
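For such post-processing, one possible shape (a sketch only, not tested at scale; the `@qualified` side table and variable names are illustrative) is a set-based loop that carries the last qualified timestamp forward and fetches one qualifying row per iteration, avoiding both a cursor and the recursion limit:

```sql
-- Collect the IDs of qualifying rows into a side table.
DECLARE @qualified TABLE (ID int PRIMARY KEY);
DECLARE @lastId int, @lastTm DATETIME, @nextId int, @nextTm DATETIME;

-- seed with the earliest row
SELECT TOP (1) @lastId = ID, @lastTm = tm
FROM @mytable
ORDER BY tm, ID;

WHILE @lastId IS NOT NULL
BEGIN
    INSERT INTO @qualified (ID) VALUES (@lastId);

    -- earliest row at least one minute after the last qualified one
    SELECT TOP (1) @nextId = ID, @nextTm = tm
    FROM @mytable
    WHERE tm >= DATEADD(MINUTE, 1, @lastTm)
    ORDER BY tm, ID;

    -- TOP (1) leaves the variables unchanged when nothing matches,
    -- so end the loop explicitly
    IF @@ROWCOUNT = 0
        SET @lastId = NULL;
    ELSE
        SELECT @lastId = @nextId, @lastTm = @nextTm;
END

SELECT m.*
FROM @mytable m
INNER JOIN @qualified q ON q.ID = m.ID
ORDER BY m.tm;
```

With an index on `tm`, each iteration is a single seek, so this does one cheap lookup per qualified row rather than one per logged row as a cursor would.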

Backs