0

I've got some periodic counter data (like once a second) from different objects that I wish to combine into an hourly total.

If I do it with separate column names, it's pretty straightforward:

CREATE TABLE ts1 (
        id      INTEGER,
        ts      TIMESTAMP,
        count0  integer,
        count1  integer,
        count2  integer
);

INSERT INTO ts1 VALUES
        (1, '2017-12-07 10:37:48', 10, 20, 50),
        (2, '2017-12-07 10:37:48', 13, 7, 88),
        (1, '2017-12-07 10:37:49', 12, 23, 34),
        (2, '2017-12-07 10:37:49', 11, 13, 46),
        (1, '2017-12-07 10:37:50', 8, 33, 80),
        (2, '2017-12-07 10:37:50', 9, 3, 47),
        (1, '2017-12-07 10:37:51', 17, 99, 7),
        (2, '2017-12-07 10:37:51', 9, 23, 96);

SELECT id, date_trunc('hour', ts + '1 hour') nts, 
           sum(count0), sum(count1), sum(count2) 
    FROM ts1 GROUP BY id, nts;

id |         nts         | sum | sum | sum
----+---------------------+-----+-----+-----
  1 | 2017-12-07 11:00:00 |  47 | 175 | 171
  2 | 2017-12-07 11:00:00 |  42 |  46 | 277
(2 rows)

The problem is that different objects have different numbers of counts (though each particular object's rows -- ones sharing the same ID -- all have the same number of counts). Hence I want to use an array.

The corresponding table looks like this:

CREATE TABLE ts2 (
        id      INTEGER,
        ts      TIMESTAMP,
        counts  INTEGER[]
);

INSERT INTO ts2 VALUES
        (1, '2017-12-07 10:37:48', ARRAY[10, 20, 50]),
        (2, '2017-12-07 10:37:48', ARRAY[13, 7, 88]),
        (1, '2017-12-07 10:37:49', ARRAY[12, 23, 34]),
        (2, '2017-12-07 10:37:49', ARRAY[11, 13, 46]),
        (1, '2017-12-07 10:37:50', ARRAY[8, 33, 80]),
        (2, '2017-12-07 10:37:50', ARRAY[9, 3, 47]),
        (1, '2017-12-07 10:37:51', ARRAY[17, 99, 7]),
        (2, '2017-12-07 10:37:51', ARRAY[9, 23, 96]);

I have looked at this answer https://stackoverflow.com/a/24997565/1076479 and I get the general gist of it, but I cannot figure out how to get the correct rows summed together when I try to combine it with the grouping by id and timestamp.

For example, with this I get all the rows, not just the ones with matching id and timestamp:

SELECT id, date_trunc('hour', ts + '1 hour') nts, ARRAY(
        SELECT sum(elem) FROM ts2 t, unnest(t.counts)
            WITH ORDINALITY x(elem, rn) GROUP BY rn ORDER BY rn
        ) FROM ts2 GROUP BY id, nts;

 id |         nts         |    array
----+---------------------+--------------
  1 | 2017-12-07 11:00:00 | {89,221,448}
  2 | 2017-12-07 11:00:00 | {89,221,448}
(2 rows)

FWIW, I'm using postgresql 9.6

Gil Hamilton
  • 11,973
  • 28
  • 51

1 Answers1

1

The problem with you original query is that you're summing all elements, because GROUP BY id, nts is executed in outer query. Combining a CTE with LATERAL JOIN would do the trick:

WITH tmp AS (
    SELECT 
        id, 
        date_trunc('hour', ts + '1 hour') nts, 
        sum(elem) AS counts
    FROM 
        ts2 
        LEFT JOIN LATERAL unnest(counts) WITH ORDINALITY x(elem, rn) ON TRUE 
    GROUP BY 
        id, nts, rn
)
SELECT id, nts, array_agg(counts) FROM tmp GROUP BY id, nts 
Michel Milezzi
  • 10,087
  • 3
  • 21
  • 36