Based on the BigQuery best practice of using ARRAY_AGG()
to obtain the latest record, here's how I'm getting the first, last minimum and maximum values of a field for a day. The data is reported approximately hourly.
WITH t AS (
SELECT TIMESTAMP('2021-01-01 01:00:00') as l, 10 as v
UNION ALL
SELECT TIMESTAMP('2021-01-01 02:00:00') as l, 12 as v
UNION ALL
SELECT TIMESTAMP('2021-01-01 03:00:00') as l, 15 as v
UNION ALL
SELECT TIMESTAMP('2021-01-01 04:00:00') as l, 2 as v
UNION ALL
SELECT TIMESTAMP('2021-01-02 01:00:00') as l, 600 as v
UNION ALL
SELECT TIMESTAMP('2021-01-02 02:00:00') as l, 120 as v
UNION ALL
SELECT TIMESTAMP('2021-01-02 03:00:00') as l, 150 as v
UNION ALL
SELECT TIMESTAMP('2021-01-03 04:00:00') as l, 0 as v)
SELECT EXTRACT(DATE FROM l) d,
ARRAY_AGG(t.v ORDER BY t.l ASC LIMIT 1)[OFFSET(0)] first_value,
ARRAY_AGG(t.v ORDER BY t.l DESC LIMIT 1)[OFFSET(0)] last_value,
ARRAY_AGG(t.v ORDER BY t.v DESC LIMIT 1)[OFFSET(0)] max_value,
ARRAY_AGG(t.v ORDER BY t.v ASC LIMIT 1)[OFFSET(0)] min_value,
FROM
t
GROUP BY
d
Output:
Row | d | max_value | min_value | last_value | first_value |
---|---|---|---|---|---|
1 | 2021-01-01 | 15 | 2 | 2 | 10 |
2 | 2021-01-02 | 600 | 120 | 150 | 600 |
3 | 2021-01-03 | 0 | 0 | 0 | 0 |
Since there are only six BigQuery questions on Code Review, I thought I'd ask here on the main Stack Overflow. Is this the fastest method? Do I have anything extraneous in my query? (I'm not too sure that [OFFSET(0)] is doing anything.)
I've seen this question asked on Stack Overflow for Oracle, T-SQL and Postgres but I haven't seen anything specific for BigQuery. Thanks!