-1

I would like to keep a running/cumulative array of new IDs.

Starting with this:

Date IDs_Used_Today New_IDs
Dec 6 1, 2, 3 1, 2, 3
Dec 7 1, 4 4
Dec 8 2, 3, 4 3
Dec 9 1, 2, 3, 5 5

And getting this:

Date IDs_Used_Today New_IDs All_IDs_To_Date
Dec 6 1, 2, 3 1, 2, 3 1, 2, 3
Dec 7 1, 4 4 1, 2, 3, 4
Dec 8 2, 3, 4 null 1, 2, 3, 4
Dec 9 1, 2, 3, 5 5 1, 2, 3, 4, 5

I need to do this by getting the values for "All_IDs_To_Date" from previous "All_IDs_To_Date" + "New_IDs"

by doing it that way, the table will always be accurate as long as there is one previous row of data.

So basically a combination of CONCAT( LAG(All_IDs_To_Date), New_IDs) with an IF conditional when there is no LAG(ALL_IDs_To_Date) then use that date's "New_IDs" value.

It is very important that if old rows are deleted, the most current rows keep the same data. Meaning if I start with 10 rows stored, with the last running total being "1,2,3,4,5" and then I delete the first 9 rows. My next calculation would be based off that last stored row, so my running total would still be adding to the "1,2,3,4,5" that was previously stored.

1 Answers1

0

Once you have unnested every element of "New_IDs", you can select the first time each element appears, then use ARRAY_AGG window function to compute a running array aggregation over your date. ARRAY_REMOVE is needed to remove null values, generated by days without new ids.

WITH cte AS (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY elements ORDER BY date_) AS rn
    FROM tab, UNNEST(New_IDs) AS elements
)
SELECT DISTINCT date_, ids_used_today, new_ids, 
                ARRAY_REMOVE(ARRAY_AGG(CASE WHEN rn = 1 THEN elements END) OVER(ORDER BY date_), NULL) AS All_IDs_To_Date
FROM cte
ORDER BY date_

Check the demo here.

lemon
  • 14,875
  • 6
  • 18
  • 38
  • so each calculation does not rely on tallying up every field in the table each time? This way still requires having the first entries available to get the accurate running ids right? – theOtherOne Dec 15 '22 at 20:38
  • if the running ids was "1,2,3,4,5" and we removed the first row when id 1 was entered. Would the next calculation of running ids become "2,3,4,5" or stay as "1,2,3,4,5" because it would be grabbing from previous running total entry. Not recalculating all running totals each time? – theOtherOne Dec 15 '22 at 20:39
  • also instead of having it be NULL when there is no previous value, can it else to that dates new_fields? would I simply replace NULL with ARRAY_AGG(new_fields) – theOtherOne Dec 15 '22 at 20:41
  • also this query results in all rows for the new column being null – theOtherOne Dec 15 '22 at 20:46
  • "*if the running ids was "1,2,3,4,5" and we removed the first row when id 1 was entered*" >> try it directly in the fiddle directly, I made it for you for this very purpose, so that you can change data and see how the query behaves. – lemon Dec 15 '22 at 21:19
  • "*can it else to that dates new_fields? would I simply replace NULL with ARRAY_AGG(new_fields)*" >> what do you mean? – lemon Dec 15 '22 at 21:20