SQL query/UDF across columns in GROUP by

Question

I'm working with a table similar to this in bigquery at my job:

id | x | y

a | 1 | 2
a | 2 | 3
a | 3 | 4

b | 1 | 2
b | 2 | 3
b | 3 | 2

c | 3 | 2
c | 2 | 4
c | 3 | 4
...

We want to take this data and perform the following transformation:

For each unique id (eg a, b, c), we want to aggregate the x and y values into an array.

For example, for id a, we would get the array [1,2,3,2,3,4].

Basically, the output table should look like this:

id | array

a | [1,2,3,2,3,4]
b | [1,2,3,2,3,2]
c | [3,2,3,2,4,4]

I'm not sure how to achieve using just SQL/ JS UDFs in big query.

I would have just written a quick Python script to do this, but given that we have a massive number of ids, we want this to be scalable.

Is there a way to achieve this using only SQL/ JS UDFs. My understanding is that there are not any aggregation functions to achieve this across columns. Am I correct?

I looked into the Google big query docs for user defined functions: https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions, but could not find a solution.

Does this answer your question? [Aggregate multiple columns into array](https://stackoverflow.com/questions/73890364/aggregate-multiple-columns-into-array) or, for unpivot you can find another approach here not using `UNPIVOT` clause [BigQuery Standard SQL Group by aggregate multiple columns](https://stackoverflow.com/questions/55126050/bigquery-standard-sql-group-by-aggregate-multiple-columns) — Jaytiger, Jan 25 '23 at 04:01

Jaytiger · Accepted Answer · 2023-01-25T04:26:16.637

Another option not mentioned previous links would be using array concat operator || like below. (similar with @Mikhail' answer though)

WITH sample_table AS (
  SELECT 'a' id, 1 x, 2 y UNION ALL
  SELECT 'a' id, 2 x, 3 y UNION ALL
  SELECT 'a' id, 3 x, 4 y UNION ALL
  SELECT 'b' id, 1 x, 2 y UNION ALL
  SELECT 'b' id, 2 x, 3 y UNION ALL
  SELECT 'b' id, 3 x, 2 y UNION ALL
  SELECT 'c' id, 3 x, 2 y UNION ALL
  SELECT 'c' id, 2 x, 4 y UNION ALL
  SELECT 'c' id, 3 x, 4 y
)
SELECT id, ARRAY_AGG(x) || ARRAY_AGG(y) AS `array`
  FROM sample_table
 GROUP BY 1;

-- from @Mikhail's answer
-- you will see different order of element in an array from above, but both have same set of elements.
SELECT id, ARRAY_CONCAT_AGG([x, y]) AS `array`
  FROM sample_table
 GROUP BY 1;

Query results

SQL query/UDF across columns in GROUP by

1 Answers1