I'm working with a table similar to this in bigquery at my job:
id | x | y
a | 1 | 2
a | 2 | 3
a | 3 | 4
b | 1 | 2
b | 2 | 3
b | 3 | 2
c | 3 | 2
c | 2 | 4
c | 3 | 4
...
We want to take this data and perform the following transformation:
For each unique id (eg a, b, c), we want to aggregate the x and y values into an array.
For example, for id a, we would get the array [1,2,3,2,3,4].
Basically, the output table should look like this:
id | array
a | [1,2,3,2,3,4]
b | [1,2,3,2,3,2]
c | [3,2,3,2,4,4]
I'm not sure how to achieve using just SQL/ JS UDFs in big query.
I would have just written a quick Python script to do this, but given that we have a massive number of ids, we want this to be scalable.
Is there a way to achieve this using only SQL/ JS UDFs. My understanding is that there are not any aggregation functions to achieve this across columns. Am I correct?
I looked into the Google big query docs for user defined functions: https://cloud.google.com/bigquery/docs/reference/standard-sql/user-defined-functions, but could not find a solution.