1

How to make a UNION ALL work for repeated fields if the order of the fields does not match?

In the example below I try to UNION data_1_nested and data_2_nested, while the repeated field nested has two fields: id and age but in different order.

I could UNNEST and renest but this would not be very helpful if I have more then 1 nested field that I need to UNION on.

Example:

with 
data_1 as (
Select 'a123' as id, 1 as age, 'a' as grade
union all 
Select 'a123' as id, 3 as age,'b' as grade
union all 
Select 'a123' as id, 4.5 as age,'c' as grade
)
,
data_2 as (
Select 'b456' as id, 6 as age,'e' as grade
union all 
Select 'b456' as id, 5 as age,'f' as grade
union all 
Select 'b456' as id, 2.5 as age,'g' as grade
)
,
data_1_nested as (
SELECT id, 
       array_agg(STRUCT(
                      age,grade
                        ))  as nested
from data_1                      
group by 1
)
,
data_2_nested as (
SELECT id, 
       array_agg(STRUCT(
                      grade, age
                        ))  as nested
from data_2                      
group by 1
)


SELECT * from data_1_nested
union all 
SELECT * from data_2_nested
Ilja
  • 993
  • 2
  • 17
  • 36

2 Answers2

1

Below should work for you

select * from data_1_nested
union all 
select id, array(select as struct age, grade from t.nested) from data_2_nested t   

if applied to sample data from your question - output is

enter image description here

Mikhail Berlyant
  • 165,386
  • 8
  • 154
  • 230
0

I modified your data a little bit to make 2 nested fields that need to be union. I also added a JS function for parsing the JSON. It is an ugly solution, but it seems to be working. Not sure if it is scalable (how many functions have to be created to covert different nested fields).

CREATE TEMP FUNCTION JsonToItems(input STRING)
RETURNS ARRAY<STRUCT<age INT64, grade STRING>>
LANGUAGE js AS """
return JSON.parse(input);
""";

with
data_1 as (
Select 'a123' as id, 1 as age, 'a' as grade
union all
Select 'a123' as id, 3 as age,'b' as grade
union all
Select 'a123' as id, 4.5 as age,'c' as grade
)
,
data_2 as (
Select 'b456' as id, 6 as age,'e' as grade
union all
Select 'b456' as id, 5 as age,'f' as grade
union all
Select 'b456' as id, 2.5 as age,'g' as grade
)
,
data_1_nested as (
SELECT id,
       array_agg(STRUCT(
                      age,grade
                        ))  as nested,
       array_agg(STRUCT(
                      age,grade
                        ))  as nested2
from data_1
group by 1
)
,
data_2_nested as (
SELECT id,
       array_agg(STRUCT(
                      grade, age
                        ))  as nested,
       array_agg(STRUCT(
              grade, age
                ))  as nested2
from data_2
group by 1
)

select id, JsonToItems(json), JsonToItems(json2)  from (
    SELECT id, TO_JSON_STRING(nested) as json, TO_JSON_STRING(nested2) as json2 from data_1_nested
    union all
    SELECT id, TO_JSON_STRING(nested) as json, TO_JSON_STRING(nested2) as json2 from data_2_nested
  );
Kyrylo Bulat
  • 740
  • 8
  • 11
  • Kyrylo, it was by design that I reversed grade and age in data_2_nested -- so as to show the problem that I actually face. data_1_nested and data_2_nested are "given" and all manipulations should start from there. – Ilja Dec 15 '20 at 17:42