4

The docs for the groupArray function warns that

Values can be added to the array in any (indeterminate) order.... In some cases, you can still rely on the order of execution. This applies to cases when SELECT comes from a subquery that uses ORDER BY.

Does this just mean that the array will not neccessarily be in the order specified in the ORDER BY? Can I depend on the order of multiple groupArrays in the same query being consistent with each other?

For instance given the records:

{commonField:"common", fieldA: "1a", fieldB:"1b"}
{commonField:"common", fieldA: "2a", fieldB:"2b"}
{commonField:"common", fieldA: "3a", fieldB:"3b"}

Can I depend on the query

SELECT commonField, groupArray(fieldA), groupArray(fieldB) FROM myTable GROUP BY commonField

to return

{
  commonField:"common",
  groupedA:[
    "2a", "3a", "1a"
  ],
  groupedB:[
    "2b", "3b", "1b"
  ]
}
lights
  • 1,034
  • 1
  • 8
  • 22

1 Answers1

4

multiple groupArrays in the same query being consistent with each other?

Yes. They will be consistent.

Anyway you can use Tuple & single groupArray. And Tuple is usefull if you have NULLs, because ALL aggregate functions skip Nulls.

create table test (K Int64, A Nullable(String), B Nullable(String)) Engine=Memory;
insert into test values(1, '1A','1B')(2, '2A', Null);


select groupArray(A), groupArray(B) from test;
┌─groupArray(A)─┬─groupArray(B)─┐
│ ['1A','2A']   │ ['1B']        │
└───────────────┴───────────────┘

---- Tuple (A,B) one groupArray ----

select groupArray( (A,B) ) from test;
┌─groupArray(tuple(A, B))───┐
│ [('1A','1B'),('2A',NULL)] │
└───────────────────────────┘

select (groupArray( (A,B) ) as ga).1 _A, ga.2 _B from test;
┌─_A──────────┬─_B──────────┐
│ ['1A','2A'] │ ['1B',NULL] │
└─────────────┴─────────────┘

---- One more Tuple trick - Tuple(Null) is not Null ----

select groupArray(tuple(A)).1 _A , groupArray(tuple(B)).1 _B from test;
┌─_A──────────┬─_B──────────┐
│ ['1A','2A'] │ ['1B',NULL] │
└─────────────┴─────────────┘

---- One more Tuple trick tuple(*)

select groupArray( tuple(*) ) from test;
┌─groupArray(tuple(K, A, B))────┐
│ [(1,'1A','1B'),(2,'2A',NULL)] │
└───────────────────────────────┘
Denny Crane
  • 11,574
  • 2
  • 19
  • 30