2

If i have 2 columns viz., ID & Name, ID column containing duplicates, and if i want to group by ID to get unique ID's but name column should be a comma-separated list, can this be possible in Google Query?

| ID   | Name |
===============
| 1001 | abc  |
---------------
| 1001 | def  |
---------------
| 1002 | kjg  |
---------------
| 1003 | aof  |
---------------
| 1003 | lmi  |
---------------
| 1004 | xyz  |
---------------

into

| ID   | Name      |
====================
| 1001 | abc, def  |
--------------------
| 1002 | kjg       |
--------------------
| 1003 | aof, lmi  |
--------------------
| 1004 | xyz       |
--------------------
sifar
  • 1,086
  • 1
  • 17
  • 43

4 Answers4

3

try:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col2) 
  where Col1 is not null 
  group by Col1 
  pivot Col3"), 
 "select Col1 
  offset 1", 0), REGEXREPLACE(TRIM(
 TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B}, 
 "select max(Col2) 
  where Col1 is not null 
    and Col2 <> ',' 
  group by Col1 
  pivot Col3"), 
 "offset 1", 0)),,999^9))), ",$", )})

enter image description here

however, this may not work for massive datasets due to TRIM (which is needed to remove empty spaces) and REGEXREPLACE (which is needed to remove the end comma) limitations. otherwise, without it, the formula can handle anything:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col2) 
  where Col1 is not null 
  group by Col1 
  pivot Col3"), 
 "select Col1 
  offset 1", 0), 
 TRANSPOSE(QUERY(TRANSPOSE(QUERY(QUERY({A2:B&",", B2:B}, 
 "select max(Col2) 
  where Col1 is not null 
    and Col2 <> ',' 
  group by Col1 
  pivot Col3"), 
 "offset 1", 0)),,999^9))})
player0
  • 124,011
  • 12
  • 67
  • 124
  • 1
    Very ingenious answer! This is what i call sheer brilliance. Thanks a lot. I am sure that Query without the select statement and large count of headers did the trick. I am mainly puzzled by the Offset 1. Can you explain the entire workings of the formula? – sifar Dec 07 '20 at 19:05
  • 1
    @sifar the query pivots 3rd column so the query outputs the 3rd column as header row number 1, which is totally unnecessary so we use "offset 1" to offset the whole query output by 1 row which results in removing the header row (pivot residue) – player0 Dec 07 '20 at 19:11
2

I looked through Query specification. I could not find a solution. So I made some formulas that do the job (because I found this task interesting). enter image description here

D2 contains =unique(a2:a)

E2 contains =join(", ",transpose(filter($B$2:$B,$A$2:$A=D2)))and it's copied down.

I had to copy formulas down (far from beautiful formula) Hope you find it helpful.

Reference

Raserhin
  • 2,516
  • 1
  • 10
  • 14
Krzysztof Dołęgowski
  • 2,583
  • 1
  • 5
  • 21
1

Here is an answer using QUERY.

=ARRAYFORMULA(REGEXREPLACE(TRIM(SPLIT(TRANSPOSE(SPLIT(
 CONCATENATE(TRANSPOSE(QUERY({"♦"&A2:A&"♠", B2:B&", "}, 
 "select max(Col2) where Col2 is not null group by Col2 pivot Col1", 0))), 
 "♦")), "♠")), ",$", ))

This comes directly from this question. Player0 has answers with just amazing formulas that are able to reorganise data in a huge variety of ways.

enter image description here

kirkg13
  • 2,955
  • 1
  • 8
  • 12
  • i am getting `Text result of CONCATENATE is longer than the limit of 50000 characters.` – sifar Dec 07 '20 at 13:58
  • Is that with sample data, or your "production" sheet? How many rows of data in columns A and B? Are the values in A or B large blocks of text? That would explain the error. It is possible that this formula could be redeveloped to avoid this issue. – kirkg13 Dec 07 '20 at 14:08
  • it is with my production sheet which contains 6K rows. – sifar Dec 07 '20 at 14:09
1

if you could live with the end-comma present in the output you can try:

=ARRAYFORMULA({QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1 
  pivot Col2"),
 "select Col1 offset 1", 0), 
 TRANSPOSE(QUERY(TRANSPOSE(IFERROR(VLOOKUP(QUERY(QUERY({A2:B, B2:B}, 
 "select Col1,max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1 
  pivot Col2"),
 "select Col1 offset 1", 0), 
 QUERY(QUERY({A2:B, B2:B&","}, 
 "select Col1,max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1 
  pivot Col2"),
 "offset 1", 0), 
 SPLIT(TRANSPOSE(QUERY(TRANSPOSE(IF(QUERY(QUERY({A2:B, B2:B&","}, 
 "select max(Col3) 
  where Col1 is not null 
    and Col3 <> ',' 
  group by Col1    
  pivot Col2"),
 "offset 1", 0)="",,COLUMN(B2:XXX)&",")),,999^99)), ","), 0))),,999^99))})

enter image description here

(tho this was never tested on an ultra-massive dataset but in theory, it should handle anything too)

player0
  • 124,011
  • 12
  • 67
  • 124
  • Wow! brilliant. I will check it in the morning with a larger dataset. – sifar Dec 07 '20 at 21:40
  • can you help me with [this](https://stackoverflow.com/questions/65212862/create-percentage-table-from-2-other-tables-google-query)? – sifar Dec 09 '20 at 10:42