-2

I have a table like this:

+--------+-------+--------+-------+
| attr1  | attr2 | attr3  | attr4 |
+--------+-------+--------+-------+
| purple | wine  | clear  |  10.0 |
| red    | wine  | solid  |  20.0 |
| red    | beer  | cloudy |  10.0 |
| purple | ale   | clear  |  34.0 |
| blue   | ale   | solid  |  16.0 |
+--------+-------+--------+-------+

that i want to transform like this:

+--------+-------+-------+-------+-------+
|        | attr1 | attr2 | attr3 | attr4 |
+--------+-------+-------+-------+-------+
| purple |     2 |       |       |       |
| red    |     2 |       |       |       |
| blue   |     1 |       |       |       |
| wine   |       |     2 |       |       |
| beer   |       |     1 |       |       |
| ale    |       |     2 |       |       |
| clear  |       |       |     2 |       |
| solid  |       |       |     2 |       |
| cloudy |       |       |     1 |       |
| 10.0   |       |       |       |     2 |
| 20.0   |       |       |       |     1 |
| 34.0   |       |       |       |     1 |
| 16.0   |       |       |       |     1 |
+--------+-------+-------+-------+-------+

This pivoted or cross-table will show me the count of each attribute value in their respective columns.

How do i use the Google Query language to display such a cross-table?

sifar
  • 1,086
  • 1
  • 17
  • 43

1 Answers1

2

Well if the data were laid out in two columns it would be straightforward e.g. for something like this

Attrib  Column
Red     1
Red     1
Green   1
Blue    1
Beer    2
Ale     2
Ale     2

you could use a query like

=query(A:B,"select A,count(A) where A<>'' group by A pivot  B")

So the problem is to organise OP#s data into two columns.

This can be done by what is by now a fairly standard split/join/transpose technique

=ArrayFormula(split(transpose(split(textjoin("|",true,if(A2:D="","",A2:D&" "&column(A2:D))),"|"))," "))

Giving

enter image description here

You could either run the query on the result of this or combine the two like this

 =ArrayFormula(query({"Attrib","Number";split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&column(A2:D))),"|"))," ")},"Select Col1,count(Col1) group by Col1 pivot Col2"))

I have joined the column number to the attribute e.g. 1-blue so that it sorts into the right order. If you don't like it, you could get rid of it using regexreplace.

Edit

Slightly shorter formula - I didn't need to put the headers in separately:

=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" Attr"&column(A2:D))),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))

Edit 2

I was being a bit thick there, should have used first row of OP's data as attribute labels instead of column numbers

=ArrayFormula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" "&A1:D1)),"|"))," "),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))

enter image description here

Edit 3

Should have chosen a better pair of delimiters

    =ArrayFormula(query(split(transpose(split(textjoin("",true,if(A2:D="","",column(A2:D)&"-"&A2:D&""&A1:D1)),"")),""),
"Select Col1,count(Col1) group by Col1 pivot Col2",0))
Tom Sharpe
  • 30,727
  • 5
  • 24
  • 37
  • how do we replace the values for the headers `1,2,3,4` with `Attr1,Attr2,Attr3,Attr4`? – sifar Jul 20 '18 at 09:01
  • This should be easy but for some reason the query wanted to use the whole first row of data as headers. Anyway the formula is =ArrayFormula(query({"Attrib","Number";split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&" Attr"&column(A2:D))),"|"))," ")}, "Select Col1,count(Col1) group by Col1 pivot Col2 label Col1 ''",1)) – Tom Sharpe Jul 20 '18 at 09:33
  • Brilliant and concise answer! I implemented the headers myself by slightly modifying your formula to consider the 1st row instead of columns, viz., `=arrayformula(query(split(transpose(split(textjoin("|",true,if(A3:D="","",column(A3:D)&"-"&A3:D&"^"&A2:D2)),"|")),"^"),"SELECT Col1, COUNT(Col1) GROUP BY Col1 PIVOT Col2"))`. Could you please edit and add it to your answer. Thanks a bunch! :) – sifar Jul 20 '18 at 09:38
  • I just saw your message regarding replacing the headers. The problem with this solution is the headers are not necessarily going to be starting with `Attr`. It was used as an example. So replacing column numbers with the 1st columns values works! `=arrayformula(query(split(transpose(split(textjoin("|",true,if(A2:D="","",column(A2:D)&"-"&A2:D&"^"&A1:D1)),"|")),"^"),"SELECT Col1, COUNT(Col1) GROUP BY Col1 PIVOT Col2"))`. Thanks. – sifar Jul 20 '18 at 09:45
  • That's a much better idea, have added my version of it to my answer. – Tom Sharpe Jul 20 '18 at 10:11
  • the only problem i am facing is if the `headers are made of more than 1 words` i.e. have spaces in between them. So the top-left header shows all headers concatenated together. Any ideas? – sifar Jul 20 '18 at 11:06
  • 1
    Congrats on 10k – TheMaster Jul 20 '18 at 11:25
  • Congratulations @TomSharpe :) – sifar Jul 20 '18 at 11:26
  • 1
    Cheers guys, that's really kind - and thank you for your upvotes that helped to make it happen ;-) – Tom Sharpe Jul 20 '18 at 15:32
  • 1
    Re the headers having spaces, I shouldn't have chosen space for a delimiter (should have copied |'-'| and used a skull or something) but using "^" like you did in your comment should be OK? – Tom Sharpe Jul 20 '18 at 15:42
  • @TomSharpe Yes it works fine using any other chars `* ! | ^`. The problem i mentioned earlier is faced when i remove the `column(A2:D)&"-"&A2:D&` appended for sorting as showed by you in your formula. Without it, i get the top-left cell as a concatenated string of headers and next header cell as Blank. Any ideas why this behaviour is happening? BTW, I do have some blank values in the columns. – sifar Jul 20 '18 at 18:14
  • Yes I got the same thing earlier because the Query was guessing how many of the rows of data were header rows. It should be OK if you put in a zero at the end of the query as in my edits to tell it there is no header row. – Tom Sharpe Jul 20 '18 at 19:12
  • 1
    @TomSharpe how about wrapping the `split function` into a `trim function`? Seems to work! But i like your idea of adding a zero parameter to `Query`. `=arrayformula(QUERY(TRIM(split(transpose(split(textjoin("!",TRUE,if(TEST!A2:D="","",TEST!A2:D & "~" & TEST!A1:D1)),"!")),"~")),"SELECT Col1, COUNT(Col1) GROUP BY Col1 PIVOT Col2"))` – sifar Jul 20 '18 at 21:12