2

I know how to use the GROUP BY clause in the QUERY function with either a single or multiple fields. This can return the single row per grouping with the maximum value for one of the fields.

This page explains it nicely using these queries and image:

=query({A2:B10},"Select Col1,min(Col2) group by Col1",1)

=query({A14:C22},"Select Col1,Col2,min(Col3) group by Col1,Col2",1)

Google Sheets using GROUP BY

However, what if I only want a query that returns the corresponding values for the most recent row, grouped by multiple fields? Is there a query that can do this?

Example

Source Table

created_at first_name last_name email address city st zip amount
4/12/2022 19:15:00 Ava Anderson ava@domain.com 123 Main St Anytown IL 12345 1.00
8/30/2022 21:38:00 Brooklyn Brown bb@domain.com 234 Lake Rd Baytown CA 54321 2.00
2/12/2022 16:58:00 Ava Anderson ava@new.com 123 Main St Anytown IL 12345 3.00
4/28/2022 01:41:00 Brooklyn Brown brook@acme.com 456 Ace Ave Bigtown NY 23456 4.00
5/03/2022 17:10:00 Brooklyn Brown bb@domain.com 234 Lake Rd Baytown CA 54321 5.00

Desired Query Result

Group by first_name, last_name, address, city, st, and zip, but return the created_at, email, and amount for the maximum (most recent) value of created_at:

created_at first_name last_name email address city st zip amount
4/12/2022 19:15:00 Ava Anderson ava@domain.com 123 Main St Anytown IL 12345 1.00
8/30/2022 21:38:00 Brooklyn Brown bb@domain.com 234 Lake Rd Baytown CA 54321 2.00
4/28/2022 01:41:00 Brooklyn Brown brook@acme.com 456 Ace Ave Bigtown NY 23456 4.00

Is such a query possible in Google Sheets?

marikamitsos
  • 10,264
  • 20
  • 26
dhalsim2
  • 936
  • 2
  • 12
  • 35

2 Answers2

1

Use this formula

=QUERY({QUERY(A1:I, " Select max(A),min(B),min(C),min(D),min(E),min(F),min(G),min(H),min(I) Group by B,C,E,F,G,H ", 1)},
                    " Select *  Where Col1 is Not null ")

enter image description here

Osm
  • 2,699
  • 2
  • 4
  • 26
  • This isn't quite it. In the example table, I have three unique individuals, as determined by name and address; two happen to have the same name. I'd like to get the most recent record (including email and amount) for each of the three unique individuals. I've found that if I use `=QUERY(A1:I, "select max(A),B,C,E,F,G,H group by B,C,E,F,G,H", 1)`, then I properly reduce the data down to the proper set of individuals, but I don't know how to then join that data with the most recent record for each person. – dhalsim2 Sep 19 '22 at 15:34
0

I believe that this is the formula you need:

=ARRAY_CONSTRAIN(SORTN(SORT(
     QUERY({A1:I9,INDEX(IFERROR(REGEXEXTRACT(D1:D9,"(\D+)@")))},
                "where Col2 is not null"),
                                10,1,1,0),9^9,2,10,1),9^9,9)

(Do adjust the formula according to your ranges and locale)

enter image description here

For the formula to work we create the helper column
INDEX(IFERROR(REGEXEXTRACT(D1:D9,"(\D+)@"))).
We also use 9^9 which equals to 387420489 rows, making sure that all rows are included in our sorting calculations.
Finally in our ARRAY_CONSTRAIN function we return the first 9 columns discarding the 10th helper column.

Functions used:

marikamitsos
  • 10,264
  • 20
  • 26