1

SQL or R Solution needed (prefer SQL).

I have 3 datasets, all with a common identifier. After appending Table 2 and Table 3 to Table 1 using FROM

Table 1

Common ID District Floor
A001 East High
A002 South Low
A003 West Med
A004 North High
A005 East High
A006 South Low
A007 West Med
A008 North High
A009 West High

Table 2

Common ID Last Enquiry Date
A001 31/01/2022
A001 18/07/2022
A002 20/05/2021
A002 20/05/2020
A002 20/05/2022
A003
A004 13/05/2022
A005 18/08/2021
A005 18/08/2020
A006
A007
A008 27/01/2021
A009 14/02/2020

Table 3

Common ID Last Completion Date
A001
A002 29/11/2021
A002 29/11/2020
A002 20/05/2022
A003
A004 13/05/2022
A005 22/08/2020
A006 21/07/2019
A007 24/04/2018
A007 24/04/2019
A007 24/04/2021
A008
A009 15/05/2021
A009 15/12/2021

In SQL, tried the solution here but missed the empty rows. If I replace Inner Join with Full Outer Join, the max.date effect will be gone.

how do I query sql for a latest record date for each user

Ideal output: | Common ID | District | Floor | Last Enquiry Date | Last Completion Date| | -------- | -------- | ------| --------------- | --------------- | | A001 | East | High | 18/07/2022 | | | A002 | South | Low | 20/05/2022 | 29/11/2021 | | A003 | West | Med | | | | A004 | North | High | 13/05/2022 | 13/05/2022 | | A005 | East | High | 18/08/2021 | 22/08/2020 | | A006 | South | Low | | 21/07/2019 | | A007 | West | Med | | 24/04/2021 | | A008 | North | High | 27/01/2021 | | | A009 | West | High | 14/02/2020 | 15/12/2021 |

1 Answers1

0
SELECT t1.*
FROM table_1 t1
INNER JOIN (
  SELECT id, MAX(last_enquiry_date) AS max_last_enquiry_date, 
  MAX(last_completion_date) AS max_last_completion_date
  FROM table_1
  WHERE last_enquiry_date IS NOT NULL OR last_completion_date IS NOT NULL
  GROUP BY id
) t2 ON t1.id = t2.id AND t1.last_enquiry_date = t2.max_last_enquiry_date AND t1.last_completion_date = t2.max_last_completion_date;

Assuming you have merged your tables, you first select everything from said table, you then create an inner join where you only select entries that have the latest dates and where both columns are not empty and group them by their id. Finally you filter your initial selection by only keeping entries that match the criteria of your inner join. Note, if you use AND instead of OR, columns where either one is empty will not be selected, the AND statement returns 0 if at least one of the conditions returns 0, the OR statement only returns 0 if both conditions return 0.

Sven Kuffer
  • 61
  • 1
  • 5
  • Hello, thanks for your prompt response. I tried your solution but it will not include the row where last_enquiry_date and/or last_completion_date is empty. I tried /*WHERE last_enquiry_date IS NOT NULL OR last_completion_date IS NOT NULL */ and use 2 OR to replace the 2 AND at the last part. But this would remove the effect of max date. – walkinglemon Mar 22 '23 at 03:53
  • I would like the table to show for every id i have in Table 1, if it is empty in last_enquiry_date and/or last_completion_date, just return empty; if there is only 1 record in last_enquiry_date and/or last_completion_date, just return that 1; if there are multiple records in last_enquiry_date and/or last_completion_date, return the latest date that record only – walkinglemon Mar 22 '23 at 03:58