SELECT *
FROM epc e
INNER JOIN epc max_
ON e.ADDRESS1 = max_.ADDRESS1
AND e.POSTCODE = max_.POSTCODE
AND e.INSPECTION_DATE < max_.INSPECTION_DATE;
I have a table with 13 million rows. Without the AND e.INSPECTION_DATE < max_.INSPECTION_DATE
it returns really quickly, but with it, very slow.
I've created index on INSPECTION_DATE and also a composite index on ADDRESS1, POSTCODE and INSPECTION_DATE to see if any of those would work, but it's still slow. I'm sure I left it one time and it returned "0 row(s) returned" but in sqlfiddle it seemed to remove duplicates and keep most recent confused.
My main objective is to delete all duplicate rows from the database where ADDRESS1 and POSTCODE are identical, while leaving the most recent row based on INSPECTION_DATE in the table.