When a new customer contacts us, they are allocated a reference number.
Unfortunately our contact centre sometimes logs the same person without checking if they have contacted us before and the customer ends up with two reference numbers. We want to cleanse this, so:
I would like to output instances where the customer's surname, address1 and zipcode are duplicated but only if the customer has different reference numbers.
This is the type of data that I'd like to see output:
Ref LastName Address 1 Zip 1875 Faulkner 10 Smith Street 08540 1876 Faulkner 10 Smith Street 08540
I have tried a few ideas, the latest being (forgive the huge amount of code here):
with Duplicates as
(
select r.LastName
, a.Address1
, a.ZipCode
, COUNT(*) as DuplicateCount
FROM Reference r
INNER JOIN Address a ON a.ReferenceNumber = r.ReferenceNumber
LEFT OUTER JOIN Telephone t ON r.ReferenceNumber = t.ReferenceNumber
LEFT OUTER Join Email e ON r.ReferenceNumber = e.ReferenceNumber
group by r.LastName
, a.Address1
, a.ZipCode
having COUNT(*) > 1
)
SELECT
r.ReferenceNumber
, r.LastName
, r.FirstName
,a.ReferenceNumber
, a.Address1
, a.Address2
, a.Address3
, a.Address4
, a.ZipCode
,t.ReferenceNumber
, t.TelephoneNumber
,e.ReferenceNumber
, e.EmailAddress
, d.DuplicateCount
FROM Reference r
INNER JOIN Address a ON a.ReferenceNumber = r.ReferenceNumber
LEFT OUTER JOIN Telephone t ON r.ReferenceNumber = t.ReferenceNumber
LEFT OUTER Join Email e ON r.ReferenceNumber = e.ReferenceNumber
join Duplicates d on d.LastName = r.LastName
AND d.Address1 = a.Address1
AND d.ZipCode = a.ZipCode;
Unfortunately this returns all duplicates, not those with the same surname, address1 and zipcode and different reference numbers. Do you have any advice on how I can achieve this?
Many thanks.