I am attempting to write a query that finds when a Name changes but a Customer ID does not. To do this I used
Group by CustID, Name
Then counted Customer IDs, where any result greater than 1 is important. Here is the query I have so far:
Select
CustID,
NumDiffNames
from
(select
CustID,
count(CustID) NumDiffNames
from
(Select
CustID,
Name
from
(select
CustID,
concat(rtrim(ltrim(FIRSTNAME)),rtrim(ltrim(LASTNAME))) Name
FROM
[Branch]
union
select
coCustID as CustID,
concat(rtrim(ltrim(COFNAME)),rtrim(ltrim(COLNAME))) Name
FROM
[Branch]) a
group by CustID, Name) b
group by CustID) c
where NumDiffNames>1
However, I am getting many false positives. I removed a large number of them by trimming the whitespace around each name. Another set of false positives is formed by data entry and misspellings. For example in one case the customers name is spelled Vanessa, and for the same Customer ID, the name is spelled Venessa. Or a name is spelled correctly in one case, but two letters are transposed in another John & Jhon for example.
Is it possible to have groups with more variance, like one or two character differences?