Fuzzy match column to column

Question

I'm trying to find a way to match a column of clean data in table 1 to a column of dirty data in table2 without making any changes to the dirty data. I was thinking a fuzzy match, but there are too many entries in the clean table to allow for CDEs to be used. So, for example:

Table 1
GroupID    CompanyName
123        CompanyA
445        CompanyB
556        CompanyC

Table 2
GroupID    Patientname
AE123789   PatientA
123987     PatientB
445111     PatientC

And I'm trying to match the insurance company to the patient using the group number. Is there a matching method out there? (Fortunately the group numbers are actually much longer and when looking for a single group's worth of patients, fuzzy matching works really well, so they seem to be unique enough to be applied here).

Working in SQL server 2008.

What kind of fuzziness do you expect? I'd suppose that `table_2.GroupID` should end with a value of `table_1.GroupID`. — 9000, May 12 '14 at 19:32
It's occurring to me that I'm perhaps using match and join interchangeably, which is wrong and confusing to others. I need the final table to be Table2.PatientName, Table1.Company name. So in a perfect world, I'd be able to do something like: Join table2 on table1.groupid like table2.%groupid% but that isn't a real syntax. — user3629815, May 12 '14 at 19:40

score 3 · Accepted Answer · answered May 12 '14 at 19:39

3

This changes slightly depending on which database you are using, but it looks like you're looking for something like this:

MSSQL

select *
from table1 t1
  join table2 t2 on t2.groupid like '%'+cast(t1.groupid as varchar(max))+'%'

SQL Fiddle Demo

MySQL - use Concat():

select *
from table1 t1
  join table2 t2 on t2.groupid like concat('%',t1.groupid,'%')

answered May 12 '14 at 19:39

sgeddes

62,311
6
61
83

@user3629815 -- np, glad I could help. – sgeddes May 12 '14 at 19:47

Fuzzy match column to column

1 Answers1