2

I'm trying to find a way to match a column of clean data in table 1 to a column of dirty data in table2 without making any changes to the dirty data. I was thinking a fuzzy match, but there are too many entries in the clean table to allow for CDEs to be used. So, for example:

Table 1
GroupID    CompanyName
123        CompanyA
445        CompanyB
556        CompanyC

Table 2
GroupID    Patientname
AE123789   PatientA
123987     PatientB
445111     PatientC

And I'm trying to match the insurance company to the patient using the group number. Is there a matching method out there? (Fortunately the group numbers are actually much longer and when looking for a single group's worth of patients, fuzzy matching works really well, so they seem to be unique enough to be applied here).

Working in SQL server 2008.

user3629815
  • 37
  • 1
  • 6
  • What kind of fuzziness do you expect? I'd suppose that `table_2.GroupID` should end with a value of `table_1.GroupID`. – 9000 May 12 '14 at 19:32
  • It's occurring to me that I'm perhaps using match and join interchangeably, which is wrong and confusing to others. I need the final table to be Table2.PatientName, Table1.Company name. So in a perfect world, I'd be able to do something like: Join table2 on table1.groupid like table2.%groupid% but that isn't a real syntax. – user3629815 May 12 '14 at 19:40

1 Answers1

3

This changes slightly depending on which database you are using, but it looks like you're looking for something like this:

MSSQL

select *
from table1 t1
  join table2 t2 on t2.groupid like '%'+cast(t1.groupid as varchar(max))+'%' 

MySQL - use Concat():

select *
from table1 t1
  join table2 t2 on t2.groupid like concat('%',t1.groupid,'%') 
sgeddes
  • 62,311
  • 6
  • 61
  • 83