I am using Double-Metaphone for fuzzy searching within my database. I have a table of names, and both the first and last names have double metaphone entries already created (and updated, via a Trigger). In my application, I am allowing the user to search by Lastname and/or Firstname.
What is the best way to query the database, to get the best results from the Double-Metaphone indexes when dealing with both last AND first names ? Querying just based on lastname is easy - generate the DM tags and query the database. It's when querying by both first and last that I'd like to get some fine tuning.
The database layout is similar to the following:
tblName
FirstName
LastName
MetaPhoneFN1
MetaPhoneFN2
MetaPhoneLN1
MetaPhoneLN2
Application: [Lastname] [FirstName]
User inputs just a lastname, or a combination of lastname + [First initial, first name, part of first name].
Lastname: SMITH
FirstName: J or Jo or John or Johnathan
If I pass in "J" as the firstname - I'd like all name entries matching "J%".
If I pass in "JO" as the firstname - I'd like all name entries matching "JO%".
If I pass in "JOHN" or "JOHNATHAN" as the firstname - I'd like to use DM
or maybe also "JOHN%" ?
I'm really open to suggestions here, for the firstname. I want the results to be as good as possible and return what the user wants.
What is the best way to query the database for last + any of those combinations of first name ? Here's a sample of what I've gotten so far.. and I'm not completely thrilled with the results:
SELECT *
FROM tblName
WHERE
--There will always be a last name
(MetaPhoneLN1 = @paramMetaPhoneLN1
OR (CASE WHEN @paramMetaPhoneLN2 IS NOT NULL AND MetaPhoneLN2 = @paramMetaPhoneLN2 THEN 1
WHEN @paramMetaPhoneLN2 IS NULL THEN 0
END) = 1)
-- Match Firstname 1
AND (CASE WHEN @paramMetaPhoneFN1 IS NULL THEN 1
WHEN @paramMetaPhoneFN1 IS NOT NULL AND MetaPhoneFN1 = @paramMetaPhoneFN1 THEN 1
WHEN LEN(@paramMetaPhoneFN1) > 1 AND LEN(@paramMetaPhoneFN1) < 4 AND MetaPhoneFN1 LIKE @paramMetaPhoneFN1 + '%' THEN 1
WHEN LEN(@paramMetaPhoneFN1) = 1 THEN 1
END) = 1
-- Match Firstname 2
AND (CASE WHEN @paramMetaPhoneFN2 IS NULL THEN 1
WHEN @paramMetaPhoneFN2 IS NOT NULL AND MetaPhoneFN2 = @paramMetaPhoneFN2 THEN 1
WHEN LEN(@paramMetaPhoneFN2) > 1 AND LEN(@paramMetaPhoneFN2) < 4 AND MetaPhoneFN2 LIKE @paramMetaPhoneFN2 + '%' THEN 1
WHEN LEN(@paramMetaPhoneFN2) = 1 THEN 1
--ELSE 0
END) = 1
AND (CASE WHEN @paramFirstName IS NULL THEN 1
WHEN FirstName LIKE @paramFirstName + '%' THEN 1
--WHEN LEN(@paramMetaPhoneFN1) = 1 AND @paramFirstName IS NOT NULL AND LEN(@paramFirstName) > 1 AND FirstName LIKE @paramFirstName + '%' THEN 1
--ELSE 1
END) = 1
What I've tried to do is account for the different variations for firstname. My results however, aren't exactly what I would want.
I've been able to find lots of implementations of Double Metaphone in SQL/C#, etc. for /generating/ the Double-Metaphone values, but nothing on how to actually query the database effectively once you have those values.
SUMMARY:
When I search by both lastname and firstname -- I'd like to query the database for the Double Metaphone match only on Lastname, but I'd like a lot of flexibility when a firstname is also passed in.. first initial ? sounds like ? etc. I am open to suggestions and SQL examples!
UPDATE 1: When I say that I'm not thrilled with the results.. what I'm saying is that I'm not sure how to formulate the Firstname part of the query, to maximize results. If I search for "WILL" - what results should be returned ? WILLIAM, WILL, WILBERT .. but not WALKER - though with what I have here, WALKER would be returned because WILL -> FL and WALKER IS [FLKR] but WILLIAM IS [FLM]. If I do only DM = DM then I wouldn't get WILLIAM even returned, which is why I'm doing a LIKE in the first place, if the DM length is < 4.
Basically, I'd like to know if anyone else has run into this issue, and see what solutions others have come up with.
First initial only - should show all firstnames starting with that initial - Here's where I'm uncertain: Partial name - should should all firstnames starting with the partial ? [how do you know if it's just a partial name ?!] Full name - should use DM ?