11

Does anybody know if it's possible to do a %LIKE% search against a column in a SQL Server database but get it to ignore any special characters in the column?

So, for example if I have a column called "songs" and they contain the following...


Black Or White

No Sleep 'till Brooklyn

The Ship Song

Papa Don't Preach


If the user searches for "no sleey till brooklyn" then I would like it to return a match even though they forgot to include the apostrophe. I would also like it to return the 4th row if they search for "SOUL". I'm sure you get the idea....

Any help would really be appreciated.

jonhobbs
  • 26,684
  • 35
  • 115
  • 170

4 Answers4

14

I would look into using a Full Text Index and then you can use the power of FREETEXT and CONTAINS to do your search.

EDIT: I would still look into refining the Full Text Index searching, however, to follow on from another answer, this is an option using REPLACE.

SELECT
    Artist,
    Title
FROM
    Songs
WHERE
    REPLACE(REPLACE(REPLACE(Artist, '#',''), '*', ''), '"', '') LIKE '%Keywords%'
Robin Day
  • 100,552
  • 23
  • 116
  • 167
  • Dang, beat me to it. This really sounds like the way to go, IMO. In fact I'm facing a similar situation right now and I'm just about to switch to full-text searching. – Steve Wortham Aug 17 '09 at 17:19
  • Unfortunately, the reason I am investigating the possibility of writing our own queries for this is that we currently use full-text indexing and the client doesn't like it because they say it returns too many irrelevant results! – jonhobbs Aug 17 '09 at 17:28
  • @Jon, have you looked into this? http://msdn.microsoft.com/en-us/library/cc879245.aspx – Steve Wortham Aug 17 '09 at 18:00
2

You will have various characters to remove. Single quotes, double quotes, hyphens, dots, commas, etc.

You can use Regular expressions in your where clause and do a match on the clean value. Read more about regex within SQL here.

As for the art where you want to return the 4th row for SOUL.. you will need a a data structure to tag songs and you will have to search on the tags for the match. I'm afraid we will need more details on your data structure for that.

Raj More
  • 47,048
  • 33
  • 131
  • 198
  • I am terrible at using Regex's but may be able to find a regex on line which removes special characters. Thanks – jonhobbs Aug 17 '09 at 17:30
1

This is an old question but I just stumbled upon it and am also working with song titles and want to expand upon the accepted answer that uses REPLACE. You can create a list of the characters you want to ignore and create a simple function in any language to generate the quick'n'dirty never-ending REPLACE lines. For example, in Python:

def sanitize(db_field):
    special_chars = ['•', '"', "\\'", '*', ',']
    sanitized = "REPLACE({}, '{}', '')".format(db_field, special_chars.pop(0))
    for s in special_chars:
       sanitized = "REPLACE({}, '{}', '')".format(sanitized, s)
    return sanitized

A call such as sanitize("name") will return

REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(name, '•', ''), '"', ''), '\'', ''), '*', ''), ',', '')

which can be used in your query. Just wrote this so hope it helps someone.

Neil Menon
  • 169
  • 1
  • 5
0

Use a combination of TRANSLATE, UPPER, and TRIM.

mcandre
  • 22,868
  • 20
  • 88
  • 147
  • I just did a search for these functions but can't find any information. – jonhobbs Aug 17 '09 at 17:29
  • The SQL Server equivalents are UPPER, LTRIM, RTRIM. There isn't a TRANSLATE function but you can use multiple REPLACE functions for each of your special characters. – Robin Day Aug 17 '09 at 17:33
  • So, can I use a REPLACE on a column name, like this.... SELECT Artist, Title FROM Songs WHERE Artist.REPLACE('#','').REPLACE('*','') LIKE '%Keywords%'; – jonhobbs Aug 17 '09 at 17:38