2

I'm using full-text search to search through a database of usernames (these are mostly email addresses, but not all). However, since the '_' is not considered a word breaking character, it is difficult to search for certain users.

e.g. Searching for "pancakes*" will return results for "bobby-pancakes" or "bobby@pancakes.com", but not "_bobby_robert_pancakes@email.com". I need a search that will match all of these.

To fix this, I want to treat underscores as word breaking characters like spaces or hyphens. Is there a way to update which characters break words in full text search?

Why not use the LIKE operator?

Unfortunately, we have a large database of usernames, and it's just too slow.

What about a custom dictionary?

As far as I can tell, custom dictionaries allow you to stop the word breaker from breaking certain combinations of characters, but don't allow you to introduce new breaking characters.

What about a custom word breaker?

I can't seem to find any documentation on how to create one of these.

Jacob Soderlund
  • 345
  • 1
  • 9

1 Answers1

0

You have to define your own language to add in custom word breakers, per the documentation here: https://learn.microsoft.com/en-us/sql/relational-databases/search/configure-and-manage-word-breakers-and-stemmers-for-search?view=sql-server-2017

Whilst this will give you ultimate flexibility on how your search terms are handled, it also lays the responsibility of keeping this up to date at your door too. As SQL Server upgrades, you will slowly fall behind in terms of linguistic logic, but if your requirement is fairly simple, this shouldn't be too much of an issue.


I thoroughly recommend you actually read the full article and the linked documentation to actually understand the implications of your use and modification of the Full Text Search capabilities.

If you don't want to do that however; within the linked article above, you will find a section called If you add or remove a word breaker. Within this section is a link to View or Change Registered Filters and Word Breakers, which explains how you can install a new language into your SQL Server instance. Along the left sidebar, you will find a multitude of articles regarding Full Text Search configuration, including how to create a Custom Dictionary for a language.

iamdave
  • 12,023
  • 3
  • 24
  • 53
  • 1
    Defining a custom language would be perfect. Unfortunately, the documentation you've linked (as well as the other documentation I've found) don't seem to describe how to do this – Jacob Soderlund Jul 23 '19 at 01:57
  • @JacobSoderlund I would recommend reading the docs more thoroughly in the future. I have updated my answer with specifics. – iamdave Jul 24 '19 at 08:07
  • 1
    Thank you for that. I still can't seem to find any mention of how to create or modify filters/word breakers/languages. The closest I could find were references to a DLL that needs to be installed, so I assume this would involve creating a DLL of some sort. Looks like this is more trouble than it's worth. – Jacob Soderlund Jul 25 '19 at 02:22