3

Where can I find the list of default word breakers for English in SQL Server full-text search?

trs

3 Answers

0

The word breakers are the neutral ones (white space and punctuation) plus locale-specific values, so the exact list depends on which English locale is in use.

See http://technet.microsoft.com/en-us/library/ms142509(v=sql.100).aspx
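To see the locale dependence in practice, a rough sketch like the following could compare how two English word breakers tokenize the same phrase. It assumes that LCIDs 1033 (English) and 2057 (British English) are both registered on the instance, that the caller is a member of the sysadmin role (sys.dm_fts_parser requires it), and the sample phrase is purely illustrative.

```sql
-- Assumption: LCIDs 1033 and 2057 both appear in sys.fulltext_languages.
-- The phrase is a made-up sample; the double quotes make it one phrase query.
DECLARE @phrase nvarchar(200) = N'"state-of-the-art e-mail, co-operate"';

-- Arguments: query string, LCID, stoplist id (0 = system stoplist),
-- accent sensitivity (0 = insensitive).
SELECT 1033 AS lcid, occurrence, display_term, special_term
FROM sys.dm_fts_parser(@phrase, 1033, 0, 0)
UNION ALL
SELECT 2057, occurrence, display_term, special_term
FROM sys.dm_fts_parser(@phrase, 2057, 0, 0)
ORDER BY lcid, occurrence;
```

Any rows that differ between the two LCIDs would point to locale-specific breaking rules rather than the neutral ones.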

dfrevert
0

The list of languages that have word breakers associated with them can be obtained by running the following query:

SELECT * FROM sys.fulltext_languages; 
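To narrow that list down to English, a filter along these lines might help. It is only a sketch and assumes the English entries contain "English" in the name column (for example "English" and "British English" on a default installation).

```sql
-- Assumption: English variants are identifiable by their name.
SELECT lcid, name
FROM sys.fulltext_languages
WHERE name LIKE N'%English%'
ORDER BY lcid;
```

The lcid values returned here are the ones to pass to other full-text functions that take a language argument.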

I am not sure whether there is a stored procedure or an internal table that shows the .dll file associated with each language, but it can be looked up under the following registry key:

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\{SQL Instance Name}\MSSearch\CLSID\

The language mappings for each CLSID are stored in MSSearch\Language.

aks
  • I believe the SP you're looking for which shows the associated .dll is EXEC sp_help_fulltext_system_components 'wordbreaker'; – godel Oct 03 '19 at 20:07
0

With the dynamic management function sys.dm_fts_parser you can test given strings against the word breaker. The following script tests every character from char(32) to char(255), which goes beyond the ASCII range, and prints the characters that the English (LCID 1033) word breaker currently treats as breaking characters.

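-- Requires sysadmin membership; CONCAT needs SQL Server 2012 or later.
-- Characters above 127 depend on the database collation's code page.
declare @i integer
declare @cnt integer
set @i=32
while @i<=255  -- inclusive, so that char(255) is tested as well
begin
  set @cnt=0
  -- 1033 = English LCID, 0 = system stoplist, 0 = accent insensitive
  select @cnt=COUNT(1) FROM sys.dm_fts_parser ('"word1'+CHAR(@i)+'word2"', 1033, 0, 0)
  -- more than one row means the character split the string into several terms
  if @cnt>1
  begin
    print CONCAT('ASCII ', @i, ': ', char(@i))
  end
  set @i=@i+1
end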

Result:

ASCII 32:  
ASCII 33: !
ASCII 34: "
ASCII 35: #
ASCII 36: $
ASCII 37: %
ASCII 38: &
ASCII 40: (
ASCII 41: )
ASCII 42: *
ASCII 43: +
... and so on ...

Source: https://stuart-moore.com/generating-a-list-of-full-text-word-breakers-for-sql-server/

djk