
SQL Server Full-Text Search uses language-specific word breakers.

For German, the word breaker splits text into words and also breaks compound words into their parts. However, it appears that not every known compound word is handled by the word breaker. I would like to know whether a list is available of the words the word breaker knows about.

Coolcoder

2 Answers


In SQL Server 2008 this works. The language_id I put here is for German; I wanted to see the same thing, but for Spanish.

-- List the system stop words for German (language_id 1031)
SELECT * FROM sys.fulltext_system_stopwords
WHERE language_id = 1031;

Edit: In SQL Server 2005 the words are stored in noise-word files under "$SQL_Server_Install_Path\Microsoft SQL Server\MSSQL.1\MSSQL\FTDATA\". If you edit a noise-word file, you have to repopulate the full-text index.

Alan Featherston
  • Stop words are the new "noise words" in 2008. Effectively, these are the words that are excluded from full-text search. What I want to know is which words Full-Text Search knows how to break up. – Coolcoder Dec 03 '08 at 13:12
  • Specifically, in German, they have compound words - the Word Breaker appears to break some words but not others. So I would like to know which words it "knows" about. – Coolcoder Dec 03 '08 at 13:13

The answer is that there is no answer. According to Microsoft, the words are not stored in a list; the word breaker uses a formula to "break" them. This will never be 100% accurate, so I will just have to live with that.
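
Even without a list, it should be possible to check how an individual word is broken. A minimal sketch, assuming SQL Server 2008 or later, where the sys.dm_fts_parser function is available (as far as I know it requires sysadmin membership); the sample compound word is only an illustration, not one taken from the question:

```sql
-- Show the terms the German word breaker (LCID 1031) produces for one word.
-- Arguments: query string, LCID, stoplist_id (0 = system stoplist),
-- accent_sensitivity (0 = insensitive).
-- The compound word below is a hypothetical test input.
SELECT display_term, special_term, occurrence, source_term
FROM sys.dm_fts_parser(N'"Lebensversicherungsgesellschaft"', 1031, 0, 0);
```

If the result contains only the original word, the breaker did not decompose it; additional rows would indicate the parts it recognised. Running this against the compound words you care about is a way to test them one at a time rather than relying on a list.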

Coolcoder