8

I have MyTable with a Column Message NVARCHAR(MAX).

Record with ID 1 contains the Message '0123456789333444 Test'

When I run the following query

DECLARE @Keyword NVARCHAR(100)

SET @Keyword = '0123456789000001*'

SELECT *
FROM MyTable
WHERE CONTAINS(Message, @Keyword) 

Record ID 1 is showing up in the results and in my opinion it should not because 0123456789333444 does not contains 0123456789000001.

Can someone explain why the records is showing up anyway?

EDIT

select * from sys.dm_fts_parser('"0123456789333444 Test"',1033,0,0)

returns the following:

group_id phrase_id occurrence special_term  display_term        expansion_type source_term
1        0         1           Exact Match  0123456789333444    0              0123456789333444 Test
1        0         1           Exact Match  nn0123456789333444  0              0123456789333444 Test
1        0         2           Exact Match  test                0              0123456789333444 Test
BenMorel
  • 34,448
  • 50
  • 182
  • 322
gsharp
  • 27,557
  • 22
  • 88
  • 134
  • 1
    This looks very similar to [Contains() function falters with strings of numbers?](http://stackoverflow.com/questions/8011044/contains-function-falters-with-strings-of-numbers) – Eric J. Price Nov 11 '13 at 16:47
  • I'm unable to reproduce this in SQL Server 2012, with and without a stoplist. Which version are you running, what's your stoplist (I know you said it's not a factor but still I'm curious), and what is the full Message value which contains "0123456789333444"? – Keith Nov 18 '13 at 20:18
  • Hi @Keith and thanks for having a look at it. I'm using SQL Server 2008 R2. The Full Message is like above: '0123456789333444 Test'. What you mean by "what's your stoplist" ? Thanks Giuseppe – gsharp Nov 19 '13 at 15:30
  • @gsharp In SSMS right-click the table, select Full-Text Index > Properties, and let us know what the value of "Full-Text Index Stoplist" is. – Keith Nov 19 '13 at 15:56
  • I'm starting to think this is an issue with SS 2008 since I'm unable to reproduce this in SS 2012, and the thread @Love2Learn linked to also concerns SS 2008. Unfortunately I can't verify this since I don't have SS 2008 installed. – Keith Nov 20 '13 at 16:13
  • Please report output of this query: `select * from sys.dm_fts_parser('"0123456789333444 Test"',1033,0,0)` – Stoleg Nov 20 '13 at 16:44
  • @Stoleg updated the questions with the result. thanks for your help. – gsharp Nov 21 '13 at 12:04

3 Answers3

1

This is because the @Keyword is not wrapped in double quotes. Which forces zero, one, or more matches.

Specifies a match of words or phrases beginning with the specified text. Enclose a prefix term in double quotation marks ("") and add an asterisk () before the ending quotation mark, so that all text starting with the simple term specified before the asterisk is matched. The clause should be specified this way: CONTAINS (column, '"text"'). The asterisk matches zero, one, or more characters (of the root word or words in the word or phrase). If the text and asterisk are not delimited by double quotation marks, so the predicate reads CONTAINS (column, 'text*'), full-text search considers the asterisk as a character and searches for exact matches to text*. The full-text engine will not find words with the asterisk (*) character because word breakers typically ignore such characters.

When is a phrase, each word contained in the phrase is considered to be a separate prefix. Therefore, a query specifying a prefix term of "local wine*" matches any rows with the text of "local winery", "locally wined and dined", and so on.

Have a look at the MSDN on the topic. MSDN

Matt
  • 1,441
  • 1
  • 15
  • 29
  • Thanks for your answer. I've tried to use SET @Keyword = '"0123456789000001*"' but the result is unfortunately the same. – gsharp Nov 12 '13 at 09:01
  • Could you please post your DDL for the FullText Index creation. I'd like to take a look at that if possible. – Matt Nov 12 '13 at 15:57
  • It's no big magic: ALTER FULLTEXT INDEX ON [dbo].[MyTable] ADD ([Message]) – gsharp Nov 13 '13 at 10:33
  • Can you please try removing the asterisk. I re-read the MSDN documentation again today and I consulted my Microsoft Exam book on the topic too. I'd like to see what the behavior is without it. I don't have a box I can run fulltext tests for you on otherwise I would be doing that. Hopefully the comments are helping you get at a solution. This is one of those good problems that stumps you for a bit. So it would be : SELECT...WHERE CONTAINS(Message, @Keyword). The @Keyword would be set to '"0123456789000001"'. Notice no esterisk which from the snippet above states "zero, one, or more". – Matt Nov 13 '13 at 17:57
  • Does it work if you use this string directly, without variable? – Stoleg Nov 20 '13 at 16:47
1

Have you tried to query the following view to see what's on the system stoplist?

select * from sys.fulltext_system_stopwords where language_id = 1033;
NickyvV
  • 1,720
  • 2
  • 16
  • 18
1

Found a solution that works. I've added language 1033 as an additional parameter.

SELECT * FROM MyTable WHERE CONTAINS(Message, @Keyword, langauge 1033) 
gsharp
  • 27,557
  • 22
  • 88
  • 134