4

I have a question what's different between "blah" and "blah*" in the containstable? below is my context.

create table testWildcard(pk int identity constraint twpk primary key, displayName varchar(100))
GO
create fulltext catalog test12 as default
GO
create fulltext index on testWildcard(displayName) key index twpk
GO
insert into testWildcard(displayName) values('blahBlahBlah')
insert into testWildcard(displayName) values('blah<bold>Blah</bold>Blah')
insert into testWildcard(displayName) values('blah Blah Blah')
GO

when use "blah*" as keyword to search, the result is 3 records.

select * from testWildcard t 
inner join CONTAINSTABLE(testWildcard, *, '"blah*"', LANGUAGE 1033) as w on t.pk=w.[Key]

    results:
            pk  displayName                  Key RANK 
            1   blahBlahBlah                 1  16
            2   blah<bold>Blah</bold>Blah    2  48
            3   blah Blah Blah               3  48

when use "blah" as keyword to search, the result is 2 records.

select * from testWildcard t 
inner join CONTAINSTABLE(testWildcard, *, '"blah"', LANGUAGE 1033) as w on t.pk=w.[Key]

resuts:
            pk  displayName                  Key RANK 
            2   blah<bold>Blah</bold>Blah    2  48
            3   blah Blah Blah               3  48

Then I use system function sys.dm_fts_parser to analyse the keywords, they don't have any difference. I google it but I still cannot find the proper answer.

SELECT * FROM sys.dm_fts_parser ('"blah"', 1033, 0, 0) 
SELECT * FROM sys.dm_fts_parser ('"blah*"', 1033, 0, 0)  
result:
keyword             group_id    phrase_id  occurrence   special_term  display_term expansion_type source_term
 0x0062006C00610068 1           0          1            Exact Match        blah       0             blah

Thank you.

David Tansey
  • 5,813
  • 4
  • 35
  • 51
Dino Liu
  • 500
  • 4
  • 17

2 Answers2

2

Star(*) in FTS is used for wild card the same concept as normal query.

So when you search blah, it search the word comes with blah. When you use blah*, it search the word starting with blah whatever written after that.

If you search with *blah, it search where the end word blah.

Now your question is why blahBlahBlah entry not comes in first search, the reason as per msdn it is not any of this: (means FTS only search which is known word in simple term, not any word which we create)

CONTAINS can search for:

A word or phrase.

The prefix of a word or phrase.

A word near another word.

A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).

A word that is a synonym of another word using a thesaurus (for example, the word "metal" can have synonyms such as "aluminum" and "steel").

https://technet.microsoft.com/en-us/library/cc879300%28v=sql.105%29.aspx

How do you get leading wildcard full-text searches to work in SQL Server?

Community
  • 1
  • 1
Ajay2707
  • 5,690
  • 6
  • 40
  • 58
  • So does this mean Full-Text is not a good choice for text containing weird (arbitrary) product names? – Tim Apr 24 '18 at 10:37
  • 1
    yes, ideally it search dictionary word. above comment from MSDN, where fts search on phrase. – Ajay2707 Apr 24 '18 at 10:48
  • 1
    As you can see in the **More Info** part of your second offered link, `*blah` doesn't work at all while `blah*` does. – Muhammad Musavi Aug 18 '18 at 08:19
0

If you look at the list of contents of the index using sys.dm_fts_index_keywords

SELECT display_term, column_id, document_count
FROM sys.dm_fts_index_keywords(DB_ID('DatabaseName'), OBJECT_ID('testWildcard'))

You'll see еркуу words: blah, blahblahblah, bold.

That means that space, < and > "break" a word when creating the full-text index.

Now, when you search by "blah" CONTAINS returns only strings that contain word blah exactly.
And when you add wildcard * and so search by "blah*" CONTAINS returns all strings that contain words, beginning with blah.

Igor Borisenko
  • 3,806
  • 3
  • 34
  • 49