3

I do not understand why following Query1 and Query2 don't return similar result sets.

Samples

Query1:

select * from dbo.tFTS_test where contains (*, '"qwe-asd*"')

Returns:

Id          Value
----------- ----------

(0 row(s) affected)

Query2:

select * from dbo.tFTS_test where contains (*, '"qwe-asd"')

Returns:

Id          Value
----------- ----------
Informational: The full-text search condition contained noise word(s).
1           qwe-asd

(1 row(s) affected)

Table:

select * from dbo.tFTS_test

Returns:

Id          Value
----------- ----------
1           qwe-asd

(1 row(s) affected)

Here are some helper queries

Helper Query1:

select * from sys.dm_fts_index_keywords_by_document (db_id(), object_id('dbo.tFTS_test'))

Returns:

keyword                         display_term  column_id   document_id          occurrence_count
------------------------------- ------------- ----------- -------------------- ----------------
0x007100770065                  qwe           2           1                    1
0x007100770065002D006100730064  qwe-asd       2           1                    1
0xFF                            END OF FILE   2           1                    1

(3 row(s) affected)

Helper Query2:

select p.*
from sys.fulltext_stoplists s
cross apply sys.dm_fts_parser ('"qwe-asd"', 1033, s.stoplist_id, 0) p
where s.name = 'FTS_test_stoplist'

Returns:

keyword                         group_id    phrase_id   occurrence  special_term     display_term  expansion_type source_term
------------------------------- ----------- ----------- ----------- ---------------- ------------- -------------- -----------
0x007100770065002D006100730064  1           0           1           Exact Match      qwe-asd       0              qwe-asd
0x007100770065                  1           0           1           Exact Match      qwe           0              qwe-asd
0x006100730064                  1           0           2           Noise Word       asd           0              qwe-asd

(3 row(s) affected)

Helper Query3:

select p.*
from sys.fulltext_stoplists s
cross apply sys.dm_fts_parser ('"qwe-asd*"', 1033, s.stoplist_id, 0) p
where s.name = 'FTS_test_stoplist'

Returns:

keyword                         group_id    phrase_id   occurrence  special_term     display_term  expansion_type source_term
------------------------------- ----------- ----------- ----------- ---------------- ------------- -------------- -----------
0x007100770065002D006100730064  1           0           1           Exact Match      qwe-asd       0              qwe-asd
0x007100770065                  1           0           1           Exact Match      qwe           0              qwe-asd
0x006100730064                  1           0           2           Exact Match      asd           0              qwe-asd

(3 row(s) affected)

Structure is following:

-- ****************************
-- Step 1. Cleanup FTS Structure
-- ****************************

if exists (select 1 from sys.fulltext_indexes where object_id = object_id('dbo.tFTS_test'))
    drop fulltext index on dbo.tFTS_test;
go
if exists (select 1 from sys.fulltext_catalogs where name = 'FTS_test')
    drop fulltext catalog FTS_test;
go
if exists (select 1 from sys.fulltext_stoplists where name = 'FTS_test_stoplist')
    drop fulltext stoplist FTS_test_stoplist;
go
if object_id ('dbo.tFTS_test') is not null
    drop table dbo.tFTS_test;
go

-- ****************************
-- Step 2. Create FTS Structure
-- ****************************

create table dbo.tFTS_test (
    Id int not null,
    Value varchar(100) not null,
    constraint [PK_tFTS_test] primary key clustered (Id asc)
);
go
create fulltext stoplist FTS_test_stoplist from system stoplist;
go
alter fulltext stoplist FTS_test_stoplist add 'asd' language 'English';
go
create fulltext catalog FTS_test with accent_sensitivity = off;
go
create fulltext index on dbo.tFTS_test (Value language English) key index PK_tFTS_test on (FTS_test);
go
if not exists (
    select 1
    from sys.fulltext_indexes i
    inner join sys.fulltext_stoplists l on l.stoplist_id = i.stoplist_id
    where i.object_id = object_id('dbo.tFTS_test') and l.name = 'FTS_test_stoplist'
)
    alter fulltext index on dbo.tFTS_test set stoplist FTS_test_stoplist;
go
insert into dbo.tFTS_test (Id, Value) values (1, 'qwe-asd');
go

P.S. Sorry for such vast question.

GriGrim
  • 2,891
  • 1
  • 19
  • 33

1 Answers1

0

The difference is caused by the hyphen, Full Text Search is treating your query string as two words instead of one. And additionally, since "asd" is a noise word it will not find it.

When the word breaker encounters a word-breaking character in a term, the word breaker parses the character as a white space character.

Word-breaking characters include the following:

  • •$ (dollar sign)
  • , (comma)
  • & (ampersand)
  • # (number sign)

When the word breaker encounters a hyphen (-) in a term, the word breaker correctly parses the term. However, the full-text thesaurus component treats the characters that are connected by the hyphen together with the hyphen itself as empty characters. For example, if the original term is "well-known celebrity," the term appears as "celebrity" in the thesaurus file.

This is from the Microsoft site, not the same problem but one that share the root cause:

You obtain incorrect results when you run a full-text search query that uses a thesaurus file in SQL Server 2005

Baltico
  • 483
  • 3
  • 9
  • Thanks for your explanation. I added Helper Query3. It shows difference between search strings `"qwe-asd"` and `"qwe-asd*"`. Can you explain why in case of search string `"qwe-asd*"` the word `asd` is not a Noise Word? Whereas in case of search string `"qwe-asd"` it is. – GriGrim Jun 03 '14 at 07:57