0

Basically running the following:

SELECT * 
FROM document 
WHERE contains(cachedText, 'after') 

returns 0 records, same with any search term with the word after in it.

Searching for:

SELECT * 
FROM document 
WHERE cachedText like '%after%'

returns about 200k records in my dataset. Just wondering if there is any reason for this?

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
Owain Esau
  • 1,876
  • 2
  • 21
  • 34
  • it might have to do with the fact that the second parameter in CONTAINS is nvarchar? – user5328504 Sep 17 '19 at 01:46
  • also take a look at this https://stackoverflow.com/questions/588025/sql-problem-using-contains-doesnt-work-but-like-works-fine – user5328504 Sep 17 '19 at 01:47
  • 2
    `CONTAINS` is the full-text searching in SQL Server, which also defines a list of **stop words** that are not being considered when searching - and there's a fair chance that "after" is considered one of those stopwords and therefore won't be found. Also check out: https://stackoverflow.com/questions/8435602/sql-2008-turn-off-stop-words-for-full-text-search-query – marc_s Sep 17 '19 at 03:59
  • The two queries are completely different. The first is a full-text search query, the second is a simple pattern match. The first probably won't even run if `cachedText` isn't part of an FTS index. – Panagiotis Kanavos Sep 17 '19 at 14:28

1 Answers1

1

The first query matches the word "after", i.e. a distinct word within other text. (If "after" is in the stopwords list then it is ignored entirely.) You can read about the details at Full-Text Search.

The second query matches "afternoon", "crafters", "dafter", ..., i.e. any string containing "after" regardless of word breaks.

Examine the matches that the second query returns and you'll likely find "after" embedded in longer words.

HABO
  • 15,314
  • 5
  • 39
  • 57