I have a table of keywords in a SQL Server box that are currently stored with a full text index. I have been asked to add a facility to our searches that prioritise results where the result is closer to the start of the column.
CREATE TABLE [dbo].[ftsTest]
(
[ID] [INT] NOT NULL,
[Keywords] [NVARCHAR](MAX) NOT NULL,
CONSTRAINT [PK_ftsTest]
PRIMARY KEY CLUSTERED ([ID] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
INSERT INTO [dbo].[ftsTest]
VALUES (1, N'Test Two Three Four Five Six Seven Eight Nine Ten'),
(2, N'One Test Three Four Five Six Seven Eight Nine Ten'),
(3, N'One Two Test Four Five Six Seven Eight Nine Ten'),
(4, N'One Two Three Test Five Six Seven Eight Nine Ten'),
(5, N'One Two Three Four Test Six Seven Eight Nine Ten'),
(6, N'One Two Three Four Five Test Seven Eight Nine Ten'),
(7, N'One Two Three Four Five Six Test Eight Nine Ten'),
(8, N'One Two Three Four Five Six Seven Test Nine Ten'),
(9, N'One Two Three Four Five Six Seven Eight Test Ten'),
(10, N'One Two Three Four Five Six Seven Eight Nine Test ')
After setting up a generic full text search for this, I can generate a list of results with either:
SELECT *
FROM ftsTest
WHERE CONTAINS (Keywords, 'test')
SELECT *
FROM CONTAINSTABLE(ftsTest, Keywords, 'test')
The Contains
query returns the expected results, however has no priority. The ContainsTable
query does return a rank, but has the same value assigned to each.
I understand that the rank takes occurrence (offset of word into document) into account, but I don't know how to access that data. This is probably an RTFM error on my part, but I can't see how to influence the rank to help with the clients requirement.
I've been looking at https://technet.microsoft.com/en-us/library/ms142524(v=sql.105).aspx but I'm still a bit in the dark.
I am working on multiple versions of SQL Server (2008 R2 to 2016) at the moment, but it looks like we will be updating all to 2016 in the future.
Any help gratefully received.
Many thanks.
Possible solution:
FTS has the option of ranking by how close together words are. By adding a control word to the start of the field, I can then calculate distance from that word to give me a ranking.
UPDATE ftsTest
SET keywords = '_start_ ' + keywords
SELECT *
FROM CONTAINSTABLE(ftsTest, Keywords, 'NEAR((_start_,test))')
This does do what I'm looking for but not going to mark it as an answer yet in case someone has something better. Doesn't seem like the most graceful of solutions and it adds to the size of the full text index.