1

Have a question about full text search.

I have a query that works, but I want to improve it. For example, suppose I'm searching for "best solution". In my result (using FTS,CONTAINSTABLE,NEAR,ISABOUT,*) I have the columns rank, id, string:

rank| id | string
----+----+-------
430 | 33 | my own best solution
430 | 32 | the way for best solution
430 | 30 | the best solution sample
430 | 31 | best solution 
430 | 34 | best solution creation how
300 | 40 | usefull best software solution
300 | 41 | best software solution
200 | 50 | wolrds bests solutions
200 | 51 | bests solutions of the world

So this query is 100% right for me, and all RANKs are correct, but I want to make this query more relevant.

If a keyword's position in a string is more to the left, it should appear earlier in the result. For example:

rank| id | string
----+----+-------
430 | 31 | best solution 
430 | 34 | best solution creation how
430 | 30 | the best solution sample
430 | 33 | my own best solution
430 | 32 | the way for best solution
300 | 41 | best software solution
300 | 40 | usefull best software solution
200 | 51 | bests solutions of the world
200 | 50 | wolrds bests solutions

Is this possible? If so, how can I get this result?

outis
  • 75,655
  • 22
  • 151
  • 221
Katya
  • 13
  • 4

2 Answers2

2

You need an indexOf function. Search your sql manual for a similar function.

Then add a sorting criteria looking like ORDER BY rank, CASE indexOf(string, 'best') WHEN -1 THEN 100000 ELSE indexOf(string, 'best') END.

If you can't find a function doing a kind of indexOf, write it yourself using the CREATE FUNCTION feature of your database manager.

Joel
  • 3,427
  • 5
  • 38
  • 60
  • Yes... if i add `charaindex(string, key) ASC` in `order by` clause and add all keywords like this separate by comma, it will work... but only for this sample when im looking "best solution" but if i will search for "bests solutions", the result of FTS query will be the same(rows, not rank, but rank gropus will be same too), and in this sample `charaindex(string, key) ASC` will work only for `200 | 51 | bests solutions of the world` `200 | 50 | wolrds bests solutions` we need to implement somthing like this charindex(string, formsof(FORMSOF(INFLECTIONAL,inflectional,key)) – Katya May 10 '11 at 09:42
  • you have to enumerate the possible string litterals yourself using case or if cascades, because there is no way you sql manager has any linguistic feature built-in. I suggest you write a function performing all the linguistic stuff, and use it in a sort clause. – Joel May 10 '11 at 09:50
  • there is no other way? can we compare two words "best" and "bests"(in variables) without full text search index? – Katya May 10 '11 at 11:51
  • I don't think there are any built-in ways to do this using a database engine. The closest i can think of is to pass an array of strings to the dedicated function, which will handle the parsing algorithm. Alternatively, you could implement this function using automatons parsing techniques if you want to reduce the cost. But it is very complicated: http://en.wikipedia.org/wiki/LALR_parser – Joel May 10 '11 at 12:16
  • maybe exists 3rd party solutions for this needs? what we can install on the server... for whole process of search... not only for sort "to top" records? iFilter? – Katya May 10 '11 at 13:08
  • I don't believe such a product does exists. It's just a matter of 10 lines of code if you want to write the function without optimisation. Other than that, I don't know which database server you are using but if it is a good one, it is possible to set an index on the text column. With an index, a proper database engine would maybe optimize automatically a query containing a clause such as `where string like '%best solution%' or string like '%bests solutions%')`. You can run tests with an index and this sort of where clause and see how it performs. – Joel May 10 '11 at 13:35
1

The third party solution you want is Sphinx (there are still others). Read more about here:

http://www.ioncannon.net/programming/685/full-text-search-with-sphinx/

Manuel
  • 11
  • 1