I have one index "name_and_title_index" with two fields "name" and "title".
Indextool gives me the following information on the keywords of interest:
keyword,docs,hits,offset
word7,56,57,519386707
word8,154,161,475390304
word2,2438,2597,14258546
word3,26599,29074,68018978
word5,475349,656569,191390685
word1,645079,881965,303666122
word6,1089457,1435180,350540391
indexed_documents - 10742342, total keywords - 1379888
It seems I do not understand rankers, since all of them return results in a different order than I expect.
I expect any result containing word7 to have a higher weight (it occurs in only 56 documents out of 10.7M).
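To make this expectation concrete, here is a minimal sketch of the rarity I have in mind, using the document counts from the indextool output. It uses the plain textbook formula log(N / docs), which is an assumption for illustration and not necessarily what Sphinx computes internally; `textbook_idf`, `DOC_FREQ` and `TOTAL_DOCS` are names I made up.

```python
import math

# Figures copied from the indextool output above.
TOTAL_DOCS = 10_742_342
DOC_FREQ = {
    "word7": 56,
    "word8": 154,
    "word2": 2438,
    "word3": 26599,
    "word5": 475349,
    "word1": 645079,
    "word6": 1089457,
}


def textbook_idf(doc_freq: int, total_docs: int = TOTAL_DOCS) -> float:
    """Plain log(N / df): illustrative only, not Sphinx's internal formula."""
    return math.log(total_docs / doc_freq)


if __name__ == "__main__":
    # List keywords from rarest to most common with their illustrative IDF.
    for word, df in sorted(DOC_FREQ.items(), key=lambda kv: kv[1]):
        print(f"{word}: docs={df}, idf={textbook_idf(df):.2f}")
```

By this measure word7 is the rarest keyword and word6 the most common, which is why I expected matches on word7 to dominate the ranking.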
The SphinxQL is:
SELECT
ID,
WEIGHT(),
SNIPPET(name, 'word1 word2 word3 word4 word5 word6') AS _name,
SNIPPET(title, 'word7 word8 word9') AS _title
FROM
name_and_title_index
WHERE
MATCH('@name "word1 word2 word3 word4 word5 word6"/0.5 @title "word7 word8 word9"/0.5')
The different rankers give me the following results:
RANKER=PROXIMITY_BM25;
| 1 | 6546 | _ <b>word6</b> <b>word1</b> <b>word2</b> <b>word3</b> | _ _ <b>word8</b> _ _ <b>word7</b> |
| 4 | 6528 | _ _ _ _ _ _ _ _ <b>word2</b> <b>word3</b> <b>word4</b> _ | _ _ <b>word8</b> _ _ _ _ _ ... |
| 2 | 4521 | <b>word5</b> <b>word6</b> _ _ _ _ _ _ <b>word1</b> _ _ | _ <b>word7</b> _ _ _ _ _ _ _ _ ... |
| 3 | 4520 | <b>word5</b> _ <b>word1</b> _ _ _ _ _ <b>word6</b> _ _ | _ _ _ _ _ _ _ _ _ _ _ _ <b>word7</b> |
| 5 | 4519 | <b>word1</b> _ _ _ _ _ <b>word5</b> <b>word6</b> _ _ _ _ | _ _ _ _ _ _ <b>word8</b> _ _ _ _ _ _ |
| 6 | 2520 | <b>word5</b> _ _ _ _ _ ... _ _ _ _ <b>word6</b> _ _ _ _ _ ... | ... _ _ _ _ _ _ _ <b>word8</b> _ _ |
RANKER=BM25;
| 1 | 2546 | _ <b>word6</b> <b>word1</b> <b>word2</b> <b>word3</b> | _ _ <b>word8</b> _ _ <b>word7</b> |
| 4 | 2528 | _ _ _ _ _ _ _ _ <b>word2</b> <b>word3</b> <b>word4</b> _ | _ _ <b>word8</b> _ _ _ _ _ ... |
| 2 | 2521 | <b>word5</b> <b>word6</b> _ _ _ _ _ _ <b>word1</b> _ _ | _ <b>word7</b> _ _ _ _ _ _ _ _ ... |
| 3 | 2520 | <b>word5</b> _ <b>word1</b> _ _ _ _ _ <b>word6</b> _ _ | _ _ _ _ _ _ _ _ _ _ _ _ <b>word7</b> |
| 5 | 2520 | <b>word1</b> _ _ _ _ _ <b>word5</b> <b>word6</b> _ _ _ _ | _ _ _ _ _ _ <b>word8</b> _ _ _ _ _ _ |
| 6 | 2519 | <b>word5</b> _ _ _ _ _ ... _ _ _ _ <b>word6</b> _ _ _ _ _ ... | ... _ _ _ _ _ _ _ <b>word8</b> _ _ |
RANKER=SPH04;
| 4 | 16528 | _ _ _ _ _ _ _ _ <b>word2</b> <b>word3</b> <b>word4</b> _ | _ _ <b>word8</b> _ _ _ _ _ ... |
| 1 | 14546 | _ <b>word6</b> <b>word1</b> <b>word2</b> <b>word3</b> | _ _ <b>word8</b> _ _ <b>word7</b> |
| 2 | 14521 | <b>word5</b> <b>word6</b> _ _ _ _ _ _ <b>word1</b> _ _ | _ <b>word7</b> _ _ _ _ _ _ _ _ ... |
| 3 | 14520 | <b>word5</b> _ <b>word1</b> _ _ _ _ _ <b>word6</b> _ _ | _ _ _ _ _ _ _ _ _ _ _ _ <b>word7</b> |
| 5 | 14519 | <b>word1</b> _ _ _ _ _ <b>word5</b> <b>word6</b> _ _ _ _ | _ _ _ _ _ _ <b>word8</b> _ _ _ _ _ _ |
| 6 | 10520 | <b>word5</b> _ _ _ _ _ ... _ _ _ _ <b>word6</b> _ _ _ _ _ ... | ... _ _ _ _ _ _ _ <b>word8</b> _ _ |
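The weights above can be split into a thousands part and a remainder. The sketch below assumes, as I read the Sphinx documentation, that these built-in rankers compute the weight as a field-level factor multiplied by 1000 plus a BM25 term below 1000; if that assumption is wrong, the split is only arithmetic. `WEIGHTS` and `split_weight` are names I made up for illustration.

```python
# Weights copied from the result listings above: {ranker: {result id: WEIGHT()}}.
WEIGHTS = {
    "PROXIMITY_BM25": {1: 6546, 4: 6528, 2: 4521, 3: 4520, 5: 4519, 6: 2520},
    "BM25": {1: 2546, 4: 2528, 2: 2521, 3: 2520, 5: 2520, 6: 2519},
    "SPH04": {4: 16528, 1: 14546, 2: 14521, 3: 14520, 5: 14519, 6: 10520},
}


def split_weight(weight: int) -> tuple[int, int]:
    """Return (thousands part, remainder) of a WEIGHT() value.

    Assumption: the thousands part is the field-level factor and the
    remainder is the BM25 term, per the documented ranker formulas.
    """
    return divmod(weight, 1000)


if __name__ == "__main__":
    for ranker, results in WEIGHTS.items():
        print(ranker)
        for result_id, weight in results.items():
            field_part, remainder = split_weight(weight)
            print(f"  result {result_id}: {field_part} * 1000 + {remainder}")
```

Split this way, result 4 has a remainder of 528 under every ranker, against 521 and 520 for results 2 and 3, and under SPH04 its thousands part is 16 against 14 for result 1.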
Why is result 4 always ranked higher than results 2 and 3 (and, with SPH04, even higher than result 1)?