2

I am having problems using MySQL's fulltext search and returning the results in order by relevance. I have to use boolean full text search, which does not return results in order by relevance. However, I need these results in order of relevance. If I attempt to add an order by clause on the end of the query, the query results to using filesort, which makes the query incredibly slow (over 1000 times slower than without). I am not sure what can be done.

Here is my query:

SELECT g.id, MATCH(g.searchable_name) AGAINST ('test*' IN BOOLEAN MODE) AS relevance
 FROM games g
 WHERE MATCH(g.searchable_name) AGAINST ('test*' IN BOOLEAN MODE)
 ORDER BY relevance DESC
 LIMIT 0, 31

Thanks in advance.

OMG Ponies
  • 325,700
  • 82
  • 523
  • 502
Dan D.
  • 1,117
  • 1
  • 11
  • 18
  • 1
    `relevance` is a computed column - you can't apply an index to it; seeing "using filesort" is as good as you'll get. Why does the `score` value not work for ordering? – OMG Ponies Aug 12 '10 at 21:25
  • Score value? Can you explain what you mean? (I'm new to full text searching.) – Dan D. Aug 12 '10 at 21:27
  • 1
    http://stackoverflow.com/questions/3471733/ordering-fulltext-searches-on-relevance-and-other-fields – OMG Ponies Aug 12 '10 at 21:32
  • Isn't that exactly what I am doing? – Dan D. Aug 12 '10 at 21:36
  • How many rows are in the table, and is searchable_name a lot of text? – Chris Henry Aug 12 '10 at 21:42
  • I may not have named it score, but I'm using the exact same method in your link. `SELECT g.id, MATCH(g.searchable_name) AGAINST ('test*' IN BOOLEAN MODE) AS relevance` as compared to `SELECT column_a, column_b, MATCH(...) AGAINST (...) AS score` – Dan D. Aug 12 '10 at 21:42
  • @Chris: Upwards of 500,000 rows, and searchable_name is varchar 255. – Dan D. Aug 12 '10 at 21:43
  • Apologies - you are getting the score/relevance value. – OMG Ponies Aug 12 '10 at 21:48

1 Answers1

0

At first you should consider that IN BOOLEAN MODE does not return a score, instead it returns binary (1 = found, 0 = not found):

mysql>SELECT
        topic_id,
        MATCH(topic_text) AGAINST('+tuning' IN BOOLEAN MODE) AS binary
    FROM
        topics_search
    LIMIT 10
+----------+----------+
| topic_id | binary   |
+----------+----------+
| 2        | 0        |
| 4        | 0        |
| 5        | 0        |
| 6        | 1        |
| 7        | 0        |
| 8        | 0        |
| 11       | 0        |
| 12       | 0        |
| 13       | 0        |
| 14       | 0        |
+----------+----------+
10 rows in set (9 ms)

Only the natural fulltext search is able to generate a score (the IN NATURAL LANGUAGE MODE modifier is not given as it is the default mode):

mysql>SELECT SQL_NO_CACHE
        topic_id,
        MATCH(topic_text) AGAINST('tuning') AS score
    FROM
        topics_search
    WHERE
        host_id = 1
    ORDER BY
        score DESC
    LIMIT 10
+--------------------+--------------------+
| topic_id           | score              |
+--------------------+--------------------+
| 153257             | 5.161948204040527  |
| 17925              | 4.781417369842529  |
| 66459              | 4.648380279541016  |
| 373176             | 4.570812702178955  |
| 117173             | 4.55166482925415   |
| 167016             | 4.462575912475586  |
| 183286             | 4.4519267082214355 |
| 366132             | 4.348565101623535  |
| 95502              | 4.293642520904541  |
| 29615              | 4.178250789642334  |
+--------------------+--------------------+
10 rows in set (478 ms)

Side note: Incredible slow because score can not have an index.

So you need the natural search to sort by score. But the natural search does not support operators like the * wildcard. And now we have our dilemma because its not useful to search for tunin* in BOOLEAN and do a parallel search in NATURAL using the key tunin as no text will include that partial word.

mysql>SELECT SQL_NO_CACHE
        topic_id,
        MATCH(topic_text) AGAINST('tunin') AS score
    FROM
        topics_search
    WHERE
        MATCH(topic_text) AGAINST('tunin*' IN BOOLEAN MODE)
    AND
        MATCH(topic_text) AGAINST('tunin') > 0
    ORDER BY
        score DESC
    LIMIT 10
Empty set (170 ms)

Conclusion
It is not possible to search with the wildcard operator and sort the results by relevance.

Except you find a way to obtain all words in your fulltext index that were hit by your wildcard search and use them in a second query or you build your own score based on LIKE with counting the amount of words inside of one resulting row. Interesting enough to open a new question.

Community
  • 1
  • 1
mgutt
  • 5,867
  • 2
  • 50
  • 77
  • For current version of MySQL (8.0), this is no longer correct. `IN BOOLEAN MODE` now returns score. See: https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html – rickchristie Oct 05 '21 at 20:01