17

Hi I'm worried about how to implement a simple search query, my scenario is:

tag VARCHAR 255

Now I need to search inside the tag field and I can use two kinds of queries:

  SELECT * FROM table WHERE tag LIKE '%1111%' OR LIKE '%2222%' OR LIKE '%3333%';

or

SELECT * ,MATCH(tag) AGAINST('+1111','+2222','+3333' IN BOOLEAN MODE) as score FROM table ORDER BY score DESC ;

Which is more accurate/precise and which is faster?

Thanks

audiojames
  • 82
  • 1
  • 11
itsme
  • 48,972
  • 96
  • 224
  • 345

3 Answers3

35

Your searches aren't equivalent. LIKE %1% will find ANYTHING that contains a 1, e.g. 100, 911, 0.1. It's just a plain substring match. MATCH ('+1') would theoretically work, but FULLTEXT by default ignores any "words" that are <4 characters in length. However, assuming you relaxed the fulltext length limit, +1 would find any INDEPENDENT 1, but not any that are embedded in another word. For that you'd need +*1*.

Marc B
  • 356,200
  • 43
  • 426
  • 500
  • 4
    damn you summarized in 10 lines all that i need to know thanks :) , i'll accept soon .. – itsme Feb 08 '13 at 15:58
  • 2
    Can you explain what you mean by "assuming you relaxed the fulltext length limit"? – Kal Jan 13 '22 at 11:43
  • 1
    @Kal By default, MySQL fulltext indexing ignores "words" less than 4 characters. So if you search for `'+1'` it will find nothing., because such short tokens are not indexed. In natural language text, this is fine because people seldom search for such short words (at least in English). [You can change this with a configuration option](https://dev.mysql.com/doc/refman/8.0/en/fulltext-fine-tuning.html#fulltext-word-length), but it increases the size of the index to include more words. – Bill Karwin Jan 29 '23 at 15:50
17

The fastest solution is to create a properly normalized table for tags, so that each tag is stored on a separate row.

CREATE TABLE tags (
  tag VARCHAR(4),
  tableid INT,
  PRIMARY KEY (tag, tableid),
  KEY (tableid, tag)
);

SELECT * FROM `table` JOIN tags ON table.tableid = tags.tableid 
WHERE tags.tag IN ('1111', '2222', '3333');

Benefits:

  • No more worrying about fulltext indexes, ft_min_length, InnoDB support, etc.
  • No more worrying about the bad performance of substring matching with LIKE.
  • Looking up a given tag and its matching entries in table uses the primary key index.
  • Looking up the set of tags for a given entry in table uses the secondary key index.
  • You have no limit to the number of tags per item in table.
  • You can easily count the frequency of certain tags, you can fetch the set of distinct tags, you can constrain tags against a lookup table, etc.
Bill Karwin
  • 538,548
  • 86
  • 673
  • 828
1

Never use %1% This will cause a full table scan and will be very inefficient if the data grows.

Fulltext is usually faster in bigger datasets when searching in string values. Like operators are usefull when using it like 'text%'

Mad Dog Tannen
  • 7,129
  • 5
  • 31
  • 55
  • 1
    -1 sorry but you should check better my answer -1 i was just using an example , plus i already edited that – itsme Feb 08 '13 at 17:07