9

I have this table:

bussId    |      nameEn        |          keywords
  500          name1 name2             keyword1 keyword2

I want to return bussId 5000 if the user search for (keyword1 or keyword2 or name2 or name1). So I should use this query SELECT * FROM business WHERE nameEn LIKE '%searched_word%'. But this query doesn't use the index nameEn or keywords, according to Comparison of B-Tree and Hash Indexes "The index also can be used for LIKE comparisons if the argument to LIKE is a constant string that does not start with a wildcard character".

I have this solution, I want to create another table and insert all the single words:

bussId |  word
  500        name1
  500        name2
  500        keyword1
  500        keyword2

Then I will search for the bussId using this query:
SELECT * WHERE word LIKE 'searched_word%'.

In that way I will be sure that the MySQL will use the index , and it will be faster, but this table will contain about 20 million rows!

Is there another solution?

Maytham Fahmi
  • 31,138
  • 14
  • 118
  • 137
david
  • 3,310
  • 7
  • 36
  • 59
  • SELECT * FROM business WHERE nameEn LIKE '%searched_word%' This should have worked. I don't understand the issue with it. Could you clarify please? I've been using that query with the same purpose at work – Luigi Apr 17 '15 at 09:47
  • @LuigiMackenzieC.Brito the problem is that query is very slow, because it doesn't use index, it search the whole table ! – david Apr 17 '15 at 09:48
  • Dang, won't it just search the whole "named" column? – Luigi Apr 17 '15 at 09:56
  • 1
    it will not use index, see this http://dev.mysql.com/doc/refman/5.6/en/index-btree-hash.html – david Apr 17 '15 at 09:58
  • Wow, I end up learning something new :) – Luigi Apr 17 '15 at 10:05
  • I'm also interested to an answer to this question. I guess it would take someone of the Gandalf level to answer it – Luigi Apr 17 '15 at 13:23
  • How about normalizing your way around this? Is there anyway you can break down the problem by restructuring your tables? Alternatively you can consider using fulltext instead, that will allow you to use indexes and add fancy 'fuzzy' logic search at the same time, but I'm not sure the performance will be much better. – citywall Jul 20 '15 at 18:04
  • In my opinion, the `words` table you propose will be efficient. You can keep `words` table content with a trigger. [Read this related question](http://stackoverflow.com/a/8456791). – dani herrera Jul 21 '15 at 06:10
  • @david How you pass 'searched_word' to MySQL? Do you use PHP to pass this common searched_word? – alagu Jul 24 '15 at 09:51
  • If you have a large table I would suggest `Lucene` instead of full text search. – Davor Zlotrg Jul 24 '15 at 13:38
  • 2 million records is quite a lot and a mapping table with varchar column will allocate the max allowed chars in byte for each row + 32-bit for integer column. What if you could just create a table like (id int, crc int) and store only the text data's crc32 value. It's case sensitive so you need to convert to uppercase/lowercase while populating the data and the same when comparing. – Engin Jul 27 '15 at 16:48

5 Answers5

8

You have to use a fulltext index using MyISAM or InnoDB from MySQL 5.6 onwards:

mysql> ALTER TABLE business ADD FULLTEXT(nameEn, keywords);

And here is your request:

mysql> SELECT * FROM business
   -> WHERE MATCH (nameEn, keywords) AGAINST ('searched_word');
Morgan Tocker
  • 3,370
  • 25
  • 36
Adam
  • 17,838
  • 32
  • 54
1

Did you try the Instr() or Locate() functions? Here is a SO discussion comparing them with Like but may prove better comparing a front % wildcard. Still it runs full table scans but unaware how the MySQL query optimizer indexes with string functions.

SELECT * FROM business WHERE Instr(nameEN, 'search_word') > 0 

OR

SELECT * FROM business WHERE Locate(nameEN, 'search_word') > 0 

Also, there may be other areas of optimization. See if other potential indices are available in the business table, explicitly declare specific columns instead of the asterisk (*) if all columns are not being used, and parse the nameEN and keywords columns by the spaces so columns retain one value (with potential to transpose), then use the implicit join, WHERE, or explicit join, JOIN. This might even be a table design issue with the challenge of storing multiple values in a singe field.

Community
  • 1
  • 1
Parfait
  • 104,375
  • 17
  • 94
  • 125
0

With new version of MySQL you don't need to make engine "MyISAM", InnoDB is also support FULLTEXT index (I've tested this 5.6.15, supports form version >=5.6.4 ). So if your server version higher then 5.6.4 than you need just add FULLTEXT index to your table and make select with MATCH(...)AGAINST(...), example below

CREATE FULLTEXT INDEX idx ON business (nameEn);

SELECT * FROM business 
WHERE match(nameEn)against('+searched_word' IN BOOLEAN MODE);
Gevorg M
  • 51
  • 4
0

Use the below statement in MySQL or SQL it'll return perfect result:

SELECT * FROM business WHERE (nameEn LIKE 'searched_word%' OR nameEn LIKE '%searched_word%') OR (keywords LIKE 'searched_word%' OR keywords LIKE '%searched_word%') AND bussID = 500;  

This should work.

0

20 million records is quite a lot and a mapping table with varchar column would allocate the max allowed chars in byte for each row + 32-bit for integer column.

What if you could just create a table like (id int, crc int) and store only the text data's crc32 value. It's case sensitive so you need to convert to uppercase/lowercase while populating the data and the same when comparing.

I agree with the full-text approach but to save space and use the advantage of indexing, you can try something like below.

Create Temporary TABLE t (id INT, crc INT);

Insert Into t
Select 500, CRC32(UPPER('name1'))
Union Select 500, CRC32(UPPER('name2'))
Union Select 500, CRC32(UPPER('keyword1'))
Union Select 500, CRC32(UPPER('keyword2'));

Select * From t Where crc = CRC32(UPPER('keyword2');
Engin
  • 385
  • 1
  • 4
  • 15