30

Having this table:

CREATE TABLE `example` (
`id` int(11) unsigned NOT NULL auto_increment,
`keywords` varchar(200) NOT NULL,
PRIMARY KEY  (`id`)
) ENGINE=InnoDB;

We would like to optimize the following query:

SELECT id FROM example WHERE keywords LIKE '%whatever%'

The table is InnoDB, (so no FULLTEXT for now) which would be the best index to use in order to optimize such query?

We've tried a simple :

ALTER TABLE `example` ADD INDEX `idxSearch` (`keywords`);

But an explain query shows that need to scan the whole table if our queries where LIKE 'whatever%' instead, this index performs well, but otherwise has no value.

Is there anyway to optimize this for innoDB ?

Thanks!

jmserra
  • 1,296
  • 4
  • 18
  • 34
  • I recommend some mysql performance tuning as well, [mysqltuner.pl](http://mysqltuner.com/) or [tuning-primer.sh](http://www.day32.com/MySQL/) – jaggedsoft Dec 22 '16 at 20:34

2 Answers2

58

Indexes are built from the start of the string towards the end. When you use LIKE 'whatever%' type clause, MySQL can use those start-based indexes to look for whatever very quickly.

But switching to LIKE '%whatever%' removes that anchor at the start of the string. Now the start-based indexes can't be used, because your search term is no longer anchored at the start of the string - it's "floating" somewhere in the middle and the entire field has to be search. Any LIKE '%... query can never use indexes.

That's why you use fulltext indexes if all you're doing are 'floating' searches, because they're designed for that type of usage.

Of major note: InnoDB now supports fulltext indexes as of version 5.6.4. So unless you can't upgrade to at least 5.6.4, there's nothing holding you back from using InnoDB *AND fulltext searches.

Marc B
  • 356,200
  • 43
  • 426
  • 500
  • Thanks, well i'm using MySQL 5.0.77-log in a cloud environment so no way to upgrade :( Will consider moving the table to MyIsam then – jmserra Apr 27 '12 at 16:22
  • If we move to InnoDB would we need to change the queries to use MATCH/AGAINST? or will the current use of 'LIKE' benefit of the FullText index? Thanks again – jmserra Apr 27 '12 at 16:29
  • 1
    no, fulltext requires the match/against stuff. you'll have to rework the queries. – Marc B Apr 27 '12 at 16:36
  • Postgres offer much faster results for %blah% queries using gin and pg_trgm. https://www.cybertec-postgresql.com/en/postgresql-more-performance-for-like-and-ilike-statements/ Why can't mysql get some inspiration from them ? – Azghanvi Mar 29 '21 at 13:38
3

I would like to comment that surprisingly, creating an index also helped speed up queries for like '%abc%' queries in my case.

Running MySQL 5.5.50 on Ubuntu (leaving everything on default), I have created a table with a lot of columns and inserted 100,000 dummy entries. In one column, I inserted completely random strings with 32 characters (i.e. they are all unique).

I ran some queries and then added an index on this column. A simple

select id, searchcolumn from table_x where searchcolumn like '%ABC%'

returns a result in ~2 seconds without the index and in 0.05 seconds with the index.

This does not fit the explanations above (and in many other posts). What could be the reason for that?

EDIT I have checked the EXPLAIN output. The output says rows is 100,000, but Extra info is "Using where; Using index". So somehow, the DBMS has to search all rows, but still is able to utilise the index?

abulhol
  • 171
  • 1
  • 7
  • Your query does not contain a LIKE condition. That's why the index is being used. – ditscheri Apr 03 '17 at 22:46
  • @ditscheri Sorry, this was a typo, actually it is a LIKE condition (just corrected my post) – abulhol May 03 '17 at 06:25
  • For a benchmark you need to restart mysql in between both queries. – John Jan 11 '19 at 05:50
  • Depending on number of columns, RAM available, order by, limit, and so forth, it's possible that performance was improved because the engine was able to do a full index scan. That wouldn't be as efficient as an index lookup, but it might well be more efficient than a full table scan. – Brian Stanback Oct 29 '19 at 23:34
  • 1
    @abulhol This is a very good question, and I also would love to know answer for this. However, you have posted this question as answer for another question. I think that is not a good practice. Two reasons: 1. The obvious - you are supposed to write an answer for the OPs question if you know the answer. Otherwise comment on some other answer if you just want to discuss related to the answer. 2 This nice question didn't get the visibility it deserve because it is not posted as a question. StackOverflow has its own system to give visibility to new questions, but new answers don't reach that far. – rineez Oct 21 '21 at 14:25