10

This query (with different name instead of "jack") happens many times in my slow query log. Why?

The Users table has many fields (more than these three I've selected) and about 40.000 rows.

select name,username,id from Users where ( name REGEXP '[[:<:]]jack[[:>:]]' ) or ( username REGEXP '[[:<:]]jack[[:>:]]' ) order by name limit 0,5;

id is primary and autoincrement.
name has an index.
username has a unique index.

Sometimes it takes 3 seconds! If I explain the select on MySQL I've got this:

select type: SIMPLE
table: Users
type: index
possible keys: NULL
key: name
key len: 452
ref: NULL
rows: 5
extra: Using where

Is this the best I can do? What can I fix?

Alan Moore
  • 73,866
  • 12
  • 100
  • 156
Pons
  • 1,747
  • 1
  • 13
  • 19
  • try `(name like '%jack%' and name REGEXP '[[:<:]]jack[[:>:]]' ) or ( username like '%jack%' and username REGEXP '[[:<:]]jack[[:>:]]' )`. If you need even better performance then use FULLTEXT index or some third party tool. – Imre L Sep 07 '12 at 16:17
  • As @ImreL suggests, if you add a LIKE %...% clause before the REGEXP, that essentially acts as a filter to reduce the number of potential results before the REGEXP is then also tried. I find this can cut a slow 2 second query down to 0.3s. – fooquency Dec 06 '18 at 14:40

3 Answers3

23

If you must use regexp-style WHERE clauses, you definitely will be plagued by slow-query problems. For regexp-style search to work, MySQL has to compare every value in your name column with the regexp. And, your query has doubled the trouble by also looking at your username column.

This means MySQL can't take advantage of any indexes, which is how all DBMSs speed up queries of large tables.

There are a few things you can try. All of them involve saying goodbye to REGEXP.

One is this:

WHERE name LIKE CONCAT('jack', '%') OR username LIKE CONCAT('jack', '%')

If you create indexes on your name and username columns this should be decently fast. It will look for all names/usernames beginning with 'jack'. NOTICE that

WHERE name LIKE CONCAT('%','jack') /* SLOW!!! */

will look for names ending with 'jack' but will be slow like your regexp-style search.

Another thing you can do is figure out why your application needs to be able to search for part of a name or username. You can either eliminate this feature from your application, or figure out some better way to handle it.

Possible better ways:

  1. Ask your users to break up their names into given-name and surname fields, and search separately.
  2. Create a separate "search all users" feature that only gets used when a user needs it, thereby reducing the frequency of your slow regexp-style query.
  3. Break up their names into a separate name-words table yourself using some sort of preprocessing progam. Search the name-words table without regexp.
  4. Figure out how to use MySQL full text search for this feature.

All of these involve some programming work.

O. Jones
  • 103,626
  • 17
  • 118
  • 172
  • In my query REGEXP '[[:<:]]jack[[:>:]]' means that I want to find records like "Jack" or "The Jack Rock" but not "Jacky", since I'm looking for exact word match. So, using LIKE 'jack%' will be not good. Since "The Jack Rock" will not be found. If I use '%jack%' is it slow like '%jack'? – Pons Sep 07 '12 at 13:56
  • 4
    Understood. regexp search is a correct solution to your problem. But it is *inherently* slow. Any search of a text-column index that isn't anchored -- that is, that doesn't start with the first character position in the column -- has the same problem. So, yes, `LIKE '%jack%'` is slow like regexp-style search. This is because indexes are organized, and can be random-accessed quickly, in order. Think about looking up 'Jones' in the telephone book. You can find the first person with that name easily. But if you look up everybody with the characters 'one', it takes forever. – O. Jones Sep 07 '12 at 14:01
  • 1
    Using FULLTEXT index is better? – Pons Sep 07 '12 at 14:39
  • 1
    Yes, using full text search should help. I think you'll want to use boolean mode. – O. Jones Sep 07 '12 at 19:02
  • Thank you Ollie, I will add FULLTEXT indexes. – Pons Sep 10 '12 at 08:38
  • Has regex performance improved at all since this answer? In Postgres if a regex has a fixed prefix it can test on only index entries matching that prefix, so hopefully MySQL will implement the same eventually – Andy Sep 21 '21 at 01:39
  • I just tried it on MariaDB 10.3. The index doesn't get used with `RLIKE '^something.*'` – O. Jones Jan 13 '22 at 19:19
2

I reached 50% speedup just by adding fieldname != '' in where clause. It makes mysql to use indexes.

SELECT name, username, id 
FROM users 
WHERE name != '' 
    AND (name REGEXP '[[:<:]]jack[[:>:]]' or username REGEXP '[[:<:]]jack[[:>:]]') 
ORDER BY name 
LIMIT 0,5;

Not a perfect solution but helps.

Barmar
  • 741,623
  • 53
  • 500
  • 612
-4

Add "LIKE" in front

from

SELECT cat_ID, categoryName FROM category WHERE cat_ID REGEXP '^15-64-8$' ORDER BY categoryName

to

SELECT cat_ID, categoryName FROM category WHERE cat_ID LIKE '15-64-8%' and cat_ID REGEXP '^15-64-8$' ORDER BY categoryName

Of cos, that works only if U r search for phrases U know starting with what, else full text index is the solution.

C S N
  • 391
  • 3
  • 6