2

I have large mysql database (5 million rows) and data is phone number. I tried many solution but it's still slow. Now, I'm using INT type and LIKE sql query for store and searching phone number. Ex: SELECT phonenumber FROM tbl_phone WHERE phonenumber LIKE '%4567' for searching phone numbers such as 170**4567**, 249**4567**,...

I need a solution which make it run faster. Help me, please!

Kara
  • 6,115
  • 16
  • 50
  • 57
beobeo88
  • 165
  • 2
  • 10
  • Using wildcards (%) at the beginning of strings will always cause serial reads with LIKE operators, which will be slow with large data volumes. Perhaps you can narrow down the search by including all known prefixes, ex. LIKE '170%4567' OR '249%4567'. The phone number column must be indexed of course. – ron tornambe Aug 09 '12 at 16:38
  • Don't use numeric types if you want to search with "LIKE"... – bart Aug 09 '12 at 17:27
  • A lot of countries have phone numbers that start with "0" or "00". Integer types don't store those leading zeros. – bart Aug 09 '12 at 17:27
  • @bart: I add first phone number "0" or "00" in source code, not need in database – beobeo88 Aug 09 '12 at 18:15

4 Answers4

1

You are storing numbers as INT, but querying then as CHAR (the LIKE operator implicitly converts INTs to CHARs) and it surely is not optimal. If you'd like to keep numbers as INT (probably the best idea in IO performance therms), you'd better change your queries to use numerical comparisons:

-- instead of CHAR operators 
WHERE phone_number LIKE '%4567'
WHERE phone_number LIKE '1234%'
-- use NUMERIC operators
WHERE phone_number % 10000 = 4567
WHERE phone_number >= 12340000 -- considering 8 digit numbers

Besides choosing a homogeneous way to store and query data, you should keep in mind to create the appropriate index CREATE INDEX IDX0 ON table (phone_number);.

Unfortunately, even then your query might not be optimal, because of effects similar to @ron have commented about. In this case you might have to tune your table to break this column into more manageable columns (like national_code, area_code and phone_number). This would allow an index efficient query by area-codes, for example.

Gerardo Lima
  • 6,467
  • 3
  • 31
  • 47
  • Waa, search speed has doubled with your solution. it's simple but effective. Thank u very much! – beobeo88 Aug 09 '12 at 18:05
  • WHERE phone_number LIKE '1234%' Use: WHERE phone_number >= 12340000 -- considering 8 digit numbers Result: 1235*, 1236*,... I used: WHERE phone_number DIV 10000 = 1234 – beobeo88 Aug 09 '12 at 18:09
  • +1 nice solution. These predicates still are not sargable, but this form does avoid the implicit conversion of INT to VARCHAR on EVERY row in the table. – spencer7593 Aug 09 '12 at 19:11
  • @beobeo88, while semantically `phone_number >= 12340000` coresponds to `phone_number DIV 10000 = 1234`, there's a syntactical that prevents most DBMSs from using index; as a rule to remember, if you want to use index, don't apply any operator nor function to the column, use to the constant. – Gerardo Lima Aug 10 '12 at 00:26
  • Hi, @beobeo88, I noticed you unaccepted this answer. Did you find any error? – Gerardo Lima Nov 05 '12 at 17:56
0

Check the advice here How to speed up SELECT .. LIKE queries in MySQL on multiple columns?

Hope it helps!

Community
  • 1
  • 1
Luke Baughan
  • 4,658
  • 3
  • 31
  • 54
0

I would experiment with using REGEXP, rather than LIKE as in the following example:

SELECT `field` WHERE `field` REGEXP '[0-9]';

Other than that, indeed, create an index if your part of the phone search pattern has a constant length.

Here is also a link to MySQL pattern mathching document.

favoretti
  • 29,299
  • 4
  • 48
  • 61
0

That LIKE predicate is operating on a string, so you've got an implicit conversion from INT to VARCHAR happening. And that means an index on the INT column isn't going to help, even for a LIKE predicate that has leading characters. (The predicate is not sargable.)

If you are searching for the last digits of the phone number, the only way (that I know of) to get something like that to be fast would be to add another VARCHAR column, and store the reversed phone number in it, where the order of the digits is backwards.

Create an index on that VARCHAR column, and then to find phone number that end with '4567':

WHERE reverse_phone LIKE '7654%'

-or-

WHERE reverse_phone LIKE CONCAT(REVERSE('4567'),'%')
spencer7593
  • 106,611
  • 15
  • 112
  • 140
  • Thank! It's faster than my solution but after i convert INT to VARCHAR then my database too large (50MB in INT, 400MB in VARCHAR) – beobeo88 Aug 09 '12 at 18:00