
I have a table with 18,310,298 records right now.

And the following query:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = '11' AND
`obj_type` = '2' AND
`region` = 'us' AND 
DATE(`date`) = DATE('20120213010502');

With the following structure:

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `event` (`event`),
  KEY `obj_type` (`obj_type`),
  KEY `region` (`region`),
  KEY `for_stat` (`event`,`obj_type`,`obj_id`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table';

and MySQL EXPLAIN shows the following:

+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
| id | select_type | table | type | possible_keys                  | key      | key_len | ref         | rows   | filtered | Extra                    |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
|  1 | SIMPLE      | logs  | ref  | event,obj_type,region,for_stat | for_stat | 2       | const,const | 837216 |   100.00 | Using where; Using index |
+----+-------------+-------+------+--------------------------------+----------+---------+-------------+--------+----------+--------------------------+
1 row in set, 1 warning (0.00 sec)

Running this query during daily peak usage takes about 5 seconds.

What can I do to make it faster?

UPDATED: Following the comments, I modified the index and removed the DATE() function from the WHERE clause:

+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| logs  |          0 | PRIMARY  |            1 | log_id      | A         |    15379109 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | event    |            1 | event       | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | obj_type |            1 | obj_type    | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | region   |            1 | region      | A         |          14 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            1 | event       | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            2 | obj_type    | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            3 | region      | A         |         157 |     NULL | NULL   |      | BTREE      |         |
| logs  |          1 | for_stat |            4 | date        | A         |         157 |     NULL | NULL   |      | BTREE      |         |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+


    mysql> explain extended SELECT COUNT(obj_id) as cnt 
    ->     FROM `common`.`logs` 
    ->     WHERE `event`= '11' AND 
    ->     `obj_type` = '2' AND 
    ->     `region`= 'est' AND 
    ->     date between '2012-11-25 00:00:00' and '2012-11-25 23:59:59';
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
| id | select_type | table | type  | possible_keys                  | key      | key_len | ref  | rows | filtered | Extra       |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | logs  | range | event,obj_type,region,for_stat | for_stat | 21      | NULL | 9674 |    75.01 | Using where |
+----+-------------+-------+-------+--------------------------------+----------+---------+------+------+----------+-------------+

It seems it's running faster. Thanks everyone.

user1016265

5 Answers


The EXPLAIN output shows that the query is using only the first two columns of the for_stat index.

This is because the query doesn't use obj_id in the WHERE clause. If you create a new key without obj_id (or modify the existing key to reorder the columns), more of the key can be used and you may see better performance:

KEY `for_stat2` (`event`,`obj_type`,`region`,`date`)

If it's still too slow, rewriting the last condition to avoid the DATE() function, as Salman and Sashi suggest, might improve things.
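A minimal sketch of applying this answer, combined with ypercubeᵀᴹ's COUNT(*) suggestion from the comments (the exact date bounds here are illustrative, taken from the question's example day):

```sql
-- Add the reordered index; the name for_stat2 comes from this answer.
ALTER TABLE `common`.`logs`
  ADD KEY `for_stat2` (`event`, `obj_type`, `region`, `date`);

-- COUNT(*) gives the same result as COUNT(obj_id) because obj_id is
-- NOT NULL, and the query can now be answered from the index alone:
SELECT COUNT(*) AS cnt
FROM `common`.`logs`
WHERE `event` = 11
  AND `obj_type` = 2
  AND `region` = 'us'
  AND `date` >= '2012-02-13 00:00:00'
  AND `date` <  '2012-02-14 00:00:00';
```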

Joni
  • +1. Add this index and simply change the `COUNT(obj_id)` to `COUNT(*)`. `obj_id` is not nullable so the results will be identical. – ypercubeᵀᴹ Nov 30 '12 at 11:26
  • @joni What about the number of indexes in the logs table? Will ALL indexes be rebuilt on each insert/update? – user1016265 Nov 30 '12 at 12:47
  • @user1016265 Indexes have to be updated (not rebuilt) when new data is inserted, so they will make `INSERT`s slower, but you need indexes if you want reasonable performance from `SELECT`s. You have to choose which is more important for you. – Joni Dec 01 '12 at 14:56

The DATE() function on the date column is causing the full scan. Try this:

SELECT COUNT(obj_id) AS cnt
FROM
    `common`.`logs`
WHERE
    `event`    = 11
AND `obj_type` = 2
AND `region`   = 'us'
AND `date`     = DATE('20120213010502');
Sashi Kant
  • but I need exactly the date, not date + time; does that mean I have to use BETWEEN in the last condition instead of the DATE() function? – user1016265 Nov 30 '12 at 11:15
  • @user1016265: Yes, you can use BETWEEN to work on the date field directly; refer to Salman's answer for the range clause, and please paste the EXPLAIN plan for the updated query – Sashi Kant Nov 30 '12 at 11:25

@Joni already explained what is wrong with your index. As for the query, I assume that your example selects all records for 2012-02-13 regardless of time. You can change the WHERE clause to use >= and < instead of the DATE() cast:

SELECT COUNT(obj_id) AS cnt
FROM
`common`.`logs`
WHERE 
`event` = 11 AND
`obj_type` = 2 AND
`region` = 'us' AND 
`date` >= DATE('20120213010502') AND
`date` <  DATE('20120213010502') + INTERVAL 1 DAY
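To see why this half-open range covers the whole day (including fractional seconds that `BETWEEN ... 23:59:59` would miss), note how DATE() truncates the time part of the literal:

```sql
-- DATE() drops the time component, so both bounds resolve to whole days:
SELECT DATE('20120213010502');                   -- 2012-02-13
SELECT DATE('20120213010502') + INTERVAL 1 DAY;  -- 2012-02-14
```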
Salman A

As logging (inserts) needs to be fast too, use as few indexes as possible.

Evaluation queries may take longer, but that is admin work which doesn't necessarily need extra indexes.

CREATE TABLE `logs` (
  `log_id` int(11) NOT NULL AUTO_INCREMENT,
  `event` tinyint(4) NOT NULL,
  `obj_type` tinyint(1) NOT NULL DEFAULT '0',
  `obj_id` int(11) unsigned NOT NULL DEFAULT '0',
  `region` varchar(3) NOT NULL DEFAULT '',
  `date` datetime NOT NULL DEFAULT '0000-00-00 00:00:00',
  PRIMARY KEY (`log_id`),
  KEY `for_stat` (`event`,`obj_type`,`region`,`date`)
) ENGINE=InnoDB AUTO_INCREMENT=83126347 DEFAULT CHARSET=utf8 COMMENT='Logs table';

And about the date search @SashiKant and @SalmanA already answered.
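On an existing table, this slimmed-down schema would be reached by dropping the single-column keys; a sketch (note that `event` alone is already covered by the leftmost prefix of `for_stat`, while dropping `obj_type` and `region` follows this answer's "fewer indexes" argument for insert speed):

```sql
-- Drop the single-column keys kept out of the schema above.
ALTER TABLE `common`.`logs`
  DROP INDEX `event`,
  DROP INDEX `obj_type`,
  DROP INDEX `region`;
```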

Joop Eggen

In MySQL you should order index columns by cardinality: columns with fewer possible values go closer to the left. You can also try changing the region column to ENUM() and searching the date with a BETWEEN clause. MySQL is not using the third column of the index because using it would take more effort than simply filtering (a common thing in MySQL).
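A sketch of the ENUM change; the value list here is hypothetical and would have to cover every region actually stored in the table (MySQL maps unknown values to the empty string, so check first with `SELECT DISTINCT region FROM logs`):

```sql
-- Hypothetical region list; adjust to the values really in use.
ALTER TABLE `common`.`logs`
  MODIFY `region` ENUM('us', 'est') NOT NULL;
```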

elgato