0

I am trying to optimize a query that is taking almost 30 minutes to run. What I am trying to do is to take advantage of partition pruning to minimize the rows searched. The range of the variable by which the table is partitioned is a variable from a different table. It seems that mysql is searching all partitions.

Here are the tables(cut parts out that are irrevelant): (I do realize, by the way, that the int(x) is not changing the size of the int. I had designed that table before knowing the better, and haven't fixed it)

expectedEvent | CREATE TABLE `expectedevent` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`eventId` int(5) NOT NULL,
`unitGroup_id` int(6) NOT NULL,
`minOccur` int(9) NOT NULL,
`periodInDays` int(4) NOT NULL,
PRIMARY KEY (`id`),
KEY `eventId` (`eventId`),
KEY `unitGroup_id` (`unitGroup_id`),
CONSTRAINT `expectedevent_ibfk_1` FOREIGN KEY (`unitGroup_id`) REFERENCES `unitgroup` (`id`)

event_message | CREATE TABLE `event_message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`unitId` varchar(15) NOT NULL,
`eventId` smallint(6) NOT NULL,
`eventName` varchar(50) NOT NULL,
`gpsDateTime` datetime NOT NULL,
`weekInfo` tinyint(4) NOT NULL,
`odometer` int(11) NOT NULL,
...
KEY `id` (`id`),
KEY `unitId` (`unitId`,`eventId`),
KEY `eventId` (`eventId`)
...
!50100 PARTITION BY RANGE (weekInfo)
ARTITION p0 VALUES LESS THAN (1) ENGINE = InnoDB,
ARTITION p1 VALUES LESS THAN (2) ENGINE = InnoDB,

unitGroup | CREATE TABLE `unitgroup` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(60) DEFAULT NULL,

unitGroup_devices | CREATE TABLE `unitg
`id` int(11) NOT NULL AUTO_INCREMENT,
`unitGroup_id` int(11) NOT NULL,
`scopeDevice_id` int(11) NOT NULL,
PRIMARY KEY (`id`),
KEY `unitGroup_id` (`unitGroup_id`),
KEY `scopeDevice_id` (`scopeDevice_id`)
...

This query takes almost 30 mins (the select, without the explain):

explain partitions
select ee.eventId, scopeDevice_id as scopeDevId, sd.unitId as unitId, count(em.id) as evtCount, minOccur, periodInDays
from expectedEvent ee left join unitGroup ug on ee.unitGroup_id=ug.id
left join unitGroup_devices ugd on ug.id=ugd.unitGroup_id
left join scopeDevice sd on ugd.scopeDevice_id=sd.id
left join event_message em on sd.unitId=em.unitId and em.eventId=ee.eventId
where gpsDateTime>=DATE_SUB(DATE(now()),INTERVAL periodInDays DAY)
and weekInfo>=WEEKOFYEAR(DATE_SUB(DATE(now()),INTERVAL periodInDays DAY))
and weekInfo <=WEEKOFYEAR(DATE(now()))
group by ee.id, ugd.scopeDevice_id;

| id | select_type | table | partitions                                                                                                                                                | type   | possible_keys               | key          | key_len | ref                                            | rows | Extra                           |
+----+-------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+--------+-----------------------------+--------------+---------+------------------------------------------------+------+---------------------------------+
|  1 | SIMPLE      | ee    | NULL                                                                                                                                                      | ALL    | eventId,unitGroup_id        | NULL         | NULL    | NULL                                           |    1 | Using temporary; Using filesort |
|  1 | SIMPLE      | ug    | NULL                                                                                                                                                      | eq_ref | PRIMARY                     | PRIMARY      | 4       | navsat_scope.ee.unitGroup_id                   |    1 | Using where; Using index        |
|  1 | SIMPLE      | ugd   | NULL                                                                                                                                                      | ref    | unitGroup_id,scopeDevice_id | unitGroup_id | 4       | navsat_scope.ee.unitGroup_id                   |   11 | Using where                     |
|  1 | SIMPLE      | sd    | NULL                                                                                                                                                      | eq_ref | PRIMARY,unitId              | PRIMARY      | 4       | navsat_scope.ugd.scopeDevice_id                |    1 | Using where                     |
|  1 | SIMPLE      | em    | p0,p1,p2,p3,p4,p5,p6,p7,p8,p9,p10,p11,p12,p13,p14,p15,p16,p17,p18,p19,p20,p21,p22,p23,p24,p25,p26,p27,p28,p29,p30,p31,p32,p33,p34,p35,p36,p37,p38,p39,p40 | ref    | unitId,eventId              | unitId       | 19      | navsat_scope.sd.unitId,navsat_scope.ee.eventId |  682 | Using where                     |
+----+-------------+-------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+--------+-----------------------------+--------------+---------+------------------------------------------------+------+---------------------------------+

There is only one entry in expectedEvent, so it is basically equivalent to doing the following. This query takes around 3 minutes(the select, without the explain):

select ee.eventId, scopeDevice_id as scopeDevId, sd.unitId as unitId, count(em.id) as evtCount, minOccur, periodInDays
from expectedEvent ee left join unitGroup ug on ee.unitGroup_id=ug.id
left join unitGroup_devices ugd on ug.id=ugd.unitGroup_id
left join scopeDevice sd on ugd.scopeDevice_id=sd.id
left join event_message em on sd.unitId=em.unitId and em.eventId=ee.eventId
where gpsDateTime>="2012-09-29"
and weekInfo>=WEEKOFYEAR("2012-09-29")
and weekInfo <=WEEKOFYEAR("2012-10-01")
group by ee.id, ugd.scopeDevice_id;

| id | select_type | table | partitions | type   | possible_keys               | key          | key_len | ref                                            | rows | Extra                           |
+----+-------------+-------+------------+--------+-----------------------------+--------------+---------+------------------------------------------------+------+---------------------------------+
|  1 | SIMPLE      | ee    | NULL       | ALL    | eventId,unitGroup_id        | NULL         | NULL    | NULL                                           |    1 | Using temporary; Using filesort |
|  1 | SIMPLE      | ug    | NULL       | eq_ref | PRIMARY                     | PRIMARY      | 4       | navsat_scope.ee.unitGroup_id                   |    1 | Using where; Using index        |
|  1 | SIMPLE      | ugd   | NULL       | ref    | unitGroup_id,scopeDevice_id | unitGroup_id | 4       | navsat_scope.ee.unitGroup_id                   |   11 | Using where                     |
|  1 | SIMPLE      | sd    | NULL       | eq_ref | PRIMARY,unitId              | PRIMARY      | 4       | navsat_scope.ugd.scopeDevice_id                |    1 | Using where                     |
|  1 | SIMPLE      | em    | p39,p40    | ref    | unitId,eventId              | unitId       | 19      | navsat_scope.sd.unitId,navsat_scope.ee.eventId |  682 | Using where                     |
+----+-------------+-------+------------+--------+-----------------------------+--------------+---------+------------------------------------------------+------+---------------------------------+

My idea for a workaround is reading in my C# app the expectedEvent table first, then building the query with real dates instead of the variable.

I would however prefer to do this all in MySQL. How can I optimize the query? ExpectedEvent eventually will contain many rows.

Thank you!

Steve Ford
  • 58
  • 1
  • 4
  • 1
    The "partition pruning" is done when the SQL statement is prepared; basically, you need values that are known at parse time; these can't be values that are unknown until execution time. When a SQL statement is executed, MySQL first prepares an execution plan, and then executes the plan. – spencer7593 Oct 01 '12 at 20:49
  • Spencer, I will mark yours as an answer if you move it from a comment to an answer. Thank you for your insight!! Makes more sense when understanding that! – Steve Ford Oct 03 '12 at 13:55
  • Looks like you still owe @spencer a shiny green check mark. :) – Air Aug 14 '14 at 16:53

2 Answers2

2

MySQL does "partition pruning" when a SQL statement is prepared (when MySQL is generating an execution plan for the statement.) To get "partition pruning" as part of the plan, MySQL needs values that are known at parse time; partition pruning isn't implemented for values that are unknown at parse time.

When MySQL executes a SQL statement, MySQL first prepares an execution plan, and then it executes that plan. "Partition pruning" is a detail of the execution plan which is decided during the prepare phase. (This explains why your statement with constants in the predicate show partition pruning, but your statement that has references to columns doesn't show any pruning.)

spencer7593
  • 106,611
  • 15
  • 112
  • 140
1

You need to use the actual values in the WHERE clause, not get the value via a JOIN. Think about it, when MySQL prepares the execution plan for the query, it has no idea as to what the values in the table is for gpsDateTime. Therefore it has no way to know that it only needs certain partitions in the query to get the data it needs.

In you case it would be much faster to get the date value used in the filter with a query beforehand and then use the actual value when making the select query, like you have done in your second example.

Mike Brant
  • 70,514
  • 10
  • 99
  • 103