I have a table I want to partition using MySQL 5.7 Partitioning to mitigate issues I'm having with dropping old data quickly. (Also, it would be nice to have increased insert I/O performance by partitioning across something other than date, especially if I plan to shard across multiple volumes with subpartitions)
Here is a simplified version of the table:
CREATE TABLE `tbl` (
`date` date NOT NULL,
`sub_id` int(11) unsigned NOT NULL,
`cmd_id` int(11) NOT NULL,
`code` TINYINT DEFAULT NULL,
`rqst` VARCHAR(32) NOT NULL DEFAULT '',
UNIQUE KEY `uk1` (sub_id,cmd_id,date)
) ENGINE=InnoDB
(note that use of column 'date' in uk1 is only to allow partitioning on date)
(The true unique key is (sub_id,cmd_id))
Here are the SQL statements I make on that table:
1. INSERT INTO tbl (NOW(), ...)
2. UPDATE tbl SET code=$code WHERE sub_id=$sub_id AND cmd_id=$cmd_id
3. SELECT code,rqst FROM tbl WHERE sub_id=$sub_id AND cmd_id=$cmd_id
Here is the partitioning scheme I've devised so far:
PARTITION BY RANGE (TO_DAYS(date))
SUBPARTITION BY HASH(sub_id)
SUBPARTITIONS 4
(PARTITION d001 VALUES LESS THAN (736250) ENGINE = InnoDB,
PARTITION d002 VALUES LESS THAN (736260) ENGINE = InnoDB,
PARTITION d003 VALUES LESS THAN (736270) ENGINE = InnoDB,
PARTITION d004 VALUES LESS THAN (736280) ENGINE = InnoDB,
PARTITION d005 VALUES LESS THAN (736290) ENGINE = InnoDB,
PARTITION d006 VALUES LESS THAN (736300) ENGINE = InnoDB,
PARTITION d007 VALUES LESS THAN (736310) ENGINE = InnoDB,
PARTITION d008 VALUES LESS THAN (736320) ENGINE = InnoDB,
PARTITION d009 VALUES LESS THAN (736330) ENGINE = InnoDB,
PARTITION d010 VALUES LESS THAN (736340) ENGINE = InnoDB,
PARTITION d011 VALUES LESS THAN MAXVALUE ENGINE = InnoDB)
However I believe this will hurt performance by requiring a read per each partition every time i reference (sub_id,cmd_id):
EXPLAIN PARTITIONS SELECT * FROM tbl WHERE sub_id='107' AND cmd_id='2246806';
+----+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+------+---------+-------------+------+-------------+
| id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+------+---------+-------------+------+-------------+
| 1 | SIMPLE | optz | d001_d001sp1,d002_d002sp1,d003_d003sp1,d004_d004sp1,d005_d005sp1,d006_d006sp1,d007_d007sp1,d008_d008sp1,d009_d009sp1,d010_d010sp1,d011_d011sp1 | ref | uk1 | uk1 | 38 | const,const | 11 | Using where |
+----+-------------+-------+------------------------------------------------------------------------------------------------------------------------------------------------+------+---------------+------+---------+-------------+------+-------------+
So the crux of my problem is:
- If i partition by D date's then its D-1 Extra lookups
- If i partition by S sub_id's then I cant easily DROP partitions on Date
- I don't see how I could use COLUMNS Partitioning
Here are some notes/caveats:
- INSERTing about 5-20million rows/day
- Equal distribution of read,write,insert - but always single row
- Only need to keep past ~month of data
- A replication system is in place
- The hardware involved is expensive
- I didn't want to include the
date
column in the unique key but then I couldn't partition on it, so the code ensures (sub_id,cmd_id) is unique across dates as it stands.
Thanks!