I have two tables in my db that have millions of rows now, the selection and insertion is getting slower and slower.
I am using spring+hibernate+mysql 5.5 and read about the sharding as well as partitioning the table and like the idea of partitioning my tables,
My current Db structure is like
CREATE TABLE `user` (
`id` BIGINT(20) NOT NULL,
`name` VARCHAR(255) DEFAULT NULL,
`email` VARCHAR(255) DEFAULT NULL,
`location_id` bigint(20) default NULL,
`updated_time` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `FK3DC99772C476E06B` (`location_id`),
CONSTRAINT `FK3DC99772C476E06B` FOREIGN KEY (`location_id`) REFERENCES `places` (`id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8
CREATE TABLE `friends` (
`id` BIGINT(20) NOT NULL AUTO_INCREMENT,
`user_id` BIGINT(20) DEFAULT NULL,
`friend_id` BIGINT(20) DEFAULT NULL,
`updated_time` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `unique_friend` (`user_id`,`friend_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8
Now I am testing how to better use partitioning, for user table following I thought will be good based on by usage.
CREATE TABLE `user_partition` (
`id` BIGINT(20) NOT NULL,
`name` VARCHAR(255) DEFAULT NULL,
`email` VARCHAR(255) DEFAULT NULL,
`location_id` bigint(20) default NULL,
`updated_time` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `FK3DC99772C476E06B` (`location_id`)
) ENGINE=INNODB DEFAULT CHARSET=utf8
PARTITION BY HASH(id DIV 100000)
PARTITIONS 30;
I created a procedures to load data in two table and check the performance of the two tables
DELIMITER //
CREATE PROCEDURE load_partition_table()
BEGIN
DECLARE v INT DEFAULT 0;
WHILE v < 1000000
DO
INSERT INTO user_partition (id,NAME,email)
VALUES (v,CONCAT(v,' name'),CONCAT(v,'@yahoo.com')),
(v+1,CONCAT(v+1,' name'),CONCAT(v+1,'@yahoo.com')),
(v+2,CONCAT(v+2,' name'),CONCAT(v+2,'@yahoo.com')),
(v+3,CONCAT(v+3,' name'),CONCAT(v+3,'@yahoo.com')),
(v+4,CONCAT(v+4,' name'),CONCAT(v+4,'@yahoo.com')),
(v+5,CONCAT(v+5,' name'),CONCAT(v+5,'@yahoo.com')),
(v+6,CONCAT(v+6,' name'),CONCAT(v+6,'@yahoo.com')),
(v+7,CONCAT(v+7,' name'),CONCAT(v+7,'@yahoo.com')),
(v+8,CONCAT(v+8,' name'),CONCAT(v+8,'@yahoo.com')),
(v+9,CONCAT(v+9,' name'),CONCAT(v+9,'@yahoo.com'))
;
SET v = v + 10;
END WHILE;
END
//
CREATE PROCEDURE load_table()
BEGIN
DECLARE v INT DEFAULT 0;
WHILE v < 1000000
DO
INSERT INTO user (id,NAME,email)
VALUES (v,CONCAT(v,' name'),CONCAT(v,'@yahoo.com')),
(v+1,CONCAT(v+1,' name'),CONCAT(v+1,'@yahoo.com')),
(v+2,CONCAT(v+2,' name'),CONCAT(v+2,'@yahoo.com')),
(v+3,CONCAT(v+3,' name'),CONCAT(v+3,'@yahoo.com')),
(v+4,CONCAT(v+4,' name'),CONCAT(v+4,'@yahoo.com')),
(v+5,CONCAT(v+5,' name'),CONCAT(v+5,'@yahoo.com')),
(v+6,CONCAT(v+6,' name'),CONCAT(v+6,'@yahoo.com')),
(v+7,CONCAT(v+7,' name'),CONCAT(v+7,'@yahoo.com')),
(v+8,CONCAT(v+8,' name'),CONCAT(v+8,'@yahoo.com')),
(v+9,CONCAT(v+9,' name'),CONCAT(v+9,'@yahoo.com'))
;
SET v = v + 10;
END WHILE;
END
//
Results were surprizing, insert/select in non partition table giving better results.
mysql> select count(*) from user_partition;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.40 sec)
mysql> select count(*) from user;
+----------+
| count(*) |
+----------+
| 1000000 |
+----------+
1 row in set (0.00 sec)
mysql> call load_table();
Query OK, 10 rows affected (20.31 sec)
mysql> call load_partition_table();
Query OK, 10 rows affected (21.22 sec)
mysql> select * from user where id = 999999;
+--------+-------------+------------------+---------------------+
| id | name | email | updated_time |
+--------+-------------+------------------+---------------------+
| 999999 | 999999 name | 999999@yahoo.com | 2012-11-27 08:06:54 |
+--------+-------------+------------------+---------------------+
1 row in set (0.00 sec)
mysql> select * from user_no_part where id = 999999;
+--------+-------------+------------------+---------------------+
| id | name | email | updated_time |
+--------+-------------+------------------+---------------------+
| 999999 | 999999 name | 999999@yahoo.com | 2012-11-27 08:03:14 |
+--------+-------------+------------------+---------------------+
1 row in set (0.00 sec)
So two question
1) Whats the best way to partition user
table so that inserts and selects also become fast and removing FOREIGN KEY on location_id
is correct? I know partition can be good only if we access on the base of partition key, In my case I want to read the table only by id. why inserts are slower in partition table?
2) What the best way to partition friend
table as I want to partition friends on the bases of user_id
as want to place all user friends in same partition and always access it using a user_id. Should I drop the primary key on friend.id or add the user_id in primary key?