Data Structure:
Main table (~500 million rows):
CREATE TABLE USER_DETAILS (
  `visitor_id`    varchar(50) DEFAULT NULL,
  `partition_id`  INT,
  `related_text`  longtext,
  `creation_date` timestamp DEFAULT CURRENT_TIMESTAMP,
  PRIMARY KEY (`visitor_id`, `partition_id`)
) ENGINE=TokuDB
PARTITION BY LIST (partition_id) (
  PARTITION p0 VALUES IN (0) ENGINE = TokuDB,
  PARTITION p1 VALUES IN (1) ENGINE = TokuDB,
  PARTITION p2 VALUES IN (2) ENGINE = TokuDB,
  PARTITION p3 VALUES IN (3) ENGINE = TokuDB,
  PARTITION p4 VALUES IN (4) ENGINE = TokuDB,
  PARTITION p5 VALUES IN (5) ENGINE = TokuDB,
  PARTITION p6 VALUES IN (6) ENGINE = TokuDB,
  PARTITION p7 VALUES IN (7) ENGINE = TokuDB,
  PARTITION p8 VALUES IN (8) ENGINE = TokuDB,
  PARTITION p9 VALUES IN (9) ENGINE = TokuDB
);
Intermediate table (10–20 million rows):
CREATE TABLE USER_DETAILS_INTERMEDIATE (
  `id`           bigint(20) NOT NULL AUTO_INCREMENT,
  `visitor_id`   varchar(50) DEFAULT NULL,
  `partition_id` int(11) DEFAULT NULL,
  `related_text` longtext,
  PRIMARY KEY (`id`)
);
Problem:
Transferring data from the intermediate table to the main table takes too much time.
I tried the following solutions:
Solution 1 (a single bulk statement; note the target column is `related_text`, matching the schema above):
REPLACE INTO USER_DETAILS (visitor_id, partition_id, related_text)
SELECT visitor_id, partition_id, related_text
FROM USER_DETAILS_INTERMEDIATE;
Solution 2 (running the following statement in a loop, 10,000 rows per iteration):
REPLACE INTO USER_DETAILS (visitor_id, partition_id, related_text)
SELECT visitor_id, partition_id, related_text
FROM USER_DETAILS_INTERMEDIATE
WHERE id BETWEEN var_min_id AND var_max_id;
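For clarity, a minimal sketch of how that loop is driven, written as a MySQL stored procedure (the procedure name and the batch bookkeeping are illustrative, not my exact code; it assumes ids are roughly consecutive, so gaps only make a batch smaller):

-- Illustrative batching loop: walks the intermediate table in
-- windows of 10,000 ids and REPLACEs each window into the main table.
DELIMITER //
CREATE PROCEDURE transfer_in_batches()
BEGIN
  DECLARE var_min_id BIGINT DEFAULT 1;
  DECLARE var_max_id BIGINT;
  SELECT MAX(id) INTO var_max_id FROM USER_DETAILS_INTERMEDIATE;
  WHILE var_min_id <= var_max_id DO
    REPLACE INTO USER_DETAILS (visitor_id, partition_id, related_text)
    SELECT visitor_id, partition_id, related_text
    FROM USER_DETAILS_INTERMEDIATE
    WHERE id BETWEEN var_min_id AND var_min_id + 9999;
    SET var_min_id = var_min_id + 10000;
  END WHILE;
END //
DELIMITER ;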
Both of the above approaches take too much time.
Is there another way to improve this?