Currently, I have a Server A that is holding about 25 billion records (several terabytes size) with the following structure:
CREATE TABLE `table_x` (
`id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`a1` char(64) DEFAULT NULL,
`b1` int(11) unsigned DEFAULT NULL,
`c1` tinyint(1) DEFAULT NULL,
`LastUpdate` timestamp NOT NULL DEFAULT current_timestamp() ON UPDATE current_timestamp(),
PRIMARY KEY (`id`),
UNIQUE KEY `idxb1a1` (`b1`,`a1`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
As the data is growing too big, I am trying to migrate these records into Server B with the same schema structure using bulk insert of 10K records (e.g INSERT INTO yourtable VALUES (1,2), (5,5), ...;
) in asecending order by the column id
.
Initially, the insertion rate was really quick - however, it gradually slowed down and now takes about 10 secs to bulk insert 10K records (i.e 1K/sec). I am guessing its because it needs to update the indexes after every insertion.
I have done the following configuration on Server B before starting the migration :
innodb_flush_log_at_trx_commit=2
SET unique_checks=0;
autocommit=0
and commit every 50K
Server B hardware configuration :
- 300GB+ ram (240GB used for
innodb_buffer_pool_size
) - SSDs for data storage
Server B my.cnf :
innodb_buffer_pool_size=240G
innodb_buffer_pool_instances=64
innodb_page_cleaners=32
innodb_purge_threads=1
innodb_read_io_threads=64
innodb_write_io_threads=64
innodb_use_native_aio=0
innodb_flush_log_at_trx_commit=2
innodb_doublewrite=0
innodb_autoinc_lock_mode=2
innodb_file_per_table=1
max_connections=10000
skip_name_resolve=1
tmp_table_size=134217728
max_heap_table_size=134217728
back_log=1000
wait_timeout=900
innodb_log_buffer_size=32M
innodb_log_file_size=768M
Is there anything else I can do or configure to speed up the insertion?
Update #1 :
The reason why I am trying to migrate the records over to Server B is because I would like to break/shard the data into few servers (to use MariaDB SPIDER engine sharding solution). As such, solutions that involved sending a snapshot of the data or directly copying over the data doesn't seem viable.