I'm building a table that will be used for read-only workloads, which is why I went with MyISAM instead of InnoDB (though I'm open to suggestions). However, this table needs to be extremely large: each row is only about 12-14 bytes, but the table will contain hundreds of millions to billions of rows.
The table looks like this:
CREATE TABLE `tb1` (
  `wid` int(11) NOT NULL DEFAULT '0',
  `pid` int(11) NOT NULL DEFAULT '0',
  `hit` tinyint(3) unsigned NOT NULL DEFAULT '0',
  KEY `wid` (`wid`,`hit`,`pid`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
I am loading all the data from text files in chunks using LOAD DATA LOCAL INFILE.
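For reference, each chunk load looks roughly like this (the file path and terminators are placeholders, not my exact values):

```sql
-- Hypothetical chunk load; path and delimiters adjusted to match the input files
LOAD DATA LOCAL INFILE '/path/to/chunk_001.txt'
INTO TABLE tb1
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(wid, pid, hit);
```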
The first ~20M rows load in about 30 seconds. But when I load a 100M-row chunk, ten minutes later (when I cancel the query) the table has only 38M rows total, i.e. only ~18M rows in 10 minutes.
It appears the load time grows much faster than linearly as the table gets bigger. Why is that?
What can I do to speed up LOAD DATA INFILE when I have an index that keeps increasing?
Are there benefits to using InnoDB instead of MyISAM in terms of LOAD DATA INFILE speed?
Will it be quicker to load all the data with no indexes and then build the index afterwards?
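To clarify what I mean by building the index afterwards: with MyISAM I believe non-unique index maintenance can be deferred around the bulk load, something like the sketch below (the buffer size is illustrative, not a tuned value):

```sql
-- Defer non-unique index updates during the bulk load (MyISAM only)
ALTER TABLE tb1 DISABLE KEYS;

-- Give the index rebuild room to sort; 256M is an illustrative value
SET SESSION myisam_sort_buffer_size = 256 * 1024 * 1024;

-- Load every chunk with indexing deferred
LOAD DATA LOCAL INFILE '/path/to/chunk_001.txt' INTO TABLE tb1
FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n' (wid, pid, hit);
-- ... repeat for remaining chunks ...

-- Rebuild the index in one pass at the end
ALTER TABLE tb1 ENABLE KEYS;
```

Is this the right approach, or is it better to drop the KEY entirely and re-add it with ALTER TABLE after all chunks are in?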
Thanks for your input.