I've recently been tasked with deleting and reinserting approximately 15 million rows in a MyISAM table of about 150 million rows, while keeping the table/DB available for inserts and reads the whole time.
To do this, I started a process that copies the data in small chunks via INSERT ... SELECT statements into a cloned table with the same structure, sleeping between runs so as not to overload the server; it skips over the data to be deleted and inserts the replacement data in its place.
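For illustration, each chunk looked roughly like this (table names, the chunk size, and the rows_to_delete helper table are placeholders, not my actual schema):

    -- copy one chunk; @last_id is the high-water mark left by the previous chunk
    INSERT INTO source_table_clone
    SELECT *
    FROM source_table
    WHERE id > @last_id
      AND id <= @last_id + 10000
      AND id NOT IN (SELECT id FROM rows_to_delete);

    -- throttle before the next chunk so the server isn't overloaded
    DO SLEEP(2);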
This way, while the cloned table was being built (which took 8+ hours), new data kept coming into the source table. At the end I just had to sync the clone with the data added during those 8+ hours and rename the tables.
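The cutover was essentially this (again with placeholder names; @cutoff_id stands for the highest id that existed when the build started):

    -- copy the rows that arrived during the 8+ hour build
    INSERT INTO source_table_clone
    SELECT * FROM source_table
    WHERE id > @cutoff_id;

    -- atomic swap of the two tables
    RENAME TABLE source_table       TO source_table_old,
                 source_table_clone TO source_table;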
Everything went fine except for one thing: the cardinality of the indexes on the cloned table is way off, and the execution plans for queries against it are awful (some went from a few seconds to 30+ minutes).
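The problem shows up directly in the Cardinality column:

    SHOW INDEX FROM source_table_clone;  -- cardinality is way off vs. the original table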
I know this can be fixed by running ANALYZE TABLE on it, but that also takes a long time (I'm currently running one on a slave server and it has been executing for more than 10 hours now), and I can't afford to take the table offline for writes while the analysis runs. It would also stress the server's I/O, putting pressure on it and slowing it down.
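On the slave I'm running something along these lines (NO_WRITE_TO_BINLOG keeps the statement from replicating further downstream):

    ANALYZE NO_WRITE_TO_BINLOG TABLE source_table_clone;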
Can someone explain why building a MyISAM table via INSERT ... SELECT statements results in a table with such poor internal index statistics?
Also, is there a way to build the table incrementally and end up with the indexes in good shape?
Thanks in advance.