I am hoping someone can explain why two hours of data purging purportedly results in a mere 32 KB reduction in data usage on my mysql instance. Here are my details:
I have a mysql database (running on Amazon RDS) from which I am attempting to purge data. I am doing this in order to avoid running out of storage space, because Amazon caps you at 1TB, and if we take no action we will eventually hit this limit.
I am using this command to calculate the size of my tables and indices:
select * from information_schema.tables;
There are two InnoDB tables in particular that consume the majority of my storage. I have a process that is iterating through my largest table deleting records. At time t = 0 I ran the above SQL query and got the following results for data length and index data length:
- Data length: 56431116288
- Index data length: 74233151488
Two hours later, after continuously running my database cleaner process, I ran the above SQL statement and got following:
- Data length: 56431083520
- Index data length: 74126147584
That basically means I shaved off 32 KB of table data and 102 MB of index data.
The reduction in the index makes sense. The reduction in the table data is insanely small. It is not possible that other data was inserted during this time, because I am running this test on a backup copy of my database (one of the nice things about RDS is you can get a complete replicate of your DB up and running to run experiments on, such as this one). I also confirmed that the AUTO_INCREMENT value was identical both times.
Can someone explain why data length has not moved much at all? Is data length just a really quick and dirty approximation? Is there some other compaction step mysql will eventually do? Or have I totally mis-interpreted the use of these fields?
Thank you!
Update
I may have figured it out -- at time t=0
- DATA_FREE = 77594624
Four hours laters,
- DATA_FREE = 256901120
That means I have increased DATA_FREE by appx 171MB.
Does this therefore mean that if I insert another 171MB, it will come out of the DATA_FREE pool, and so my Data length will not increase?
Put another way, let's say I start with a new InnoDB table and insert 20 GB of data (assuming that 20 GB incorporates all the excess InnoDB stuff, I realize data stored in InnoDB is larger than MyISAM), then I delete all data, then I insert 10 GB of data. When I run select * from information_schema.tables, I should see Data length of 10 GB and Data free of 10 GB, right? I should not expect to see Data length of 30 GB/Data free of 0 GB nor should I expect to see data length of 10 GB/data free of 10 GB?
Update 2
This post on Stack Overflow would also seem to confirm that my analysis.