9

I'm trying to find a way to do a conditional DELETE on an InnoDB table which contains millions of records, without locking it (thus not bringing the website down).

I've tried to find information on mysql.com, but to no avail. Any tips on how to proceed?

  • Are there a lot of indexes on this table? – Nick Craver Feb 13 '10 at 14:51
  • Index the columns which will handle conditions. Then perform the delete when the traffic to the site is low (@ night probably). –  Feb 13 '10 at 14:54
  • see also http://stackoverflow.com/questions/23193761/delete-operation-locks-whole-table-in-innodb which mentions gap locking – rogerdpack Sep 05 '16 at 18:08
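Following up on the indexing comment above, here is a minimal sketch of what that could look like, assuming the delete condition filters on a created_time column (the table and column names are purely illustrative):

-- Illustrative only: index the column(s) used in the DELETE's WHERE clause,
-- so InnoDB can locate and lock only the matching rows instead of scanning.
ALTER TABLE tab ADD INDEX idx_created_time (created_time);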

3 Answers

8

I don't think it is possible to delete without locking. That said, I don't think locking the record you want to delete is a problem. What would be a problem is locking other rows.

I found some information on that subject here: http://dev.mysql.com/doc/refman/5.0/en/innodb-locks-set.html

What I would suggest is to try and do a million single-row deletes. If you do all of those in a single transaction, performance should not suffer too much. So you would get something like:

START TRANSACTION;

DELETE FROM tab WHERE id = 1;
..
..
DELETE FROM tab WHERE id = x;

COMMIT;

You can generate the required statements by doing something like:

SELECT CONCAT('DELETE FROM tab WHERE id = ', id)
FROM   tab
WHERE  <some intricate condition that selects the set you want to delete>
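One way to actually run the generated statements is to write them to a file and source it from the mysql client. This is only a sketch, assuming the server's secure_file_priv setting allows writing to /tmp and that you append a trailing semicolon to each generated statement:

SELECT CONCAT('DELETE FROM tab WHERE id = ', id, ';')
FROM   tab
WHERE  <some intricate condition that selects the set you want to delete>
INTO OUTFILE '/tmp/deletes.sql';

-- then, from the mysql client, wrap the generated file in one transaction:
-- START TRANSACTION;
-- SOURCE /tmp/deletes.sql;
-- COMMIT;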

The advantage of this method over doing:

DELETE FROM tab 
WHERE  <some intricate condition that selects the set you want to delete>

is that with the first approach you only ever lock the records you're actually deleting, whereas with the second approach you run the risk of locking other records that happen to be in the same range as the rows you are deleting.

Roland Bouman
  • +1, promising solution! Have you ever used this approach in practice? – Frunsi Feb 13 '10 at 15:44
  • @frunsi: I honestly admit that I have not. Typically, I don't have much of an issue with deleting a million rows, but then again the systems I manage are not being hit by many users at the same time. – Roland Bouman Feb 13 '10 at 15:58
5

If it fits your application, then you could limit the number of rows to delete and set up a cronjob to repeat the deletion. E.g.:

DELETE FROM tab WHERE .. LIMIT 1000

I found this to be a good compromise in a similar scenario.
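As a sketch of what the repeated job could look like, a MySQL scheduled event can take the place of an external cronjob (this assumes the event scheduler is enabled and that the WHERE condition is backed by an index; the table name, condition, interval and batch size are illustrative):

-- Sketch only: requires SET GLOBAL event_scheduler = ON;
CREATE EVENT IF NOT EXISTS purge_old_rows
ON SCHEDULE EVERY 5 MINUTE
DO
  DELETE FROM tab
  WHERE created_time < '2018-01-01 00:00:00'
  LIMIT 1000;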

Frunsi
  • I think the problem with this approach is that the LIMIT is evaluated after isolating the set. So the `WHERE` is applied, which could lock rows that happen to be in the same range as the rows you are deleting. The fact that you then only delete a limited number does not change the fact that the records have already been locked. What this method does do is keep the transaction size small, which is also good. But it does not prevent spurious locking as far as I understand. – Roland Bouman Feb 13 '10 at 15:40
  • @Roland: You could be right. This solution may not be ideal, but it works fine in practice (a limited delete is fast, so the locks are held for very short timespans, and the limit parameter can be tuned to the real-life data). I did some benchmarks before using this approach, and the accumulated time of all the limited deletes finished in less time than a single non-limited one (but at the time this was MySQL 3.x and MyISAM tables). It still works, but a benchmark on current MySQL and InnoDB tables would shed some light on the details. – Frunsi Feb 13 '10 at 15:49
  • Yes, I think it's safe to say that the scenario you describe, a table-locking MyISAM engine on a more than nine-year-old version of the product, as compared to the InnoDB engine, which features both multi-version concurrency control and row-level locking, might indeed be somewhat different :p – Roland Bouman Feb 13 '10 at 15:57
1

I use a stored procedure that deletes the rows one at a time:

DELIMITER //

CREATE PROCEDURE delete_last_year_data()
BEGIN
  -- Declarations must come in this order: variables, then cursors, then handlers.
  DECLARE del_row VARCHAR(255);
  DECLARE done INT DEFAULT 0;

  -- Generate one single-row DELETE statement per matching id.
  DECLARE del_rows CURSOR FOR
    SELECT CONCAT('DELETE FROM table_name WHERE id = ', id)
    FROM   table_name
    WHERE  created_time < '2018-01-01 00:00:00';

  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  OPEN del_rows;

  REPEAT
    FETCH del_rows INTO del_row;
    IF NOT done THEN
      -- Run the generated DELETE through a prepared statement.
      SET @del = del_row;
      PREPARE stmt FROM @del;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
    END IF;
  UNTIL done END REPEAT;

  CLOSE del_rows;
END //

DELIMITER ;
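Once created, the cleanup can be invoked manually or from a scheduled job; a minimal usage sketch:

CALL delete_last_year_data();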

Gavin Zhao