
I'm trying to delete 267 records out of about 40 million. The query looks like:

delete from pricedata
where
pricedate > '20120413'

pricedate is a char(8) field.

I know about adjusting innodb_buffer_pool_size, but if I can do

select * from pricedata
where
pricedate > '20120413'

and get 267 records (and that's all there are), no errors, why does it choke on the delete?

And if adjusting innodb_buffer_pool_size doesn't work, what should I do?

asked by davej (question later edited by Brian Tompsett - 汤莱恩)

3 Answers


It seems that you don't have an index on pricedate (or MySQL does not use this index for some reason).

With REPEATABLE READ (the default transaction isolation level), InnoDB places shared locks on the records read and filtered out by the query and it seems you don't have enough space for 40M locks.

To work around this problem use any of these solutions:

  1. Create the index on pricedate if it's not there (may take time)

  2. Break your query into smaller chunks:

    DELETE
    FROM    pricedata
    WHERE   pricedate > '20120413'
            AND id BETWEEN 1 AND 1000000;

    DELETE
    FROM    pricedata
    WHERE   pricedate > '20120413'
            AND id BETWEEN 1000001 AND 2000000;


    etc. (change the id ranges as needed). Note that each statement should be run in its own transaction (don't forget to commit after each statement if AUTOCOMMIT is off).

  3. Run the DELETE query with READ COMMITTED transaction isolation level. It will make InnoDB lift locks from the records as soon as they are read. This will not work if you are using binary log in statement mode and don't allow binlog-unsafe queries (this is the default setting).
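Options 1 and 3 above can be sketched in SQL. This is a minimal illustration, not the asker's exact statements; the index name `idx_pricedate` is assumed, and the table/column names come from the question:

```sql
-- Option 1: index pricedate so the DELETE only has to lock matching rows.
-- On a 40M-row table building this index can itself take a long time.
CREATE INDEX idx_pricedate ON pricedata (pricedate);

-- Option 3: lower the isolation level for this session only, so InnoDB
-- releases locks on non-matching rows as soon as they are read.
SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED;
DELETE FROM pricedata WHERE pricedate > '20120413';
```

`SET SESSION` affects only the current connection, so other sessions keep the default REPEATABLE READ behavior.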

– Quassnoi
  • @davej: on a table of that size this is going to take at least `40` minutes, and if you have lots of columns (or long `TEXT` or binary columns) this may take hours or days. If you are not using `InnoDB 1.1` (or the plugin), it is going to take even more. – Quassnoi Apr 20 '12 at 21:38
  • Thanks again for your informative input. There are a total of 8 fields; "pricedate", is the only non-numeric. There are, tho, 2 BIGINT(20)'s, so I've no idea how that'll factor in. As it is, I have to go out for a few hours anyway, so I'll let it run. Good thing it's on a test db! – davej Apr 20 '12 at 21:48
  • @davej: with a decent server, it should be a matter of an hour or two. – Quassnoi Apr 20 '12 at 22:04
  • it actually never finished after 6 hours, so I killed it. I then tried recreating the table with an index, then loading the table using "LOAD DATA INFILE..." Without the index, the file takes under 40 minutes to run. With the index, it got to about 40% of the load in 2hrs 15 minutes, slowing as it went. I'm using the char format for the date field because it works fine in my app, and is easier to load. Now I'm wondering if I should change it to an actual date. – davej Apr 21 '12 at 17:16
  • I'm assuming because numeric fields are easier to index? Or easier to use as criteria for delete queries? Would appreciate your insight. – davej Apr 21 '12 at 17:55
  • Well, I tried changing the pricedate field to DATE (which went fine), but when I tried to add the index, it was still working over 2.5 hours later. So I killed it, and tried doing the delete with `READ COMMITTED` transaction isolation level, and got the same error, i.e., "The total number of locks...". I am now going to try setting innodb_buffer_pool_size to 256M. Will report back. – davej Apr 21 '12 at 21:12
  • WOOHOO! Changing `innodb_buffer_pool_size` to 256M did the trick, and the delete only took 2 min! Well, it's been a learning experience... Thanks, Quassnoi, for the great input. – davej Apr 21 '12 at 21:23
  • @davej i tried the mentioned way but it is not working.. Is there is any other way to do this ? – Pranav MS Aug 04 '17 at 07:29
  • @PranavMS I never revisited this problem once the solution worked for me. Are you sure the innodb_buffer_pool_size solution didn't work? Perhaps try restarting the server. – davej Aug 05 '17 at 13:57

(A late answer, but always good to have when people find this issue via Google.)

A solution that avoids altering innodb_buffer_pool_size or creating an index is to limit the number of rows deleted per statement.

So, in your case, `DELETE FROM pricedata WHERE pricedate > '20120413' LIMIT 100;` for example. This removes 100 rows and leaves 167 behind, so you can run the same query again and delete another 100. The last 67 are tricky: when the number of matching rows left is less than the given limit, you can again end up with the error about the number of locks, probably because the server keeps scanning for more matching rows to fill up to the 100. In that case, use LIMIT 67 to delete the last part. (Of course, you could also use LIMIT 267 from the beginning.)

And for those who like to script... a nice example I used in a bash script to cleanup old data :

   # Assumes ${MYSQL}, ${MYSQL_USER}, ${MYSQL_PWD}, ${DB} and ${LOGGER}
   # are set earlier in the script.
   # Count the number of rows left to be deleted
   QUERY="select count(*) from pricedata where pricedate > '20120413';"
   AMOUNT=`${MYSQL} -u ${MYSQL_USER} -p${MYSQL_PWD} -e "${QUERY}" ${DB} | tail -1`
   ERROR=0
   while [ ${AMOUNT} -gt 0 -a ${ERROR} -eq 0 ]
   do
      ${LOGGER} "   ${AMOUNT} rows left to delete"
      if [ ${AMOUNT} -lt 1000 ]
      then
         LIMIT=${AMOUNT}
      else
         LIMIT=1000
      fi
      # Delete at most ${LIMIT} rows per statement to keep lock counts low
      QUERY="delete low_priority from pricedata where pricedate > '20120413' limit ${LIMIT};"
      ${MYSQL} -u ${MYSQL_USER} -p${MYSQL_PWD} -e "${QUERY}" ${DB}
      STATUS=$?
      if [ ${STATUS} -ne 0 ]
      then
         ${LOGGER} "Cleanup failed for pricedata"
         ERROR=1
      fi
      QUERY="select count(*) from pricedata where pricedate > '20120413';"
      AMOUNT=`${MYSQL} -u ${MYSQL_USER} -p${MYSQL_PWD} -e "${QUERY}" ${DB} | tail -1`
   done
– user3472174

What worked: changing innodb_buffer_pool_size to 256M (see the comments under Quassnoi's answer).
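For reference, a sketch of that setting as a server config fragment; the 256M value is the one from the comments above, and on MySQL versions of that era the change required a server restart (dynamic resizing via `SET GLOBAL innodb_buffer_pool_size` only arrived in MySQL 5.7.5):

```ini
# my.cnf (or my.ini on Windows)
[mysqld]
innodb_buffer_pool_size = 256M
```

A larger buffer pool gives InnoDB more room for its lock structures as well as cached pages, which is why this also made the big DELETE succeed.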

– davej