Deleting rows in a table a chunk at a time

Question

I have a single-column table Ids, which whose column ID is of type uniqueidentifier. I have another table MyTable which has an ID column as well as many other columns. I would like to delete rows from MyTable 1000 at a time, where the ID from MyTable matches an ID in Ids.

WHILE 1 = 1 BEGIN
    DELETE t FROM (SELECT TOP 1000 ID FROM Ids) d INNER JOIN MyTable t ON d.ID = t.ID;
    IF @@ROWCOUNT < 1 BREAK;
    WAITFOR DELAY @sleeptime; -- some time to be determined later
END

This doesn't seem to work though. What should the statement actually be?

Mark · Answer 1 · 2012-07-03T21:17:07.233

11

Try this:

DECLARE @BatchSize INT
SET @BatchSize = 100000

WHILE @BatchSize <> 0
BEGIN 
   DELETE TOP (@BatchSize) t
    FROM [MyTable] t
    INNER JOIN [Ids] d ON d.ID=t.ID
    WHERE ????
   SET @BatchSize = @@rowcount 
END

Has the benefit that the only variable you need to create is the size, as it uses it for the WHILE loop check. When the delete gets below 100000, it will set the variable to that number, on the next pass there will be nothing to delete and the rowcount will be 0... and so you exit. Clean, simple, and easy to understand. Never use a CURSOR when WHILE will do the trick!

edited Jul 03 '12 at 21:17

answered Jul 03 '12 at 21:03

Mark

1,058
6
13

1

There's usually one extra iteration here though. How about while 1=1 and break if @@rowCount < @BatchSize at the end. – crokusek May 22 '14 at 22:54
Not the worst suggestion, but I hate doing a no-op while loop and then an if to escape. I guess you could avoid the extra interation most of the time with `WHILE @BatchSize = 100000`. you'd only have an extra iteration if you had an exact multiple of 100000. We mostly use this with a count of 4500 to avoid table-locking when purging old data. one extra loop when we're doing a few hundred never hurt. Nice catch tho, @crokusek! – Mark May 24 '14 at 03:38

score 6 · Accepted Answer · edited May 14 '22 at 20:33

6

Try

Delete from MyTable 
Where ID in 
   (select top 1000 t.ID 
    from Ids t inner 
    join MyTable d on d.Id = t.Id)

edited May 14 '22 at 20:33

ΩmegaMan

29,542
12
100
122

answered Jul 03 '12 at 20:59

Duncan Howe

2,965
19
18

Will this perform well? The database I'm working on will be quite large. – Nick Jul 03 '12 at 21:00
That is not a straight forward question to answer. I would make sure your ID is indexed to start with. But generally, this should work efficiently. – northpole Jul 03 '12 at 21:03
Actually, won't this keep selecting the same top 1000 ID repeatedly from Ids, since I'm only deleting from MyTable? – Nick Jul 03 '12 at 21:04
well, once the first 1000 are deleted, they will no longer be selected in the next iteration. – northpole Jul 03 '12 at 21:05
If the Id columns are indexed it would probably be OK, but I would opt for E J Brennan's answer for safety – Duncan Howe Jul 03 '12 at 21:05
@northpole, but they are being deleted from MyTable, not from Ids. won't Ids not change from iteration to iteration? – Nick Jul 03 '12 at 21:07
@Nick - ID is the attribute, MyTable is the table, yes. So when you delete, the next 'select top' statement will not have those previously deleted IDs in the resultset. – northpole Jul 03 '12 at 21:09
I have just edited to include an inner join to ensure that you are only deleting Ids that match up. – Duncan Howe Jul 03 '12 at 21:09

E.J. Brennan · Answer 3 · 2012-07-03T21:10:21.183

2

You could also try:

set rowcount 1000
delete from mytable where id in (select id from ids)
set rowcount 0  --reset it when you are done.

http://msdn.microsoft.com/en-us/library/ms188774.aspx

edited Jul 03 '12 at 21:10

answered Jul 03 '12 at 21:01

E.J. Brennan

45,870
7
88
116

true, but it applies to the version that comes after SQL-2012, so likely will work for some time. – E.J. Brennan Jul 03 '12 at 21:05
2

At least remind the user to clear that setting... if they reuse a terminal session or have a large script things will go really wrong from this step forward! – Mark Jul 03 '12 at 21:08

Stuart Ainsworth · Answer 4 · 2012-07-03T21:39:33.590

0

WHILE EXISTS (SELECT TOP 1 * FROM MyTable mt JOIN IDs i ON mt.ID = t.ID)
BEGIN

DELETE TOP (1000) FROM MyTable
FROM MyTable mt JOIN IDS i ON mt.ID = i.ID

--you can wait if you want, but probably not necessary
END

--Sorry for the quick post; was in a hurry :)

The DELETE statement in SQL Server supports two FROM clauses; the first FROM identifies the table that is having rows deleted, and the second FROM clause is used for JOINS.

See: http://msdn.microsoft.com/en-us/library/ms189835.aspx

edited Jul 03 '12 at 21:39

answered Jul 03 '12 at 21:06

Stuart Ainsworth

12,792
41
46

edit your answer, I don't think is correct but the approach is right. WHY? repetition of `FROM MyTable` – codingbiz Jul 03 '12 at 21:09
Also `SELECT TOP 1 FROM` is missing something. – Aaron Bertrand Jul 03 '12 at 21:15

Deleting rows in a table a chunk at a time

4 Answers4