Question

What is the best way to update a column of a table with tens of millions of rows?

1) I saw creating a new table and renaming the old one when finished

2) I saw updating in batches using a temp table

3) I saw a single transaction (I don't like this one, though)

4) I've never heard a cursor solution suggested for a problem like this, and I don't think it's worth trying

5) I read about loading data from a file (using BCP), but I haven't read whether the performance is better or not. It wasn't clear whether it is just for copying, or whether it would allow joining a big table with something and then bulk copying the result.
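
For what it's worth, the bcp round trip I have in mind for option 5 would look roughly like the sketch below; the server, database, table, and file names are placeholders, and the queryout mode is what would allow joining against another table before exporting:

    REM export the recalculated rows; queryout accepts an arbitrary SELECT, including joins
    bcp "SELECT t.Id, s.NewValue FROM MyDb.dbo.BigTable t JOIN MyDb.dbo.Source s ON s.Id = t.Id" queryout staged.dat -n -S MyServer -T

    REM bulk load the file into a staging table in batches, with a table lock to allow minimal logging
    bcp MyDb.dbo.BigTable_Staging in staged.dat -n -S MyServer -T -b 50000 -h "TABLOCK"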

I would really like to have some advice here.

The priority is performance.

At the moment I'm testing solution 2) and exploring solution 5).
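
The batched approach I'm testing for solution 2) looks roughly like the sketch below. It assumes the new values are staged in a temp table keyed on the primary key and that the big table has a reasonably dense integer clustered key; all names are placeholders:

    -- step through the clustered key in fixed-size key ranges so each transaction stays small
    DECLARE @batch INT = 50000, @lastId INT = 0, @maxId INT;
    SELECT @maxId = MAX(Id) FROM dbo.BigTable;

    WHILE @lastId < @maxId
    BEGIN
        UPDATE t
        SET    t.TargetColumn = s.NewValue
        FROM   dbo.BigTable AS t
        JOIN   #Staged      AS s ON s.Id = t.Id
        WHERE  t.Id > @lastId AND t.Id <= @lastId + @batch;

        SET @lastId = @lastId + @batch;
        CHECKPOINT;  -- under the SIMPLE recovery model this lets log space be reused between batches
    END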

Additional Information (UPDATE)

Thank you for the critical thinking here.

  • The operation can be done in downtime.
  • The UPDATE will not cause row forwarding.
  • All the tables have indexes, 5 on average, although a few tables have around 13.
  • The probability that the target column is present in one of the table's indexes is around 50%.
  • Some tables can be rebuilt and replaced; others can't, because they are part of a software solution and we might lose support for them. Some of those tables have triggers.
  • I'll need to do this for more than 600 tables, of which ~150 range from 0.8 million to 35 million rows.
  • The update always targets the same column across the various tables.

References

BCP for data transfer

Nelssen

1 Answer

Actually, it depends on a number of factors (several of which can be measured up front; see the sketch after this list):

  • on the number of indexes the table contains
  • the size of the row before and after the UPDATE operation
  • type of UPDATE - would it be in place? does it need to modify the row length?
  • does the operation cause row forwarding?
  • how big is the table?
  • how big would the transaction log of the UPDATE command be?
  • does the table contain triggers?
  • can the operation be done in downtime?
  • will the table be modified during the operation?
  • are minimally logged operations allowed?
  • would the whole UPDATE transaction fit in the transaction log?
  • can the table be rebuilt & replaced with a new one?
  • what was the timing of the operation on the test environment?
  • what about free space in the database - is there enough space for a copy of the table?
  • what kind of UPDATE operation is to be performed? do additional SELECT commands have to be run to calculate the new value of every row, or is it a static change?
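
Several of the above can be measured before choosing an approach. A minimal sketch using standard SQL Server commands (dbo.BigTable is a placeholder name):

    -- table size, row count, and index size
    EXEC sp_spaceused 'dbo.BigTable';

    -- how many indexes the table carries
    SELECT COUNT(*) AS index_count
    FROM   sys.indexes
    WHERE  object_id = OBJECT_ID('dbo.BigTable') AND index_id > 0;

    -- current transaction log usage, to judge whether the UPDATE transaction would fit
    DBCC SQLPERF(LOGSPACE);

    -- forwarded records (relevant only when the table is a heap, index_id = 0)
    SELECT forwarded_record_count
    FROM   sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.BigTable'), 0, NULL, 'DETAILED');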

Depending on the answers and the results of the operation in the test environment, we could consider the fastest options to be:

  • a minimally logged copy of the table (sketched below)
  • an in-place UPDATE operation, preferably in batches
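
As a minimal sketch of the copy-and-swap variant, assuming the SIMPLE or BULK_LOGGED recovery model (so that SELECT ... INTO is minimally logged) and placeholder names throughout:

    -- copy the table with the column already transformed;
    -- SELECT ... INTO is minimally logged under SIMPLE or BULK_LOGGED recovery
    SELECT Id,
           OtherColumn,
           UPPER(TargetColumn) AS TargetColumn  -- placeholder for the real transformation
    INTO   dbo.BigTable_New
    FROM   dbo.BigTable;

    -- recreate indexes, constraints, and triggers on dbo.BigTable_New here

    -- then swap the two tables in one short transaction
    BEGIN TRAN;
    EXEC sp_rename 'dbo.BigTable', 'BigTable_Old';
    EXEC sp_rename 'dbo.BigTable_New', 'BigTable';
    COMMIT;
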
Adam Leszczyński
  • Thanks for the critical thinking @Adam. When you talk about minimal logging, do you mean "SELECT INTO"? I've updated the question with additional information. Regarding possible solutions, would you consider exporting and importing the data with bcp? Is that reliable? I'm still exploring. – Nelssen Feb 07 '18 at 10:13