11

I want to alter a field from a table which has about 4 million records. I ensured that all of these fields values are NOT NULL and want to ALTER this field to NOT NULL

ALTER TABLE dbo.MyTable
ALTER COLUMN myColumn int NOT NULL

... seems to take forever to do this update. Any ways to speed it up or am I stuck just doing it overnight during off-hours?

Also could this cause a table lock?

RichardTheKiwi
  • 105,798
  • 26
  • 196
  • 262
Chris Klepeis
  • 9,783
  • 16
  • 83
  • 149
  • 1
    Personally I would never make a change to table structure in an existing large table at any time except off peak hours. Even if it is fairly quick, it can cause users in the process of doing things at the momment you make the change to have problems. Any major change is best done in single user mode as well. Far better to have a blacked out scheduled maintenance period when users can't do anything (Announced in advance of course and during non peak hours), than to have unhappy users who were in the middle of doing something who are getting errors. – HLGEM Jul 20 '09 at 14:11
  • Is the column you are changing involved in FK constraints? – onupdatecascade Jul 22 '09 at 16:44
  • 1
    From a quick test in profiler it takes a `Sch-M` lock on the table [which is basically incompatible with everything](http://msdn.microsoft.com/en-us/library/ms186396.aspx). It then has to read every page to determine that all rows validate. – Martin Smith May 18 '11 at 18:44
  • Is the column int before the change? – gbn May 19 '11 at 19:36

3 Answers3

4

You can alter a field and make it not null without it checking the fields. If you are really concerned about not doing it off hours you can add a constraint to the field which checks to make sure it isn't null instead. This will allow you to use the with no check option, and not have it check each of the 4 million rows to see if it updates.

CREATE TABLE Test
(
    T0 INT Not NULL,
    T1 INT NUll 
)

INSERT INTO Test VALUES(1, NULL) -- Works!

ALTER TABLE Test
    WITH NOCHECK
        ADD CONSTRAINT N_null_test CHECK (T1 IS NOT NULL)

    ALTER COLUMN T1 int NOT NULL 

INSERT INTO Test VALUES(1, NULL) -- Doesn't work now!

Really you have two options (added a third one see edit):

  1. Use the constraint which will prevent any new rows from being updated and leave the original ones unaltered.
  2. Update the rows which are null to something else and then apply the not null alter option. This really should be run in off hours, unless you don't mind processes being locked out of the table.

Depending on your specific scenario, either option might be better for you. I wouldn't pick the option because you have to run it in off hours though. In the long run, the time you spend updating in the middle of the night will be well spent compared the headaches you'll possibly face by taking a short cut to save a couple of hours.

This all being said, if you are going to go with option two you can minimize the amount of work you do in off hours. Since you have to make sure you update the rows to not null before altering the column, you can write a cursor to slowly (relative to doing it all at once)

  1. Go through each row
  2. Check to see if it is null
  3. Update it appropriately. This will take a good while, but it won't lock the whole table block other programs from accessing it. (Don't forget the with(rowlock) table hint!)

EDIT: I just thought of a third option: You can create a new table with the appropriate columns, and then export the data from the original table to the new one. When this is done, you can then drop the original table and change the name of the new one to be the old one. To do this you'll have to disable the dependencies on the original and set them back up on the new one when you are done, but this process will greatly reduce the amount of work you have to do in the off hours. This is the same approach that sql server uses when you make column ordering changes to tables through the management studio. For this approach, I would do the insert in chunks to make sure that you don't cause undo stress on the system and stop others from accessing it. Then on the off hours, you can drop the original, rename the second, and apply dependencies etc. You'll still have some off hours work, but it will be minuscule compared to the other approach.

Link to using sp_rename.

kemiller2002
  • 113,795
  • 27
  • 197
  • 251
  • 4
    If you use NO CHECK, then the constraint will not be trusted and can't be used by the query optimizer. See http://sqlblog.com/blogs/tibor_karaszi/archive/2008/01/12/non-trusted-constraints-and-performance.aspx – Shannon Severance Jul 20 '09 at 21:47
  • 3
    Also, the NOCHECK keyword doesn't apply to NULL/NOT NULL. It only applies to constraints that are part of the CONSTRAINT clause. – Tom H May 18 '11 at 17:49
  • From the question the OP has already ensured that their column does not contain any `NULL` values so not sure of the relevance of the 2nd part of your answer? – Martin Smith May 19 '11 at 12:20
  • @martin, he may have ensured, it, but the next person reading the answer may not. It's a small amount of space to cover the other scenario, and if the op made a mistake and has null values, then he has more information to make a decision on. – kemiller2002 May 19 '11 at 12:26
4

The only way to do this "quickly" (*) that I know of is by

  • creating a 'shadow' table which has the required layout
  • adding a trigger to the source-table so any insert/update/delete operations are copied to the shadow-table (mind to catch any NULL's that might popup!)
  • copy all the data from the source to the shadow-table, potentially in smallish chunks (make sure you can handle the already copied data by the trigger(s), make sure the data will fit in the new structure (ISNULL(?) !)
  • script out all dependencies from / to other tables
  • when all is done, do the following inside an explicit transaction :
    • get an exclusive table lock on the source-table and one on the shadowtable
    • run the scripts to drop dependencies to the source-table
    • rename the source-table to something else (eg suffix _old)
    • rename the shadow table to the source-table's original name
    • run the scripts to create all the dependencies again

You might want to do the last step outside of the transaction as it might take quite a bit of time depending on the amount and size of tables referencing this table, the first steps won't take much time at all

As always, it's probably best to do a test run on a test-server first =)

PS: please do not be tempted to recreate the FK's with NOCHECK, it renders them futile as the optimizer will not trust them nor consider them when building a query plan.

(*: where quickly comes down to : with the least possible downtime)

deroby
  • 5,902
  • 2
  • 19
  • 33
  • I had been toying with this idea for the past week, but in our specific situation we saw problems because there could be multiple column changes that don't know about each other, so dynamically creating the "instead of" triggers would be impossible (this is all to be automated). This morning it occurred to me that it could be table driven and I was going to post the answer here only to find that you beat me to it :) – Tom H May 23 '11 at 13:47
2

Sorry for the discouragement, but:

  • Any ways to speed it up: No, not if you want to change the table structure itself
  • or am I stuck just doing it overnight during off-hours? Yes, and that's probably for the best, as @HLGEM pointed out
  • Also could this cause a table lock? Yes

Not directly relevant to you (because it's about going from NOT NULL to NULL), but interesting read on this topic: http://beyondrelational.com/blogs/sankarreddy/archive/2011/04/05/is-alter-table-alter-column-not-null-to-null-always-expensive.aspx

And finally some ancient history - on an equivalent question in a forum in 2005, the same suggestion was made as @Kevin offered above - using a constraint insteadof making the column itself non-nullable: http://www.sqlteam.com/Forums/topic.asp?TOPIC_ID=50671

Tao
  • 13,457
  • 7
  • 65
  • 76