
I had to change a BIGINT column in a large table from nullable to non-nullable.

ALTER TABLE my.Table ALTER COLUMN myColumn BIGINT NOT NULL

Running this in our UAT and RC environments took around 3 hours with low levels of concurrent activity. Both UAT and RC are representative of PROD, so they are good test platforms. Three hours is reasonable given the size of the table and the performance of the kit.

As far as I'm aware the relevant config is snapshot_isolation_state = 0, is_read_committed_snapshot_on = 1.
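These settings can be confirmed from sys.databases; a minimal check, with the database name as a placeholder, looks like this:

SELECT name, snapshot_isolation_state, is_read_committed_snapshot_on
FROM sys.databases
WHERE name = N'MyDatabase';   -- placeholder database name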

The ALTER TABLE has been killed in PROD a couple of times (after running for several hours, then with a lengthy rollback) after other activity started to fail with "Transaction aborted when accessing versioned row in table 'myOther.Table' in database 'MyDatabase'. Requested versioned row was not found. Your tempdb is probably out of space. Please refer to BOL on how to configure tempdb for versioning." errors.

When running this in PROD for the third time I arranged for all other activity to be shut down. After around 4 hours it was clear that something was not working. Using the initial query in Troubleshooting tempdb growth due to Version Store usage I could see that the version store made up most of tempdb, but the ALTER TABLE connection was not blocked, and its CPU and IO were increasing slowly, so I was confident it was alive; the only wait I saw was SOS_SCHEDULER_YIELD. There were no other non-trivial connections.
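That article's query isn't reproduced here, but a rough equivalent, which breaks tempdb usage down by reservation type so the version store share is obvious, is:

-- Approximate breakdown of tempdb usage in MB (8 KB pages / 128)
SELECT
    SUM(version_store_reserved_page_count)   / 128.0 AS version_store_mb,
    SUM(user_object_reserved_page_count)     / 128.0 AS user_objects_mb,
    SUM(internal_object_reserved_page_count) / 128.0 AS internal_objects_mb,
    SUM(unallocated_extent_page_count)       / 128.0 AS free_mb
FROM tempdb.sys.dm_db_file_space_usage;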

After another couple of hours I decided to add some space to TempDB. The ALTER TABLE finished very soon afterwards.
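Growing tempdb manually is just an ALTER DATABASE; the logical file name (tempdev is the default) and the new size below are illustrative:

ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, SIZE = 50GB);   -- adjust name and size to suit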

Can someone explain why the ALTER TABLE stalled? I could understand if there was another connection referencing the old (un-ALTERed) rows in my.Table but this definitely wasn't the case.

Rhys Jones

1 Answer


Altering a column from nullable to not nullable causes a new column to be created and the existing data to be copied into it; the operation is fully logged, and it also causes row versions to be produced if you use RCSI.

You can check this topic for more info: Why does ALTER COLUMN to NOT NULL cause massive log file growth?

Regarding

I could understand if there was another connection referencing the old (un-ALTERed) rows in my.Table but this definitely wasn't the case.

You misunderstand how RCSI works.

As soon as the transition to RCSI is complete, every update will generate row versions regardless of whether there are other transactions interested in those rows.

When either the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION database options are ON, logical copies (versions) are maintained for all data modifications performed in the database. Every time a row is modified by a specific transaction, the instance of the Database Engine stores a version of the previously committed image of the row in tempdb. Each version is marked with the transaction sequence number of the transaction that made the change. The versions of modified rows are chained using a link list. The newest row value is always stored in the current database and chained to the versioned rows stored in tempdb.

Understanding Row Versioning-Based Isolation Levels

Or it's stated more clearly here:

When either the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION database options are ON, update and delete transactions for a particular database must maintain row versions even when there are no transactions using a row versioning-based isolation level. Constructing a consistent snapshot of data using row versions involves system resources (CPU and memory), and potentially generates I/O activity. Because the record versions are stored in tempdb, performance is better and the number of issued I/Os is lower when more tempdb pages can be stored in memory for row versioning.

As you can imagine, ALTER TABLE operates within one transaction, so the row versions stay alive for the whole duration of that transaction (they could live even longer if a statement interested in them were still executing, but since nothing was interested in them, the minimum "life expectancy" is the duration of the owning transaction).
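You can see this for yourself with an update in an RCSI-enabled database while no other session is reading; the table and column names below are hypothetical:

-- dbo.SomeTable is a hypothetical table in an RCSI-enabled test database
BEGIN TRANSACTION;

UPDATE dbo.SomeTable SET SomeColumn = SomeColumn + 1 WHERE Id <= 100;

-- Even with no readers around, the update above has already produced versions in tempdb
SELECT COUNT(*) AS version_rows
FROM sys.dm_tran_version_store
WHERE database_id = DB_ID();

ROLLBACK TRANSACTION;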

UPDATED:

I tried to reproduce the issue on SQL Server 2012:

I set tempdb autogrowth to 0 (the tempdb data file at 10 MB, the log at 1 MB) and created a new database with a 20 MB data file and a 10 MB log file, using the simple recovery model, then created a table dbo.Nums filled with 1,000,000 integers (bigint, null) this way:

select top 1000000 row_number() over(order by 1/0) as n
into dbo.Nums
from sys.all_columns c1 cross join sys.all_columns c2;
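For reference, the setup described above could be scripted roughly as follows; only the database name rcsi, the file sizes and the recovery model come from the post, while the logical file names and paths are assumptions:

-- Cap tempdb autogrowth (tempdev/templog are the default logical file names)
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILEGROWTH = 0);
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILEGROWTH = 0);

-- 20 MB data file + 10 MB log file, simple recovery model
CREATE DATABASE rcsi
ON PRIMARY (NAME = rcsi_data, FILENAME = 'C:\SQLData\rcsi.mdf', SIZE = 20MB)
LOG ON (NAME = rcsi_log, FILENAME = 'C:\SQLData\rcsi_log.ldf', SIZE = 10MB);

ALTER DATABASE rcsi SET RECOVERY SIMPLE;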

Then I did a checkpoint and altered the column from null to not null:

alter table dbo.nums alter column n bigint not null

This took 0 seconds; my table size was about 16 MB before this action and remained about 16 MB, there was no log file growth, and what went to the log file I'll show in the picture.

Then I dropped the table, recreated it, and altered my database:

alter database rcsi set read_committed_snapshot on;

And did exactly the same thing: checkpoint + alter table + select from sys.fn_dblog()
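In script form, that second run was along these lines (the fn_dblog column list is just a convenient subset of its output):

CHECKPOINT;

ALTER TABLE dbo.Nums ALTER COLUMN n bigint NOT NULL;

-- Inspect what the ALTER wrote to the user database log
SELECT [Current LSN], Operation, Context, AllocUnitName, [Log Record Length]
FROM sys.fn_dblog(NULL, NULL);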

I had to wait for 5 minutes, but tempdb gave no error. There was PREEMPTIVE_OS_GETDISKFREESPACE as a wait type during the statement execution, but guess what caused it. It was not tempdb (that was only 10 MB + 1 MB and stayed the same, since I had limited its size); it was the LOG FILE of my user database, which, just for changing the column from nullable to not nullable UNDER RCSI, had grown to 1 GB (!!!!)

1 GB of log for changing the nullability of one column of a table that was only 16 MB. And all that time I was waiting not for tempdb growth but for the zeroing out of 1 GB of my database log file.
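Log files are always zero-initialized when they grow (instant file initialization does not apply to them), which is why the growth itself is so slow. A quick way to watch the log filling up during the ALTER is:

-- Log size and percent used for every database on the instance
DBCC SQLPERF(LOGSPACE);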

I attach a picture of what went to the log during the same operation under RC and RCSI, so you can see that producing row versions costs the user database much more than it costs tempdb. So I think the hours you were waiting were spent logging row versions to your database log file (they are not logged in tempdb at all).

Besides COPY_VERSION_INFO, there were many row modifications that may not apply to your case: my rows got a new 14-byte row versioning tag, so a lot of changes were made to that table because I had switched the isolation level just before changing the nullability. But the main impact in my case came from user database log file growth, not from tempdb, which did not grow at all.
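That per-row tag shows up directly in the average record size, which can be compared before and after with sys.dm_db_index_physical_stats (SAMPLED or DETAILED mode is needed to get record sizes):

-- avg_record_size_in_bytes grows by roughly 14 bytes once rows carry the versioning tag
SELECT index_id, avg_record_size_in_bytes, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.Nums'), NULL, NULL, 'DETAILED');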

(Picture: log records generated by the ALTER COLUMN under plain READ COMMITTED vs. under RCSI.)

P.S. Maybe you'd better move this question to dbaexchange?

sepupic
  • The first link relates to the transaction log which is not what my question is about. The 2nd link is good, but the article also states "When tempdb is full, update operations will stop generating versions and continue to succeed..." in which case why did my ALTER TABLE stall? – Rhys Jones Jul 19 '17 at 14:28
  • The first link contains repro code that shows that a new column is created, the existing data is transferred to it, and the old column is deleted. As you asked "Can someone explain what happened?", I provided a link that not only explains but shows what happened – sepupic Jul 19 '17 at 14:34
  • Apologies - I've clarified the final question by improving the language. – Rhys Jones Jul 19 '17 at 14:50
  • You are right that the ALTER TABLE should not have had problems if it could not generate versions any more, but I'm not sure the problem was these row versions. What was in sys.dm_os_waiting_tasks while the operation was executing? – sepupic Jul 19 '17 at 14:52
  • I didn't look at sys.dm_os_waiting_tasks but never saw anything other than SOS_SCHEDULER_YIELD for lastwaittype in sysprocesses. My IO is not very good so if there was much IO going on I would have expected to see some IO waits in lastwaittype. – Rhys Jones Jul 19 '17 at 15:38