2

I am thinking of creating a read-only DB in Amazon RDS, using their DMS to replicate the data from an on-site database. Amazon's list of general DMS limitations includes the following:

Changes to rows with more than 8000 bytes of information, including header and mapping information, are not processed correctly due to limitations in the SQL Server TLOG buffer size.

In what circumstances would this apply? I was under the impression that a table can contain a maximum of 8,060 bytes per row. I understand that varchar, nvarchar, varbinary, sql_variant, and CLR user-defined type data can be pushed off-row onto overflow pages, with a pointer left in place. Is that what this refers to? I am trying to understand any situation where this would occur.

Also, how can I determine which, if any, tables would be affected?

TheTor
  • This text looks like hogwash to me. The limitation would be from Amazon's own technology, not SQL Server. In particular, obviously SQL Server allows updates to rows that take up more bytes through overflow storage, and the transaction log has no problem with that. The author probably meant "due to the way we suck the data from the transaction log, we can't do this if your row change takes up more than a page", but it is not clear what the actual practical impact is -- if only some updates or *all* updates to some tables won't replicate. – Jeroen Mostert Sep 06 '17 at 11:58
  • That may very well be true. To be clear, I am not saying rows with those data types won't be replicated. I am simply trying to find out in what circumstances the above applies. If a row cannot exceed 8060 bytes, why is this even a limitation of AWS DMS? – TheTor Sep 06 '17 at 13:19
  • It's not talking about the row size, it's talking about limitations of the transaction log buffers (whatever those may be). In-row data cannot exceed 8060 bytes, but that may have no relevance on the limit discussed here. If you want a definitive answer, asking Amazon directly is probably more constructive than waiting for an expert to show up on SO. The "replication" it is talking about here has no relation to the replication feature in SQL Server. – Jeroen Mostert Sep 06 '17 at 13:50
  • I am not blaming SQL Server for any deficiencies. I am simply trying to understand what the limit actually is in a practical situation. Is there any input from anyone with experience of this specific issue? In what situation can this occur, and how did you identify and resolve it? – TheTor Sep 06 '17 at 14:25

1 Answer

1

OK, it seems I can answer my own question now, at least in one use case.

It doesn't relate to the possible combined size of the columns but, as stated, to the amount of data actually stored in them. Create a table with an id column and nine varchar(1000) columns. Fill the first eight with 1,000 bytes each and everything replicates as it should. Fill the ninth column as well and the update is not replicated (presumably because the column data is pushed off-page). Strangely, varchar(max) does not seem to be affected in this way.
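A minimal repro of that test might look like this (table and column names are invented for illustration, not from the original setup):

-- Hypothetical repro of the test described above.
CREATE TABLE dbo.WideTest (
    id INT IDENTITY(1,1) PRIMARY KEY,
    c1 VARCHAR(1000), c2 VARCHAR(1000), c3 VARCHAR(1000),
    c4 VARCHAR(1000), c5 VARCHAR(1000), c6 VARCHAR(1000),
    c7 VARCHAR(1000), c8 VARCHAR(1000), c9 VARCHAR(1000)
);

-- Row stays under roughly 8,000 bytes: the change replicates as expected.
INSERT INTO dbo.WideTest (c1, c2, c3, c4, c5, c6, c7, c8)
SELECT REPLICATE('a', 1000), REPLICATE('a', 1000), REPLICATE('a', 1000),
       REPLICATE('a', 1000), REPLICATE('a', 1000), REPLICATE('a', 1000),
       REPLICATE('a', 1000), REPLICATE('a', 1000);

-- Filling the ninth column pushes the row past the in-row limit
-- (the data moves to row-overflow storage) and this update was not replicated.
UPDATE dbo.WideTest SET c9 = REPLICATE('a', 1000) WHERE id = 1;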

To locate tables with rows affected by this, run the following, paying attention to the MaximumRecordSize column:

DBCC SHOWCONTIG WITH TABLERESULTS
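As an alternative check (not part of the original answer, just another way to get similar information), sys.dm_db_index_physical_stats exposes a max_record_size_in_bytes column when run in SAMPLED or DETAILED mode:

-- List the largest record size per table/index in the current database.
-- max_record_size_in_bytes is NULL in LIMITED mode, so use SAMPLED or DETAILED.
SELECT OBJECT_NAME(ps.object_id) AS table_name,
       i.name                    AS index_name,
       ps.max_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id AND i.index_id = ps.index_id
ORDER BY ps.max_record_size_in_bytes DESC;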

A way around the problem is to split the offending tables into smaller related tables (i.e., normalizing them) and perform joins to retrieve the data, as sketched below. This may or may not be feasible depending on the size of your database.
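A hypothetical example of that kind of split (all names invented for illustration): the wide varchar columns move into a child table keyed on the parent's id, and a join reassembles the original row shape. Whether the extra join is acceptable depends on your workload.

-- Parent table keeps the narrow columns.
CREATE TABLE dbo.Orders (
    order_id INT PRIMARY KEY,
    customer VARCHAR(200)
);

-- Child table holds the wide columns that pushed the row over the limit.
CREATE TABLE dbo.OrderNotes (
    order_id INT NOT NULL REFERENCES dbo.Orders(order_id),
    note1 VARCHAR(1000),
    note2 VARCHAR(1000)
);

-- Reassemble the original shape when reading.
SELECT o.order_id, o.customer, n.note1, n.note2
FROM dbo.Orders AS o
LEFT JOIN dbo.OrderNotes AS n
  ON n.order_id = o.order_id;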

TheTor