3

We've just started using Change Data Capture on SQL Server 2017. I'm running some tests by executing a few simple UPDATE statements then checking the CDC table.

To view the table changes, we run the CDC function cdc.fn_cdc_get_all_changes passing the row_filter_option parameter all update old so we can see the before and after of any update statements.

The results of that function are confusing and I need to understand why they display as they do before I create a report for the end-users:

In the results, every table UPDATE is represented by 2 rows - one row with operation type 3 representing the row BEFORE the UPDATE statement, and one row with operation type 4 representing the row AFTER the UPDATE statement. The confusion is related to the BEFORE row: I'm seeing many fields in this row showing as NULL when that field was definitely NOT NULL before the UPDATE. I can't find any definable pattern as to which fields are shown as NULL in the BEFORE row. The AFTER row seems to be fine and displays the record as exactly what it looks like after the UPDATE.

Can someone help me understand why the BEFORE record displays with some fields as NULL when they had values in them before the UPDATE was run?

EXAMPLE:

If I run the following query:

SELECT *
FROM dbo.Profiles
WHERE Id = 1

The result looks like this:

Id    Name      DateOfBirth   Position    Gender   Country
1     John Doe  1971-10-12    President   Male     France

Then run:

UPDATE dbo.Profiles
SET DateOfBirth = '1971-10-11'
WHERE Id = 1

Finally, run the CDC function to see the changes:

SELECT * FROM cdc.fn_cdc_get_all_changes_dbo_Profiles  
  (@from_lsn, @to_lsn, N'all update old'); 

And here is what the two results look like:

_$start_lsn _$operation Id   Name      DateOfBirth   Position   Gender  Country
001         3           1    John Doe  1971-10-12    NULL       NULL    France
002         4           1    John Doe  1971-10-11    President  Male    France

I'm trying to understand why in the first row are Position and Gender NULL?

EDIT 1:

As requested, here is the code for the function. The MS doc for this function can be found here: https://learn.microsoft.com/en-us/sql/relational-databases/system-functions/cdc-fn-cdc-get-all-changes-capture-instance-transact-sql

function [cdc].[fn_cdc_get_all_changes_dbo_Profiles]
(   @from_lsn binary(10),
    @to_lsn binary(10),
    @row_filter_option nvarchar(30)
)
returns table
return

select NULL as __$start_lsn,
    NULL as __$seqval,
    NULL as __$operation,
    NULL as __$update_mask, NULL as [Id], NULL as [Country], NULL as [DateOfBirth], NULL as [Gender], NULL as [Name], NULL as [NativeName], NULL as [PlaceOfBirth], NULL as [Position], NULL as [SynchedDateTime], NULL as [BirthName], NULL as [Name_Normalized], NULL as [Name_Parts], NULL as [BirthName_Normalized], NULL as [BirthName_Parts], NULL as [NativeName_Normalized], NULL as [NativeName_Parts], NULL as [Alias], NULL as [Children], NULL as [Fathers], NULL as [Godparents], NULL as [Mothers], NULL as [Relatives], NULL as [Siblings], NULL as [Spouses], NULL as [Stepparents], NULL as [Education], NULL as [Employer], NULL as [MemberOf], NULL as [UnmarriedPartner], NULL as [AwardReceived], NULL as [Convicted], NULL as [NotableWork], NULL as [PoliticalParty], NULL as [Addresses], NULL as [Affiliations], NULL as [DateOfDeath], NULL as [DeletedDateTime], NULL as [Status], NULL as [SyncBatch], NULL as [DeleteReason], NULL as [CountryCode]
where ( [sys].[fn_cdc_check_parameters]( N'dbo_Profiles', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 0)

union all

select t.__$start_lsn as __$start_lsn,
    t.__$seqval as __$seqval,
    t.__$operation as __$operation,
    t.__$update_mask as __$update_mask, t.[Id], t.[Country], t.[DateOfBirth], t.[Gender], t.[Name], t.[NativeName], t.[PlaceOfBirth], t.[Position], t.[SynchedDateTime], t.[BirthName], t.[Name_Normalized], t.[Name_Parts], t.[BirthName_Normalized], t.[BirthName_Parts], t.[NativeName_Normalized], t.[NativeName_Parts], t.[Alias], t.[Children], t.[Fathers], t.[Godparents], t.[Mothers], t.[Relatives], t.[Siblings], t.[Spouses], t.[Stepparents], t.[Education], t.[Employer], t.[MemberOf], t.[UnmarriedPartner], t.[AwardReceived], t.[Convicted], t.[NotableWork], t.[PoliticalParty], t.[Addresses], t.[Affiliations], t.[DateOfDeath], t.[DeletedDateTime], t.[Status], t.[SyncBatch], t.[DeleteReason], t.[CountryCode]
from [cdc].[dbo_Profiles_CT] t with (nolock)    
where (lower(rtrim(ltrim(@row_filter_option))) = 'all')
    and ( [sys].[fn_cdc_check_parameters]( N'dbo_Profiles', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 1)
    and (t.__$operation = 1 or t.__$operation = 2 or t.__$operation = 4)
    and (t.__$start_lsn <= @to_lsn)
    and (t.__$start_lsn >= @from_lsn)

union all   

select t.__$start_lsn as __$start_lsn,
    t.__$seqval as __$seqval,
    t.__$operation as __$operation,
    t.__$update_mask as __$update_mask, t.[Id], t.[Country], t.[DateOfBirth], t.[Gender], t.[Name], t.[NativeName], t.[PlaceOfBirth], t.[Position], t.[SynchedDateTime], t.[BirthName], t.[Name_Normalized], t.[Name_Parts], t.[BirthName_Normalized], t.[BirthName_Parts], t.[NativeName_Normalized], t.[NativeName_Parts], t.[Alias], t.[Children], t.[Fathers], t.[Godparents], t.[Mothers], t.[Relatives], t.[Siblings], t.[Spouses], t.[Stepparents], t.[Education], t.[Employer], t.[MemberOf], t.[UnmarriedPartner], t.[AwardReceived], t.[Convicted], t.[NotableWork], t.[PoliticalParty], t.[Addresses], t.[Affiliations], t.[DateOfDeath], t.[DeletedDateTime], t.[Status], t.[SyncBatch], t.[DeleteReason], t.[CountryCode]
from [cdc].[dbo_Profiles_CT] t with (nolock)     
where (lower(rtrim(ltrim(@row_filter_option))) = 'all update old')
    and ( [sys].[fn_cdc_check_parameters]( N'dbo_Profiles', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 1)
    and (t.__$operation = 1 or t.__$operation = 2 or t.__$operation = 4 or
         t.__$operation = 3 )
    and (t.__$start_lsn <= @to_lsn)
    and (t.__$start_lsn >= @from_lsn)
Stpete111
  • 3,109
  • 4
  • 34
  • 74

2 Answers2

2

Type 3 CDC rows retrieve null for certain data types, eg nvarchar(max) unless that specific column was changed by the UPDATE statement. Ref: https://learn.microsoft.com/en-us/sql/relational-databases/system-functions/cdc-fn-cdc-get-all-changes-capture-instance-transact-sql?view=sql-server-ver15

  • 1
    Please add further details to expand on your answer, such as working code or documentation citations. – Community Sep 03 '21 at 05:58
1

added after code sample

So looking at the code you can see that if all the column values are null then this code was used to make that row:

select NULL as __$start_lsn,
    NULL as __$seqval,
    NULL as __$operation,
    NULL as __$update_mask, NULL as [Id], NULL as [Country], NULL as [DateOfBirth], NULL as [Gender], NULL as [Name], NULL as [NativeName], NULL as [PlaceOfBirth], NULL as [Position], NULL as [SynchedDateTime], NULL as [BirthName], NULL as [Name_Normalized], NULL as [Name_Parts], NULL as [BirthName_Normalized], NULL as [BirthName_Parts], NULL as [NativeName_Normalized], NULL as [NativeName_Parts], NULL as [Alias], NULL as [Children], NULL as [Fathers], NULL as [Godparents], NULL as [Mothers], NULL as [Relatives], NULL as [Siblings], NULL as [Spouses], NULL as [Stepparents], NULL as [Education], NULL as [Employer], NULL as [MemberOf], NULL as [UnmarriedPartner], NULL as [AwardReceived], NULL as [Convicted], NULL as [NotableWork], NULL as [PoliticalParty], NULL as [Addresses], NULL as [Affiliations], NULL as [DateOfDeath], NULL as [DeletedDateTime], NULL as [Status], NULL as [SyncBatch], NULL as [DeleteReason], NULL as [CountryCode]
where ( [sys].[fn_cdc_check_parameters]( N'dbo_Profiles', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 0)

and if some of the columns are null it must be this code that was used:

select t.__$start_lsn as __$start_lsn,
    t.__$seqval as __$seqval,
    t.__$operation as __$operation,
    t.__$update_mask as __$update_mask,
    t.[Id],
    t.[Country],
    t.[DateOfBirth],
    -- etc 
    t.[Gender], t.[Name], t.[NativeName], t.[PlaceOfBirth], t.[Position], t.[SynchedDateTime], t.[BirthName], t.[Name_Normalized], t.[Name_Parts], t.[BirthName_Normalized], t.[BirthName_Parts], t.[NativeName_Normalized], t.[NativeName_Parts], t.[Alias], t.[Children], t.[Fathers], t.[Godparents], t.[Mothers], t.[Relatives], t.[Siblings], t.[Spouses], t.[Stepparents], t.[Education], t.[Employer], t.[MemberOf], t.[UnmarriedPartner], t.[AwardReceived], t.[Convicted], t.[NotableWork], t.[PoliticalParty], t.[Addresses], t.[Affiliations], t.[DateOfDeath], t.[DeletedDateTime], t.[Status], t.[SyncBatch], t.[DeleteReason], t.[CountryCode]
from [cdc].[dbo_Profiles_CT] t with (nolock)     
where (lower(rtrim(ltrim(@row_filter_option))) = 'all update old')
    and ( [sys].[fn_cdc_check_parameters]( N'dbo_Profiles', @from_lsn, @to_lsn, lower(rtrim(ltrim(@row_filter_option))), 0) = 1)
    and (t.__$operation = 1 or t.__$operation = 2 or t.__$operation = 4 or
         t.__$operation = 3 )
    and (t.__$start_lsn <= @to_lsn)
    and (t.__$start_lsn >= @from_lsn)

armed with this you can look at the [cdc].[dbo_Profiles_CT] table to figure out what is going on -- search just based on id maybe (and ignore timestamps) to see the entire history of the row. -- This might help you figure out why you are missing items.


This shows you are using a non-standard function -- so you have to show the code to that too for us to be sure (also, if you have any triggers -- we would need to see that.)

I think I can answer your question anyway. It is clear that your function has a bug -- or at least you are not interpreting the results correctly.

This is not what standard CDC looks like so your function is formatting it for you.

I'm guessing you are using a left join to things that don't have any value and when it says "NULL" what it actually means in your function is "there was no value given".

My other guess is that you are using timestamps to join (so you can see what elements where written at the "same time".) However, sometimes it takes longer to write -- so in your example the audit log does have values for Position and Gender they were just written to the log at a different time and because your join is making assumptions about timing they are not seen. This could account for the "random" results you see in thing sometimes not showing up.

Of course we would know for sure if we could see the source of the function.

Hogan
  • 69,564
  • 10
  • 76
  • 117
  • Thanks, Hogan. This function comes out-of-the-box with SQL Server as part of the Change Data Capture feature. The MS doc on it is here: https://learn.microsoft.com/en-us/sql/relational-databases/system-functions/cdc-fn-cdc-get-all-changes-capture-instance-transact-sql I've added the function code to my original post. – Stpete111 Dec 24 '19 at 16:03
  • Also, I do not see any Triggers related to this table. – Stpete111 Dec 24 '19 at 16:10
  • @Stpete111 -- it is not quite "out of the box" it is generated by the system for the change capture instance -- we can guess what it looks like but it is easier to see with the code. – Hogan Dec 30 '19 at 17:50
  • 1
    I've made some suggestions above and showed how to use that code to understand the data model. – Hogan Dec 30 '19 at 18:00