I'm having a little trouble following the article on how to use the CDC Control Task. Specifically, I seem to be unable to process the initial load in such a way that the subsequent incremental load is seamless (that is, no gap and no overlap) with the initial load. Unfortunately, I don't have the luxury of a quiesced database (i.e. there will be active changes while I'm doing the initial load). Here's what I've tried:
In all cases, my incremental load is simple: a CDC control task with the operation set to "Get processing range", a data flow task that contains a CDC source and an ADO.NET destination, and another CDC control task whose operation is "Mark processed range".
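To make the "no gap and no overlap" requirement concrete, here is a toy Python model of that three-step incremental loop. It is only a sketch: the integer LSNs, the `state` dict, and every function name are made up for illustration and do not reflect how SSIS actually stores or advances its CDC state.

```python
def get_processing_range(state, current_max_lsn):
    """Toy model of "Get processing range": continue from the last processed LSN."""
    return state["last_processed_lsn"], current_max_lsn


def read_changes(changes, start_lsn, end_lsn):
    """Toy model of the CDC source: changes with start_lsn < lsn <= end_lsn."""
    return [lsn for lsn in changes if start_lsn < lsn <= end_lsn]


def mark_processed(state, end_lsn):
    """Toy model of the final control task: persist the end of the range."""
    state["last_processed_lsn"] = end_lsn


def run_incremental(state, changes, current_max_lsn):
    """One incremental run: get the range, read it, mark it processed."""
    start, end = get_processing_range(state, current_max_lsn)
    rows = read_changes(changes, start, end)
    mark_processed(state, end)
    return rows


# Hypothetical data: six changes, identified only by integer stand-ins for LSNs.
state = {"last_processed_lsn": 0}
changes = [1, 2, 3, 4, 5, 6]

first = run_incremental(state, changes, current_max_lsn=3)
second = run_incremental(state, changes, current_max_lsn=6)

# Seamless means every change is delivered exactly once across the runs.
seamless = (first + second == changes)
```

In this model each run picks up exactly where the previous one stopped, which is the behaviour I want the initial load to hand off to.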
For the initial load, I've tried the following two scenarios:
A CDC control task in which the operation is set to "Mark CDC start", using a database snapshot that I created specifically for this task. The only other task is a data flow task that has within it an ADO.NET source that reads from the change table directly and an ADO.NET destination. In this scenario, the initial load runs fine, but the subsequent incremental load fails with an error saying that the starting LSN for the processing range is greater than the ending LSN.
The other initial load that I've tried has a CDC control task whose operation is set to "Mark initial load start", the same data flow as above (but this time, out of the live database instead of a database snapshot), and another CDC control task whose operation is "Mark initial load end". In this scenario, I get duplicate CDC records processed when I run the incremental load.
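The duplicates in this second scenario can be pictured with another toy Python model. It rests on an assumption I have not confirmed: that the first incremental range begins at the LSN recorded by "Mark initial load start". The integer LSNs and all names below are hypothetical stand-ins.

```python
def initial_load_rows(changes, read_lsn):
    """Change rows visible to the initial data flow at the moment it reads."""
    return [lsn for lsn in changes if lsn <= read_lsn]


def first_incremental_rows(changes, il_start_lsn, range_end_lsn):
    """Change rows in the first incremental range (assumed to start at il_start_lsn)."""
    return [lsn for lsn in changes if il_start_lsn < lsn <= range_end_lsn]


# Hypothetical timeline on a live database that keeps changing.
changes = list(range(1, 11))  # ten changes, integer stand-ins for LSNs
il_start_lsn = 4              # max LSN when the initial load start was marked
read_lsn = 7                  # max LSN when the data flow actually read the change table
range_end_lsn = 10            # max LSN when the first incremental load ran

initial = initial_load_rows(changes, read_lsn)
incremental = first_incremental_rows(changes, il_start_lsn, range_end_lsn)

# Changes committed between the start marker and the read show up in both loads.
duplicates = sorted(set(initial) & set(incremental))
```

Under that assumption, any change committed between the start marker and the moment the data flow reads is picked up twice, which matches the duplicates I am seeing.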
What am I missing?