1

I need to create a SSIS project that loads daily batches of 150 files into a SQL Server database. Each batch always contains the same 150 files and each file in the batch has a unique name. Also each file can either be a full or incremental type. Incremental files have one more column than the full files. Each batch contains a control file that states if a file is full or incremental. See example of a file below:

Full File

|  SID |  Name  |  DateOfBirth  |
|:---: | :----: | :-----------: |
| 1    | Samuel |   20/05/1964  |
| 2    | Dave   |   06/03/1986  |
| 3    | John   |   15/09/2001  |

Incremental File

|  SID |  Name  |  DateOfBirth  |  DeleteRow   |
|:---: | :----: | :-----------: | :----------: |
| 2    |        |               |  1           |      
| 4    | Abil   |   19/11/1993  |  0           |
| 5    | Zainab |   26/02/2006  |  0           |

I want to avoid creating 2 packages (full and incremental) for each file.

Is there a way to dynamically generate the column list in each source/destination component based on the file type in the control file? For example, when the file type is incremental, the column list should include the extra column (DeleteRow).

Alex
  • 4,885
  • 3
  • 19
  • 39
MayowaO
  • 380
  • 4
  • 12
  • I honestly searched and even tried myself. I couldn't do it easily. It would be better if you can modify source to have same number of columns. ETL are meant to do static jobs. Kindly post solution if you find any, in future. – Prabhat G Jul 25 '17 at 13:19
  • Thanks. I will share any solution I find. – MayowaO Jul 25 '17 at 13:35
  • Check the answer I posted. Ask for more details, if needed. Though its straight. – Prabhat G Jul 25 '17 at 14:24

2 Answers2

1

I can think of two solutions.

1) Have a script task at the beginning of the package that looks to see if this is an incremental load or a full load. If it is a full load, have it loop through all the files and add a "DeleteRow" column with all zeros to every file. Then you can use the same column list.

2) Use BiML to dynamically generate your package at run time based on the available metadata.

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
  • Thanks @Tab. This would involve extra processing of the the files (1) or the packages (2) at run time. Wouldn't this take long to run and consume resources? I could give it a try – MayowaO Jul 25 '17 at 14:31
  • It does add some overhead in time and resources, but probably not much. It's a trade-off for not having to write 2 dataflows for each file. – Tab Alleman Jul 25 '17 at 14:55
  • You don't need 2 data flow for each file. Check my solution and logic. Hope it's worth trying – Prabhat G Jul 25 '17 at 17:30
1

Let's assume my ControlFile.xlsx is :

Col1        Col2
File1.xlsx  Full
file2.xlsx  Incremental

Flow:

1.Create a DFT where ControlFile.xlsx is captured in an object variable. Source : Control connection, Destination : RecordSet Destination

  1. Pass this object variable in ForEach loop. ResultSet variable should be capturing Col2 of ControlFile.xlsx.

  2. Create a Sequence container just for a start point. Add 2 DFD for full load and incremental load. Use the constraints (as shown in below image) to decide which DFD will run.

  3. Inside DFD, use excel source to OLEDB destination.

  4. Use FilePath variable for connection property in Full load and incremental excel connections to make it dynamic.

Step1: overall image

img_main

Step2: In DFT - read control file, you read the FlowControl.xlsx to save it RecordSet destination, into RecordOutput variable

img2

Step3: Your precedence constraints should look like below image("Full" for full load, "Incremental" for incremental load ) :

img3

Use the source and destination connections as shown in first image. It's a bit hard to explain all the steps, but flow is simple.

one thing to notice is, you have additional column in Incremental, hence you'll need to use 'Derived Column' in your full load for correct mapping.

Also, make sure DelayValidation property is set to true.

For each loop container uses For each ADO Enumerator. Following images describe the properties :

img

AND

imgx

Prabhat G
  • 2,974
  • 1
  • 22
  • 31