I've been working with SSIS 2016 to create some data flows. One of my objectives is to be able to capture any truncation issues and populate an error table highlighting where truncation is occurring. Part of this is due to us importing JSON data and there is a possibility some of the fields may contain data longer than what we already have set up in our database.
Just to give you some background, the simplified flow we have is:
- Bulk insert JSON into a table
- Create a landing view on top extracting JSON paths into a view structure
- Create a data flow that flows the view data into the production table
- Highlight any truncation that occurs between landing to production
Because of the way SQL Server handles JSON, the fields have a length of MAX, when creating the view we set the field length in our CROSS APPLY, as an example we set RECIPIENT_NAME to VARCHAR(10) in our production table, however if RECIPIENT_NAME were to have a row where it was length 15 in the JSON file, when creating this view we wouldn't know that the 15 was truncated to a 10. To get around this, we set the RECIPIENT_NAME length to 11 in the landing view cross apply and when rows move from landing to production RECIPIENT_NAME goes from an 11 to a 10 and if at this point there was indeed any truncation SSIS would highlight it and I can capture the row down down the red error path.
In SSIS I have an OLEDB Source which points to the landing view and an OLEDB Destination that points to the production table. I also have an error flow out of the Source into an error table where each column has been set to redirect row on truncation.
At this point SSIS warns me truncation may occur when validation the destination task, with a yellow triangle on the destination task:
[PRD_SYSTEM [189]] Warning: Truncation may occur due to inserting data from data flow column "RECIPIENT_NAME" with a length of 11 to database column "RECIPIENT_NAME" with a length of 10.
When testing this with some sample data, I purposely put a row with 20 characters in the JSON field for RECIPIENT_NAME. This does not get flagged as a truncation issue and proceeds to populate the production table albeit truncated.
However, if I right click on the source > Show Advanced Editor > Input and Output Properties > OLEDB Source Output and expand Output Columns and modify 'Length' under Data Type Properties for the RECIPIENT_NAME column (initially an 11) to what it should be in the production table, a 10 and then rerun the process, the row does indeed get redirected to the error table!
Interestingly the warning changes, with a yellow triangle on the source task instead:
[LND_VW_SYSTEM [81]] Warning: Truncation may occur due to retrieving data from database column "RECIPIENT_NAME" with a length of 11 to data flow column "RECIPIENT_NAME" with a length of 10.
Now my understanding was that SSIS should capture truncation as I have set it up and not have to edit individual columns in the advanced editor - is this intended behaviour given what I have described or have I missed something out?