1

I have a data flow task which contains a derived column. The derived column transforms a CSV file column, lets say A which is order number, to a data type char with length 10.

This works perfectly fine when the text file column is equal to or less than 10 characters. Of course, it throws an error when column A order number is more than 10 characters.

The column A (error prone).

12PR567890 
254W895X98 
ABC 56987K5239
485P971259 SPTGER
459745WERT

I would like to catch the error prone records and extract the order number only.

I already can configure error output from the derived column. But, this just ignores the error records and processes the others.

The expected output will process ABC 56987K5239, 485P971259 SPTGER order numbers as 56987K5239, 485P971259 respectively. The process removal of unexpected characters are not important, rather how to achieve this during the run time of the derived column (stripping and processing the data in case of error).

Tab Alleman
  • 31,483
  • 7
  • 36
  • 52
SSharma
  • 951
  • 6
  • 15
  • What is the logic to be used? How do you know that `485P971259` is the order number and not `SPTGER`? – Tab Alleman Aug 15 '19 at 12:55
  • I can't know. But for now it is enough for me to take 485P971259(which is 10 chars long) and remove SPTGER. But how do I do it during run time of derived column? – SSharma Aug 15 '19 at 13:04

1 Answers1

2

If the valid order number always starts with a number, and the length of it equal to 10. You could use Script Component (Transformation) together with Regular Expression to transform the source data.

  1. Drag and drop the Script Component as Transformation
  2. Connect the source to the Script Component
  3. From the Script Component Edit window, checked the Order from the Input columns, and make it as Read and Write
  4. In the script, add:using System.Text.RegularExpressions;
  5. The full code needs to be added in the Input process method:

    string pattern = "[0-9].{9}";
    Row.Order = Regex.Match(Row.Order, pattern).Groups[1].ToString();
    
  6. The output going to the destination should be the matched 10 characters starting with the number.

LONG
  • 4,490
  • 2
  • 17
  • 35
  • Although I was looking for something easier within Derived Column and Error Configuration, this also works and I think the derived column has no such way of doing things. Thanks :) – SSharma Aug 16 '19 at 07:03