I'm developing a series of ETL procedures for a migration project using SSIS/DataTool 2010. I have multiple cases where I have to decode source values with new target ones, i.e in Customers addresses I have the states with "full names" and I have to extract them with state iso code (i.e. from California to CA). to do this I have saved in the solution folders an excel file with a series of sheets with decoding info (source value and destination value), so in case I need to add or modify decoding values I just have to manipulate the file without touching the query or ETL. At now the best way I found to manage this in SSIS is:
- sort the stream with the main extraction (i.e DB source with the query extracting the customers and their addreses) on the field with the data to be decoded
- sort the stream pointing at the excel files that manages the decoding mapping, by the column with values to be transcoded
- merge join (outer) the 2 stream on the fields to be decoded so to have starting values and decoded values in the same stream
- handle a "derived column" step to apply the logic: "replace the starting value column with the one with decoded values, unless the starting value has no target value"
this way works but it's very intricate and hard to "replicate" (copied and adapted) for the other (many) decoding needed to handle), can anyone suggest a better and quicker way to handle the scenario I described? thanks in advance,
all details provided in the description