0

I am at a loss of what to do as I am able to fully read in a column with a Flat File Source, but if certain things are avaiable, I need to split those into a separate column.

Example:

line 1)  2013-08-23 14:03:17 ipaddress:port @client POST /api func_0() result(0)

line 2) 2013-08-23 14:03:18 ipaddress:port @client POST /api/logout LOGOUT  
  (lm=local,haspid=randomnumbers,feat=0,sess=0000007E,duration=8400) result(0)

line 3)  2013-08-23 14:03:18 ipaddress:port @client POST /api/logout LOGOUT  
  (lm=local,haspid=randomnumbers,feat=1,sess=0000007D,duration=8408) result(0)

(wrapped for legibility: each of these three is really one long line)

I need to do some sort of derived column to split out the contents of sess= and duration= into their own separate columns, but as you can see, sometimes the function column is different and contains an empty func_0, or some other type of function, so it couldn't be done by delimiting it in the Flat file source read-in.

Any ideas?

Floris
  • 45,857
  • 6
  • 70
  • 122
Nick
  • 662
  • 2
  • 7
  • 21

2 Answers2

1

I'd use a script transform here. You could use a regEx or simple string.contains to see if you have data in the /api column.

Then use split() or regEx to pull the data out of sess= and duration= & write it into new columns.

You could create the new columns in the script transform but personally I like to use a derived column transform BEFORE the script task to create the new columns - e.g. sess & duration.

You then have the columns created for all rows & just need to use the script task to add data to the sess & duration columns if data exists in the /api column.

I hope this makes sense!

SinisterPenguin
  • 1,610
  • 15
  • 17
  • This is what I pretty much figured out, but I'm taking a slightly different route. I'm using a script transform and am using substrings to split everything up. I have the session ID, but am currently working on the duration as it can be different based on the amount of time the user is using it, which is posing a slight problem since it can vary in size from 1 to 4580, and so on. – Nick Aug 27 '13 at 14:30
  • I figured it out. Thank you. – Nick Aug 27 '13 at 14:41
0

I had a similar requirement that involved reading some files and looking for tolken=value pairs that might or might not have been present. I could only accomplish it by using a scrip task and writing a simple parser. I explained it a little bit in this question. ssis import fixed width flat file with header and trailer rows

I can provide some sample code if you want to go in this direction. Let me know if you have any questions.

Community
  • 1
  • 1
TsSkTo
  • 1,412
  • 13
  • 25