
I'm fairly proficient in SSIS, but by no means an expert. I have been tasked with importing a fairly complicated fixed-width flat file with header and trailer rows. Without the header and trailer rows I feel confident I could do this fairly easily; however, I am stumped on how to handle the extra rows. Below is a simplified example: the first row is the header row, the last row is the trailer row, and in between is any amount of data.

0 201308190102
1 123412341234   MR            JOE              BLOGGS                19810529
1 123412341235   MS            SUE              BLOGGS                19810516
9 201308190102

The columns are:

Header row: action type (0 = header), today's date (YYYYMMDD), sequence number (01), expected row count (02).

Data row: action type (1, 2 or 3 depending on create, update or delete), card number (123412341234), title, first name, last name, date of birth.

Trailer row: action type (9 = trailer), today's date, sequence number, actual row count.

I'll need to do validation checks on the date, sequence number and row count of the files.

Now, I'm not sure how to set this up in the flat file connection manager as the header/trailer detail is in the same position as the card numbers.

So my thinking is to split the header and trailer rows out by row number and end of file, or something like that? Would I need to split them both out in a script task? Or would I need to set up the flat file connection line by line and split out the columns with a script task?

jhowe
  • Headers can be skipped. Footers, unfortunately throw a spanner into the works. How many rows would you guesstimate the file to average? – billinkc Aug 19 '13 at 22:30
  • And the expected row count in header and footer, that is exclusive of the header and footer? Mainframe folks I dealt with always used *inclusive* – billinkc Aug 19 '13 at 22:31
  • hi billin, the system i'm creating would need to deal with creating new records, updating records, marking records as inactive anything up to and over 100000 records in one file... – jhowe Aug 20 '13 at 10:14

2 Answers


If you can confidently determine the header and footer rows from the data, I suggest using a conditional split. For example, in a past project the first two characters of the header and footer rows were "HR" and "TR" respectively.

With that convention I was able to separate the rows I actually needed to process. In a separate path I did some validation based on the header row data. Finally, I merged the two paths, since I used the date from the header for a processing-date column in the destination table.
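In this file there is no "HR"/"TR" marker, but the action type in the first character can serve the same purpose. Assuming a flat file source set up with a single column named `Line` (the column name is illustrative), the conditional split conditions might look like this:

    Header:  SUBSTRING(Line, 1, 1) == "0"
    Trailer: SUBSTRING(Line, 1, 1) == "9"
    (default output: data rows)

The data-row path can then use a Derived Column transformation with expressions such as `SUBSTRING(Line, 3, 12)` to slice out the individual fields (positions here are guesses from the sample file).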

[Screenshot: data flow with a conditional split separating header/trailer rows from data rows]

Mike Henderson
    hi thanks for your help, so how do i setup the flat file connection manager? if i split it out by columns it seems to be out of alignment in the preview... – jhowe Aug 20 '13 at 15:02
  • What I had in mind was not splitting up columns, but sending each row down a different path depending on the data. So the header-footer data goes in one bucket, row data in another. It's a bit more complicated than I am letting on. But once the data is split apart, then one can do validations and transformations, and then load based on validation results. – Mike Henderson Aug 21 '13 at 00:16
  • ok so again how do i setup the flat file connection manager? do i need to parse everything as one column and split out with a script component? As i don't see how it will work defining columns in manager as everything is out of alignment after header row... – jhowe Aug 21 '13 at 08:15
  • I set up a normal flat file connection with one column set to the width of the row. Then I use the `Derived Column` transformation to slice up individual columns using a `SUBSTRING` function in the expression. – Mike Henderson Aug 21 '13 at 12:16

If you have some experience coding in C# I would recommend a script task. I ran into the same problem while parsing some license files, and it seemed to be the only way to parse more complex files correctly.

If you use a script task you can bypass the flat file connection entirely. Here's a brief how-to:

First add a String parameter to your package. In your script task options add the parameter as a read-only variable, then access it with this snippet:

    byte[] empty = new byte[] { };
    base.PreExecute();
    // Outside parameter for the file path:
    // check that the parameter exists
    IDTSVariable100 filePathVariable; // this will hold the parameter
    try
    {
        filePathVariable = this.ReadOnlyVariables["ParameterName"];
    }
    catch (Exception)
    {
        this.Log(@"The package configuration is invalid. The variable / parameter ""ParameterName"" is missing.
            It has to be available, be of type String and contain the path of the file to be imported.", 0, empty);
        throw;
    }
    string filePath = filePathVariable.Value.ToString();

Now use something like this to read the file and split it into lines:

    try
    {
        using (StreamReader reader = new StreamReader(filePath))
        {
            // Read the whole file and split it into lines
            // (fileLines is a string[] field declared on the script class)
            string fileStream = reader.ReadToEnd();
            fileLines = fileStream.Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries);
        }
    }
    catch (Exception e)
    {
        this.Log(@"Error reading file into list of strings. Reason: " + e.Message, 0, empty);
        throw;
    }

Put all of the above in the PreExecute method of your script component, then apply whatever transformations you want to each line. Then create an array or list of lines and output it like this:

    public override void CreateNewOutputRows()
    {
        for (int i = 0; i < lineCount; i++)
        {
            ItemsBuffer.AddRow();
            ItemsBuffer.ColumnName = CreatedLineList[i];
        }
    }
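Since the question also asks about validating the trailer's row count, a minimal sketch of that check could go in PreExecute as well. This assumes the `fileLines` array from the snippet above, and the field positions are taken from the sample file in the question:

    // Hypothetical check of the header/trailer markers and row count
    string header = fileLines[0];
    string trailer = fileLines[fileLines.Length - 1];

    if (!header.StartsWith("0") || !trailer.StartsWith("9"))
        throw new InvalidOperationException("Missing header or trailer row.");

    // Last two characters of the trailer hold the expected row count
    int expectedRows = int.Parse(trailer.Substring(trailer.Length - 2));
    int actualRows = fileLines.Length - 2; // data rows only, excluding header and trailer

    if (expectedRows != actualRows)
        throw new InvalidOperationException("Row count mismatch: trailer says "
            + expectedRows + ", file contains " + actualRows + " data rows.");

The same pattern extends to the date and sequence-number checks mentioned in the question.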

If you have any questions about it, let me know.

TsSkTo