0

I receive a Unicode text flat-file in which one column is a single fixed-length value, and the other contains a list values delimited by a vertical pipe '|'. The length of the second column and the number of delimited values it contains will vary greatly. In some cases the column will be up to 50000 characters wide, and could contain a thousand or more delimited values.

Input file Example:

[ObjectGUID]; [member]
{BD3481AF8-2CDG-42E2-BA93-73952AFB41F3}; CN=rGlynn SrechrshiresonIII,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}; CN=reeghler Johnson,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp|CN=rCoefler Cellins,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp|CN=rDasije M. Delmogeroo,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp|CN=rCurry T. Carrollton,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp|CN=yMica Macintosh,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp

My idea is to perform a Split operation on this column and create a new row for each value. I am attempting to use a script component to perform the split.

The width of the delimited column can easily exceed the 4000 character limit of DT-WSTR, so I chose NTEXT as the datatype. This presents problem because the .Split method I am familar with requires a string. I am attempting to convert the NTEXT to a string in the script component.

Here is my code:

public override void Input0_ProcessInputRow(Input0Buffer Row)

{
var stringMember = Row.member.ToString();
    var groupMembers =  stringMember.Split('|');
foreach (var groupMember in groupMembers)
{
    this.Output0Buffer.AddRow();
    this.Output0Buffer.objectGUID = Row.objectGUID;
    this.Output0Buffer.member = groupMember;
}

}

The output I am trying to get would be this:

[ObjectGUID]                                [member]
{BD3481AF8-2CDG-42E2-BA93-73952AFB41F3}     CN=rGlynn SrechrshiresonIII,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}     CN=reeghler Johnson,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}     CN=rCoefler Cellins,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}     CN=rDasije M. Delmogeroo,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}     CN=rCurry T. Carrollton,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}     CN=yMica Macintosh,OU=Users,OU=PRV,OU=LOL,DC=ent,DC=keke,DC=cqb,DC=corp

But what I am in fact getting is this:

[ObjectGUID]                               [member]
{BD3481AF8-2CDG-42E2-BA93-73952AFB41F3}    Microsoft.SqlServer.Dts.Pipeline.BlobColumn
{AC365A4F8-2CDG-42E2-BA33-73933AFB41F3}    Microsoft.SqlServer.Dts.Pipeline.BlobColumn

What might I be doing wrong?

osboy1
  • 124
  • 9
  • 1
    `ntext`, `text`, and `image` data types will be removed in a future version of SQL Server. Avoid using these data types in new development work, and plan to modify applications that currently use them. Use `nvarchar(max)`, `varchar(max)`, and `varbinary(max)` instead. [See details here](http://msdn.microsoft.com/en-us/library/ms187993.aspx) – marc_s Oct 05 '17 at 04:59
  • The source of the data is a text file. The ultimate destination of the individual delimited members will be a varchar(256) column. However, I need a datatype to use for the text processing. I chose NTEXT because I don't know the width of the column in the text file. – osboy1 Oct 05 '17 at 13:04
  • @marc_s I think I should clarify my comment to your response. Your information about the upcoming removal of the datatypes in future versions of SQL Server is informative and helpful in general. What I should have asked is, which internal SSIS datatype would be appropriate to use for the SSIS data flow components, to move the long textual data through the pipeline, if not DT_TEXT and DT_NTEXT? – osboy1 Oct 05 '17 at 15:30

1 Answers1

0

The following code worked:

public override void Input0_ProcessInputRow(Input0Buffer Row)
    {

            var blobLength = Convert.ToInt32(Row.member.Length);
            var blobData = Row.member.GetBlobData(0, blobLength);
            var stringData = System.Text.Encoding.Unicode.GetString(Row.member.GetBlobData(0, Convert.ToInt32(Row.member.Length)));
            var groupMembers = stringData.Split('|');
            foreach (var groupMember in groupMembers)
                {
                    this.Output0Buffer.AddRow();
                    this.Output0Buffer.CN = Row.CN;
                    this.Output0Buffer.ObjectGUID = Row.ObjectGUID;
                    this.Output0Buffer.member = groupMember;
            }
    }

I was trying to perform an implicit conversion as I would in PowerShell, but was actually just passing some object metadata to the string output. This method properly splits my members and builds a complete row.

osboy1
  • 124
  • 9