2

So I'm making a stored procedure with its end-goal being a dynamic ETL solution. My company deals with a lot of third-party data and often times I do not know the number of columns, data types, format of the data, etc... As such, I've put together a number of temporary tables with dynamic sql and bulk insert statements to get the data into SQL Server. Currently, the data comes as a single column nvarchar field that is tab or pipe separated, and there's often upwards of 100k rows per txt or csv file. An example of the aforementioned csv/txt format is below:

RawSingleLine
9XX01 No Cancelled Inadvertent Approval 1/12/2015 432115.2 99   
480X1 No Cancelled Pending Processing 1/7/2014 5060 27.5    

My current solution is to use a cursor and the below Split function to loop through all the single rows, split them, pivot them, and then insert into one of my dynamic temporary tables. However, I'd like to avoid a cursor because they're expensive, and a set based operation would be preferred.

CREATE FUNCTION [dbo].[udf_Split]
(
    @String NVARCHAR(4000),
    @Delimiter NCHAR(1)
)
RETURNS TABLE
AS
RETURN
(
    WITH Split(stpos,endpos)
    AS(
        SELECT 0 AS stpos, CHARINDEX(@Delimiter,@String) AS endpos
        UNION ALL
        SELECT endpos+1, CHARINDEX(@Delimiter,@String,endpos+1)
            FROM Split
            WHERE endpos > 0
    )
    SELECT 'Id' = ROW_NUMBER() OVER (ORDER BY (SELECT 1)),
        'Data' = SUBSTRING(@String,stpos,COALESCE(NULLIF(endpos,0),LEN(@String)+1)-stpos)
    FROM Split
)

Is there a way to achieve what I want without a cursor?

dckuehn
  • 2,427
  • 3
  • 27
  • 37
Alexus Wong
  • 347
  • 4
  • 9

1 Answers1

0

Disclaimer: I'm the owner of the project Eval SQL.NET

This solution allow to split and pivot 100k rows in around 2-3 seconds. Eval SQL.NET allow to execute C# code in T-SQL.

-- CREATE a big string with all the rows (100,000 rows)
DECLARE @s VARCHAR(MAX) = ''
SET @s = ( SELECT TOP ( 100000 )
                    RawSingleLine + CHAR(13) + CHAR(10)
           FROM     Import
FOR XML PATH('') ,
        TYPE
         ).value('text()[1]', 'varchar(max)')

-- Use C# Syntax to split the text. Use Regex.Split if necessary.
DECLARE @sqlnet SQLNET = SQLNET::New('
var rows = s.Split(new string[] { Environment.NewLine }, StringSplitOptions.RemoveEmptyEntries);
return rows.Select(x => x.Split('' '')).ToList()
').ValueString('s', @s).AutoDispose()

EXEC dbo.SQLNET_EvalResultSet @sqlnet
Jonathan Magnan
  • 10,874
  • 2
  • 38
  • 60