1

I have more than 100 files to import into the sql server and most of them are of 500 MB. I want to leverage the the parallel import utility of SQL server and have read a number of webpages, like the following ones:

How to load 1 TB data in 30 minutes

https://technet.microsoft.com/en-us/library/dd537533(v=sql.100).aspx

Importing Data in Parallel with Table Level Locking

https://technet.microsoft.com/en-us/library/ms186341(v=sql.105).aspx

Controlling Locking Behavior for Bulk Import

https://technet.microsoft.com/en-us/library/ms180876(v=sql.105).aspx

and the answers in stackoverflow

Fastest way to insert in parallel to a single table

However, none of them have given a simple example with code. I know how to use bulk insert/bcp, but I don't know where to start with parallel import? Can anyone help me with it?

My system is Windows and I'm using SQL server 2016. The source data file is in txt format.

Thanks in advance for your help!

Jason

Community
  • 1
  • 1
Jason
  • 1,200
  • 1
  • 10
  • 25

1 Answers1

5
  1. Load the file path details into a tracking table

    Create table FileListCollection TABLE (Id int identity(1,1), filepath VARCHAR(500), ThreadNo tinyint, isLoaded int)
    
    DECLARE @FileListCollection TABLE (filepath VARCHAR(500))
    DECLARE @folderpath NVARCHAR(500)
    DECLARE @cmd NVARCHAR(100)
    SET @folderpath = '<FolderPath>' 
    SET @cmd = 'dir ' + @folderpath + ' /b /s'
    
    INSERT INTO @FileListCollection
    EXECUTE xp_cmdshell @cmd
    
    DELETE
    FROM @FileListCollection
    WHERE filepath IS NULL
    
    insert into FileListCollection(filepath, isLoaded)
    select filepath, 0
    from @FileListCollection
    
  2. Schedule for each thread

    declare @ThreadNo int = 3
    update f set ThreadNo=(id%@ThreadNo)
    from FileListCollection f 
    
  3. Open three session and assign thread number to each

  4. Run the below script to load the data

    DECLARE @filepath NVARCHAR(500)
    DECLARE @filepath NVARCHAR(500)
    DECLARE @bcpquery NVARCHAR(MAX);
    DECLARE @ThreadNo int = 1
    WHILE EXISTS (
            SELECT TOP 1 *
            FROM FileListCollection
            where ThreadNo = @ThreadNo
            and isLoaded = 0
            )
    BEGIN
        SELECT TOP 1 @filepath = filepath
        FROM FileListCollection
        where ThreadNo = @ThreadNo
            and isLoaded = 0
    
        SET @bcpquery = 'bulk insert <Database>.dbo.Table from '''+ @filepath+''' with (fieldterminator = ''|'', rowterminator = ''\n'')';
    
        print @bcpquery
        --Load the Content in table
        execute sp_executesql @bcpquery;
    
        Update FileListCollection set isLoaded = 1
        WHERE filepath = @filepath
    END
    
Stephen
  • 8,508
  • 12
  • 56
  • 96
Syed
  • 417
  • 1
  • 6
  • 13
  • Thanks for your reply! I'll check it out. – Jason Nov 22 '16 at 18:12
  • Above cod logic will help in parallel import to improve the performance in concurrent use Table locking in bcp command as stated in your question – Syed Nov 22 '16 at 18:14