
I am working to increase the performance of bulk loads: hundreds of millions of records (and more) daily.

I moved this over to use the IDataReader interface in lieu of data tables and got a noticeable performance boost (500,000 more records per minute). The current setup, sketched in code below the list, is:

  • A custom cached reader to parse the delimited files.
  • Wrapping the stream reader in a buffered stream.
  • A custom object reader class that enumerates over the objects and implements the IDataReader interface.
  • SqlBulkCopy then writes to the server.
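Roughly, the pipeline looks like this (DelimitedObjectReader stands in for the custom reader; the path, connection string, and destination table name are placeholders for my actual code):

```csharp
using System.Data.SqlClient;
using System.IO;

// Sketch of the pipeline: file -> BufferedStream -> custom IDataReader -> SqlBulkCopy.
static void Load(string path, string connectionString)
{
    using (var fileStream = File.OpenRead(path))
    using (var buffered = new BufferedStream(fileStream, 1 << 20))        // 1 MB read buffer
    using (var streamReader = new StreamReader(buffered))
    using (var objectReader = new DelimitedObjectReader(streamReader))    // custom IDataReader over parsed objects
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null))
        {
            bulk.DestinationTableName = "dbo.TargetHeap";   // heap, no indexes
            bulk.BulkCopyTimeout = 0;                       // no timeout on long loads
            bulk.WriteToServer(objectReader);               // rows stream through; no DataTable in memory
        }
    }
}
```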

The bulk of the performance bottleneck is directly in SqlBulkCopy.WriteToServer. If I unit test the process up to, but excluding, WriteToServer, it returns in roughly one minute; WriteToServer adds 15+ minutes on top of that. The unit test runs on my local machine, on the same drive the database lives on, so no data is being copied across the network.

I am using a heap table (no indexes, clustered or nonclustered), and I have played around with various batch sizes without major differences in performance.

There is a need to decrease the load times, so I am hoping someone might know a way to squeeze a little more blood out of this turnip.

Michael S
    Are you running release mode? What's the bitness? Have you profiled? During the 15+ minutes, is the CPU load mostly in the server or in the client? How much data are you actually copying? What settings did you use for the bulk copy? What's the I/O load during the copy? – Eamon Nerbonne Mar 20 '13 at 14:51
  • Yes, this is in release mode. 32-bit on my local machine where I am unit testing. The client is running on the same box as the db. Holding steady at roughly 12% CPU usage for the process and 80k of memory. For this unit test, 12 million records. I used the table lock setting. Good idea, I will run the SQL profiler on this process and see if it turns anything up. – Michael S Mar 20 '13 at 15:02
  • I mean the CPU load on the client process vs. the server process; and you should look at a CPU profiler for the client, which might tell you more than the SQL profiler here - it sounds like the SQL side is very simple (which is good). 12% CPU load - that might be 1 core of an 8-core machine maxed out - is it? How many bytes is one record? Oh, finally, are you running without a debugger attached (possible even in release mode)? – Eamon Nerbonne Mar 20 '13 at 16:07
  • To help benchmark you might want to include the custom data reader in your pre-WriteToServer benchmark. Since data-readers are pull-based (i.e. lazy) it's likely that by excluding WriteToServer you're also excluding possible perf. bottlenecks in the custom data reader. – Eamon Nerbonne Mar 20 '13 at 16:15

1 Answer


Why not use SSIS directly?

Anyway, if you are streaming from parsing straight into an IDataReader you're already on the right path. To optimize SqlBulkCopy itself you need to turn your focus to SQL Server. The key is minimally logged operations; the MSDN articles on the prerequisites for minimal logging in bulk import are required reading.

If your target is a B-tree (i.e. a clustered indexed table), then unfortunately one of the most important tenets of performant bulk insert, namely a sorted input rowset, cannot be declared: plain and simple, ADO.NET SqlClient has no equivalent of the OleDb SSPROP_FASTLOADOPTIONS -> ORDER(Column) hint. Since the engine does not know that the data is already sorted, it will add a Sort operator to the plan, which is not that bad except when it spills. To avoid spills, use a small batch size (~10k rows); see the sketch below. See my original point: all of these are just options and clicks to set in SSIS rather than things to dig out of the OleDB MSDN spec...
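On the client that amounts to something along these lines (the properties are the standard SqlBulkCopy ones; the destination table name is just an example):

```csharp
// Destination has a clustered index, so keep batches small (~10k rows) to keep
// the per-batch Sort in memory instead of spilling to tempdb.
using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null))
{
    bulk.DestinationTableName = "dbo.ClusteredTarget";   // example name
    bulk.BatchSize = 10000;                              // ~10k rows per batch
    bulk.BulkCopyTimeout = 0;
    bulk.WriteToServer(objectReader);
}
```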

If your data stream is unsorted to start with, or the destination is a heap, then my point above is moot.

However, achieving minimal logging is still a must for decent performance.
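As an illustration (not a SqlBulkCopy setting), one of the prerequisites, the recovery model, can be verified from the client before the load via sys.databases:

```csharp
// Illustrative check: minimal logging needs SIMPLE or BULK_LOGGED recovery,
// plus a heap (or empty B-tree) target loaded under TABLOCK.
using (var cmd = new SqlCommand(
    "SELECT recovery_model_desc FROM sys.databases WHERE name = DB_NAME()", connection))
{
    var model = (string)cmd.ExecuteScalar();   // "SIMPLE", "BULK_LOGGED" or "FULL"
    if (model == "FULL")
        Console.WriteLine("FULL recovery model: the bulk load will be fully logged.");
}
```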

Remus Rusanu
  • I agree that the DB is the focus, especially given that everything up to the bulk copy flies; 12 million records completes in 57 seconds. On my local machine for unit testing I am using simple recovery, but it is Express Edition. I will test this out on one of our dev boxes (Enterprise Edition) by creating a new DB using simple recovery. – Michael S Mar 20 '13 at 15:58
  • The simple recovery model is capable of minimally logged operations, but you need all the other factors too. Also, with an Express Edition I assume that your I/O subsystem is of ordinary consumer grade (e.g. a big 5400 RPM single drive). If I/O is your bottleneck you'll need to test on fast drives. – Remus Rusanu Mar 20 '13 at 16:06
  • I ran this on a server with more resources and created a simple recovery database. This picked up roughly another 600,000 records a minute on top of the 2.6 million, 1.1 million more than where this all started. I also just did some testing using multiple threads for SqlBulkCopy, with some very promising numbers: 4 threads jumped to 6 million a minute, while 8 jumped to 12 million a minute. Quite promising, but we will probably need to distribute it out across multiple boxes using 8 threads each due to CPU utilization. – Michael S Mar 20 '13 at 21:38
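A rough sketch of that multi-threaded variant, assuming the input can be split into independent partitions (SplitIntoPartitions and the reader type are placeholders; each thread gets its own connection and SqlBulkCopy, and concurrent TABLOCK loads into a heap take compatible BU locks):

```csharp
using System.Data.SqlClient;
using System.Threading.Tasks;

// Each partition is loaded on its own thread with its own connection and SqlBulkCopy.
var partitions = SplitIntoPartitions(inputFiles, 8);   // placeholder partitioning

Parallel.ForEach(
    partitions,
    new ParallelOptions { MaxDegreeOfParallelism = 8 },  // matches the 8-thread run
    partition =>
    {
        using (var reader = new DelimitedObjectReader(partition))   // per-thread IDataReader
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();
            using (var bulk = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, null))
            {
                bulk.DestinationTableName = "dbo.TargetHeap";
                bulk.BulkCopyTimeout = 0;
                bulk.WriteToServer(reader);
            }
        }
    });
```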