I wonder how SqlDataAdapter works internally, especially when using UpdateCommand to update a huge DataTable (since it's usually a lot faster than just sending SQL statements from a loop).

Here are some ideas I have in mind:

  • It creates a prepared SQL statement (using SqlCommand.Prepare()) with CommandText filled in and SQL parameters initialized with the correct SQL types. Then it loops over the DataRows that need to be updated and, for each record, updates the parameter values and calls SqlCommand.ExecuteNonQuery().
  • It creates a bunch of SqlCommand objects with everything filled in (CommandText and SQL parameters). Several SqlCommands at once are then batched to the server (depending on UpdateBatchSize).
  • It uses some special, low-level or undocumented SQL driver instructions that allow it to update several rows in an efficient way (the rows to update would need to be provided in a special data format, and the same SQL query (UpdateCommand here) would be executed against each of these rows).
tigrou
  • I often wonder why people who ask how certain things work internally don't use something like ILSpy and just take a peek. – itsme86 Dec 19 '12 at 23:03
  • I'm assuming you've hooked up a profiler (to the database) to see what it does? What is your question? – Nick DeVore Dec 19 '12 at 23:03
  • As @itsme86 has mentioned, you can have a lot of fun with ILSpy - http://ilspy.net/. Use Open From GAC to add a reference to System.Data and explore away! – dash Dec 19 '12 at 23:05

1 Answer

It uses an internal facility of the SQL Server client classes called command sets. With it, you can send multiple batches to SQL Server with a single command. This cuts down on per-call overhead, mainly because there are fewer server round trips.

Each statement updates a single row, and each batch contains one statement, but multiple batches are sent per round trip. That last point is the magic sauce.

Unfortunately, this facility is not publicly exposed. Ayende took a stab at this and built a private-reflection-based API for it.

If you want more information I encourage you to look at the internal SqlCommandSet class.
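
From public code, the batching described above is reached through SqlDataAdapter.UpdateBatchSize rather than through SqlCommandSet directly. A minimal sketch, assuming a hypothetical table T with int columns ID and x (the class name, method name and batch size are illustrative, not taken from the question):

```csharp
using System.Data;
using System.Data.SqlClient;

static class BatchedUpdateSketch
{
    // Sends the modified rows of 'changes' to the server in batches.
    // Table T(ID int, x int) is an assumption made for this example.
    public static int Save(DataTable changes, string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var update = new SqlCommand(
            "UPDATE T SET x = @x WHERE ID = @ID", connection))
        using (var adapter = new SqlDataAdapter())
        {
            // The last argument maps each parameter to a DataTable column.
            update.Parameters.Add("@x", SqlDbType.Int, 0, "x");
            update.Parameters.Add("@ID", SqlDbType.Int, 0, "ID");

            // Batching requires None or OutputParameters here, because
            // per-row result sets cannot be mapped back to the DataRows.
            update.UpdatedRowSource = UpdateRowSource.None;

            adapter.UpdateCommand = update;
            adapter.UpdateBatchSize = 100; // 1 disables batching, 0 means "as many as possible"

            // The adapter opens and closes the connection itself if it is closed.
            return adapter.Update(changes);
        }
    }
}
```

With UpdateBatchSize left at its default of 1, the same code sends one round trip per modified row, which is the case the question compares against.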

That said, you can go faster than this yourself: transfer the update data using a table-valued parameter (TVP) and issue a single UPDATE that updates many rows. That way you save all the per-batch, per-round-trip and per-statement overhead.

Such a query would look like this:

update T set T.x = src.x from T join @src as src on T.ID = src.ID
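
On the client side, the TVP can be passed as a DataTable in a parameter of type SqlDbType.Structured. The following is only a sketch under assumptions: the table type dbo.SrcRows, the class and method names, and the int column types are all hypothetical. The table variable is given an alias in the query because T-SQL does not accept @src.x as a column qualifier.

```csharp
using System.Data;
using System.Data.SqlClient;

static class TvpUpdateSketch
{
    // Assumes a table type was created beforehand, for example:
    //   CREATE TYPE dbo.SrcRows AS TABLE (ID int PRIMARY KEY, x int);
    // 'rows' must have its columns in the same order as the type: ID, x.
    public static int Save(DataTable rows, string connectionString)
    {
        const string sql =
            "UPDATE T SET T.x = src.x " +
            "FROM T JOIN @src AS src ON T.ID = src.ID;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(sql, connection))
        {
            // Structured marks the parameter as a table-valued parameter;
            // TypeName is needed because this is a text command, not a stored procedure.
            SqlParameter src = command.Parameters.Add("@src", SqlDbType.Structured);
            src.TypeName = "dbo.SrcRows";
            src.Value = rows;

            connection.Open();
            // One round trip and one statement, regardless of the row count.
            return command.ExecuteNonQuery();
        }
    }
}
```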
usr
  • Good answer; although the last statement does depend on the nature of the update. I'd possibly add a link to ILSpy and highlight that the SqlCommandSet is a private field in `SqlDataAdapter`. – dash Dec 19 '12 at 23:10
  • @dash I cannot think of a case where this would not be the fastest way for many-row updates. Can you think of one? – usr Dec 19 '12 at 23:12
  • Sorry, I wasn't clear - I agree, I meant to say in the context of the original post that if you've updated many different rows with many different values then there isn't really a single SQL statement you can use (is there?!) – dash Dec 19 '12 at 23:13
  • @dash, you can say `update T set T.x = src.x from T join @src as src on T.ID = src.ID`. That updates everything at once, pulling data from the TVP. *Very* efficient for index and indexed-view maintenance. – usr Dec 19 '12 at 23:15
  • I see! I haven't come across TVP before. I shall now (ritually ab)use them - being able to supply a table to a stored proc or function is *very* useful. http://www.mssqltips.com/sqlservertip/1483/using-table-valued-parameters-tvp-in-sql-server-2008/ – dash Dec 19 '12 at 23:16