
I have a project that involves recording data from a device directly into a SQL table.

I do very little processing in code before writing to SQL Server (2008 Express, by the way).

Typically I use the SqlHelper class's ExecuteNonQuery method and pass in a stored proc name and the list of parameters that the SP expects.

This is very convenient, but I need a much faster way of doing this.

Thanks.

Charles Okwuagwu

7 Answers


ExecuteNonQuery with an INSERT statement, or even a stored procedure, will get you into the thousands-of-inserts-per-second range on Express. 4000-5000/sec is easily achievable; I know this for a fact.

What usually slows down individual inserts is the wait for the log flush, and you need to account for that. The easiest solution is to simply batch your commits: e.g. commit every 1000 inserts, or every second. This fills up the log pages and amortizes the cost of the log-flush wait over all the inserts in a transaction.
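A minimal T-SQL sketch of the idea (the table and procedure names are hypothetical): wrap many inserts in one explicit transaction so the log-flush wait is paid once per batch rather than once per row.

```sql
-- Hypothetical proc/table names. Without the explicit transaction,
-- each EXEC runs in its own implicit transaction and waits on its
-- own log flush; wrapped like this, the flush is paid once at COMMIT.
BEGIN TRANSACTION;
    EXEC dbo.usp_InsertReading @DeviceId = 1, @Value = 42.0;
    EXEC dbo.usp_InsertReading @DeviceId = 1, @Value = 42.1;
    -- ... repeat for up to ~1000 readings, or up to ~1 second's worth ...
COMMIT TRANSACTION;  -- single log flush amortized over the whole batch
```

From ADO.NET you get the same effect by running the commands on a SqlTransaction and committing every N inserts.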

With batch commits you'll probably bottleneck on disk log write performance, and there is nothing you can do about that short of changing the hardware (e.g. a RAID 0 stripe for the log).

If you hit earlier bottlenecks (unlikely) then you can look into batching statements, i.e. sending one single T-SQL batch with multiple inserts in it. But this seldom pays off.

Of course, you'll need to reduce the size of your writes to a minimum: reduce the width of your table to the minimally needed columns, eliminate non-clustered indexes, and eliminate unneeded constraints. If possible, use a heap instead of a clustered index, since heap inserts are significantly faster than clustered-index ones.
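As a sketch of what such a minimal target table might look like (the table and column names are made up for illustration): no primary key or clustered index, so it is a heap, and no non-clustered indexes or extra constraints.

```sql
-- Hypothetical narrow staging table for device readings.
-- No PRIMARY KEY / clustered index => this is a heap;
-- no non-clustered indexes, no constraints beyond NOT NULL.
CREATE TABLE dbo.DeviceReadings
(
    DeviceId int          NOT NULL,
    ReadAt   datetime2(3) NOT NULL,
    Value    float        NOT NULL
);
```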

There is little need to use the fast insert interface (i.e. SqlBulkCopy). Using ordinary INSERTs and ExecuteNonQuery with batch commits, you'll exhaust the drive's sequential write throughput long before you need to deploy bulk insert. Bulk insert is needed on fast SAN-connected machines, and you mention Express, so that's probably not your case. There is a perception to the contrary out there, but it exists simply because people don't realize that bulk insert gives them batch commit, and it's the batch commit that speeds things up, not the bulk insert.

As with any performance test, make sure you eliminate randomness, and preallocate the database and the log: you don't want to hit a db or log growth event during test measurements or in production, that is sooo amateurish.
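Preallocation is one ALTER DATABASE per file; the database name, logical file names, and sizes below are hypothetical, so substitute your own (sys.master_files lists the logical names).

```sql
-- Hypothetical database/file names and sizes: grow the data and log
-- files up front so no autogrow event fires mid-test or mid-production.
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb,     SIZE = 2048MB);
ALTER DATABASE MyDb MODIFY FILE (NAME = MyDb_log, SIZE = 1024MB);
```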

Eric J.
Remus Rusanu
  • lol @ "that is sooo amateurish...." @Ramus - Thanks man, you totally nailed it. Please clarify on "ExecuteNonQuery". Do you mean ExecuteNonQuery method on the sqlhelper class in Microsoft.ApplicationBlocks.Data? – Charles Okwuagwu May 19 '10 at 00:47
  • I mean the basic SqlCommand http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.executenonquery.aspx, but I think the Application Framework one is very similar in every aspect. One thing you gotta make absolutely sure of is that Application Framework does *not* enroll your connection into a distributed transaction; that would just sink any performance. – Remus Rusanu May 19 '10 at 01:27
  • Just checking your nomenclature - by "batch commit" do you mean wrapping every thousand inserts in a transaction? This is what I've been doing, and it's a hassle handling the occasional indigestible insert which takes out the whole batch, I basically build an exclusion-list and retry until it goes through. Or do you mean a bunch of insert statements followed by a GO statement? And if so, how can I do that from ADO.NET? Do I have to build a sodding great string and run it using ExecuteNonQuery ? – Peter Wone Jul 30 '12 at 06:56
  • @RemusRusanu Now that we have the Async methods in sqlclient 4.5, I guess you no longer need to batch every 1000 or so; you can just ExecuteNonQueryAsync and get better insert speeds. – Charles Okwuagwu Jan 02 '14 at 18:53
  • 1
    @CharlesOkwuagwu At least in my tests, I don't see a difference in speed between async and synchronous, while batched is about 30% faster. – Alexander Jul 03 '17 at 14:08

BULK INSERT would be the fastest, since it is minimally logged.

.NET also has the SqlBulkCopy Class

SQLMenace

Here is a good way to insert a lot of records using table variables...

...but it is best to limit it to 1000 records at a time, because table variables are "in memory".

In this example I will insert 2 records into a table with 3 fields - CustID, Firstname, Lastname

--first create an in-memory table variable with the same structure
--you could also use a temporary table, but it would be slower
declare @MyTblVar table (CustID int, FName nvarchar(50), LName nvarchar(50))

insert into @MyTblVar values (100, 'Joe', 'Bloggs')
insert into @MyTblVar values (101, 'Mary', 'Smith')

insert into MyCustomerTable
select * from @MyTblVar
Mike Gledhill
user2696172
  • Is this really more efficient than directly inserting into a database table? (That's a genuine question, not a criticism..) – Mike Gledhill Oct 31 '16 at 12:18

This is typically done by way of a BULK INSERT. Basically, you prepare a file and then issue the BULK INSERT statement, and SQL Server copies all the data from the file to the table with the fastest method possible.
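A minimal sketch of the statement (the file path, table name, and delimiters are hypothetical placeholders for your own):

```sql
-- Hypothetical table and file path; loads the whole file in one
-- operation. TABLOCK helps qualify the load for minimal logging.
BULK INSERT dbo.DeviceReadings
FROM 'C:\data\readings.csv'
WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\n', TABLOCK);
```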

It does have some restrictions (for example, there's no way to do "update or insert" type of behaviour if you have possibly-existing rows to update), but if you can get around those, then you're unlikely to find anything much faster.

Dean Harding

If you mean from .NET then use SqlBulkCopy

Mitch Wheat

Things that can slow inserts down include indexes and reads or updates (locks) on the same table. You can speed up situations like yours by avoiding both: insert individual transactions into a separate holding table with no indexes or other activity, then periodically batch the holding table's rows over to the main table.
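One way to sketch the "batch over to the main table" step (table and column names are hypothetical) is a single DELETE with an OUTPUT clause, which moves and clears the holding rows in one atomic statement:

```sql
-- Hypothetical names: drain the unindexed holding table into the main
-- table. DELETE ... OUTPUT INTO moves the rows atomically, so nothing
-- is lost or double-copied between the copy and the cleanup.
DELETE dbo.HoldingReadings
OUTPUT deleted.DeviceId, deleted.ReadAt, deleted.Value
INTO dbo.MainReadings (DeviceId, ReadAt, Value);
```

Run on a schedule (e.g. a SQL Agent job), this avoids a separate INSERT-then-DELETE pair and the window between them.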

Joel Coehoorn
  • @Mitch: This is news to me, can you give us an example? – Aaronaught May 19 '10 at 00:19
  • @Joel: Your suggestion makes sense. Basically I hold open an ADO.net connection, and keep using sqlcommand objects to insert records as fast as the device produces them. However: 1) Will raw sql inserts be a better option than using a stored proc in this case? 2) Will this holding table need to have a clustered index, like an auto-incrementing ID col? 3) Say I run a SQL Agent script every 2 seconds to batch the holding table to the main table; won't the deletes on the holding table after copying each batch further degrade performance? Thanks – Charles Okwuagwu May 19 '10 at 00:29
  • @Aaronaught - by helping other queries run faster and therefore reducing locking on the table. – Joel Coehoorn May 19 '10 at 00:42
  • 3
    Indexes might speed up the checking of foreign key constraints. – meriton May 19 '10 at 01:52

It can only really go as fast as your SP will run. Ensure that the table(s) are properly indexed and if you have a clustered index, ensure that it has a narrow, unique, increasing key. Ensure that the remaining indexes and constraints (if any) do not have a lot of overhead.

You shouldn't see much overhead in the ADO.NET layer (I wouldn't necessarily use any other .NET library above SQLCommand). You may be able to use ADO.NET Async methods in order to queue several calls to the stored proc without blocking a single thread in your application (this potentially could free up more throughput than anything else - just like having multiple machines inserting into the database).

Other than that, you really need to tell us more about your requirements.

Cade Roux
  • solid suggestion, stick to pure ADO.net only, and possibly hold my connection open as long as possible. Building the parameters for the SP and then calling the SP and passing parameters seems tedious for a simple insert operation. Would you recommend sending insert statements directly from ADO.net? – Charles Okwuagwu May 19 '10 at 00:37
  • @CharlesO - I would use SQLCommand, add the parameters and ExecuteNonQuery (I would not send the 'INSERT INTO blah VALUES (blah)' literal string) - and I would definitely think about using the async version: BeginExecuteNonQuery (http://msdn.microsoft.com/en-us/library/system.data.sqlclient.sqlcommand.beginexecutenonquery.aspx) with some sort of governor on how many active inserts and a queue internal to the program. You can then see if it can handle your expected workload or if you need more extensive storing and forwarding. – Cade Roux May 19 '10 at 02:20