1

I have a simple task - to read data from SQL table, process it, and once finished - update a field in DB (set processed=true), so every record should be processed only once.

Can you advise me, what technique can I use to do it multi-threaded? It means, read data from table in multiple threads.

Phil
  • 42,255
  • 9
  • 100
  • 100
berliner
  • 1,887
  • 3
  • 15
  • 23
  • You probably don't need multiple threads. why do you think you do? – John Saunders Mar 03 '13 at 21:09
  • @JohnSaunders Why not? If the data is not dependent on each other (I.e, setting first N records has no bearing on processing of subsequent N record batches) then this is an ideal candidate for multi threading. If there is a long running query, generally splitting it up will reduce overall processing time. – Dmitriy Khaykin Mar 04 '13 at 01:38
  • @DavidKhaykin: How much disk I/O is involved? And are you sure that the database can't multi-thread better than you can? – John Saunders Mar 04 '13 at 02:14
  • @JohnSaunders - depends on the DB design, if it is a cluster, how the DB may be split up into file groups, etc., etc. In the simple sense, one can say that the DB (MS SQL server) is designed to handle multiple concurrent requests efficiently, and by multi threading (I.e. sending multiple requests), you are taking advantage of this capability. If you send one request for 1,000,000 rows from one table, vs. 10 concurrent requests for 100,000 rows each, you are in the latter simulating 10 clients hitting the DB concurrently - it won't do this for you for just one query pulling a large recordset. – Dmitriy Khaykin Mar 04 '13 at 03:48

2 Answers2

2

If you want to read data in multiple threads without overlapping then you might try paging. You could have each thread read a different page. For example, you could have the first thread read the first 20 records, process each of those records, and then set processed=true for each, while the second thread is doing the same for the next 20 records and so on.

Check out this link for more info on paging in multiple threads. http://ericniemiec.wordpress.com/2010/06/10/paging-records-in-sql-server-2008-for-processing-database-records-on-different-threads-in-c/

  • 1
    Yep, I would just add that the batch size should be dependent on the overall size of the expected recordset so that some trial and error type tweaking can reveal the ideal number of batches to break up the data into. – Dmitriy Khaykin Mar 04 '13 at 01:39
1

If you use ef and use DataContext you should know DataContext is not thread safe.Database Management Systems like MsSQL Is thread safe. i prefer you that read this question

Community
  • 1
  • 1