
I have some code that fetches a relatively large amount of data from my SQL db (approx. 200k records, 7 columns), which I then have to act upon and update/insert into other tables.

My initial iteration opens a SqlDataReader and loops over it. It appears, though, that this holds an open transaction on the db and can cause locking issues when the process takes several hours to run. The data is fetched via a SQL stored proc, which I'm fairly confident is well optimised. The processing for each record is fairly intensive.

My pseudo code:

string sql = "EXEC StoredProc"
sqlConn.Open();
SqlCommand sqlComm = new SqlCommand(sql, sqlConn);
SqlDataReader reader = sqlComm.ExecuteReader();

//loop through products
while (reader.Read())    
{
    // do stuff
}

Can I pull the SqlDataReader contents into an array or List<> to take the data "offline", so the db is freed from waiting while all the code in between runs several hundred thousand times? Will that improve matters, or make it worse because there will be so much data held in memory?
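
Something like the sketch below is what I have in mind - Product, connectionString, and the column mappings are just stand-ins for my real schema:

using System.Collections.Generic;
using System.Data.SqlClient;

class Product
{
    public int Id;
    public string Name;
    // ...fields for the remaining five columns
}

// drain the reader into memory up front so the connection closes quickly
var rows = new List<Product>();

using (SqlConnection sqlConn = new SqlConnection(connectionString))
using (SqlCommand sqlComm = new SqlCommand("EXEC StoredProc", sqlConn))
{
    sqlConn.Open();
    using (SqlDataReader reader = sqlComm.ExecuteReader())
    {
        while (reader.Read())
        {
            rows.Add(new Product
            {
                Id = reader.GetInt32(0),
                Name = reader.GetString(1)
                // ...map the remaining columns
            });
        }
    }
} // connection closed and locks released here

foreach (Product p in rows)
{
    // do stuff - the heavy per-record processing now runs
    // without holding anything open on the database
}

That way the connection is only open for the duration of the fetch - but the whole 200k-row set lives in memory while I process it.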

chilluk
  • There's a lot happening here that seems like a really bad idea. You open a recordset through a procedure, then you loop (lord no) through that huge recordset and insert (tell me it's not true) into another table... as well as some other things. Are you sure this end-to-end design is the right decision? I suppose you could loop through the recordset and push it into an array, but then you have a 200k-element array on whatever poor system this is running on. – JNevill Sep 06 '17 at 16:37
  • I hear you - basically I need to read each and every record of my original table, run some of the values through some other code, and write those new values to another table. I can't do this inside SQL because the processing code is pretty complex and is quite a chunk of C#. I guess I could batch the process into smaller chunks, but ultimately I have to process it all! – chilluk Sep 06 '17 at 20:34
  • Now that I've seen your response to JNevill's comment, I can add that you might get the best results from writing a CLR procedure. – Tab Alleman Sep 08 '17 at 18:53

1 Answer


JNevill raises some important questions, but I'll treat your question as academic for the moment.

Can I pull the SqlDataReader contents into an array or List<> to take the data "offline", so the db is freed from waiting while all the code in between runs several hundred thousand times?

Of course you can.

Will that improve matters, or make it worse because there will be so much data held in memory?

Well, that depends on which is worse for you: having your table locked by a DataReader for as long as the processing takes, or having your entire dataset in memory. Neither one is universally better or worse; it depends on the effect each has on your business and your users. Whichever hurts you more in practice is the one to avoid.

If both are unacceptable, you can always take option C: read the data and immediately write it to a flat file on disk (using the System.IO classes), so it's neither keeping the SQL table locked nor holding the data in memory. Then you can do your heavy processing line by line from a StreamReader, and finally write from the flat file to your database.
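
A rough sketch of that approach, assuming a tab-delimited temp file and a connectionString variable (the two GetValue calls are placeholders for your seven columns):

using System.Data.SqlClient;
using System.IO;

string tempPath = Path.GetTempFileName();

// pass 1: dump the rows to disk and release the connection quickly
using (SqlConnection conn = new SqlConnection(connectionString))
using (SqlCommand cmd = new SqlCommand("EXEC StoredProc", conn))
{
    conn.Open();
    using (SqlDataReader reader = cmd.ExecuteReader())
    using (StreamWriter writer = new StreamWriter(tempPath))
    {
        while (reader.Read())
        {
            // one tab-delimited line per record; escape delimiters if
            // your data can contain tabs or newlines
            writer.WriteLine(string.Join("\t", reader.GetValue(0), reader.GetValue(1)));
        }
    }
} // the table is no longer locked from this point on

// pass 2: heavy processing, line by line, with no open connection
using (StreamReader file = new StreamReader(tempPath))
{
    string line;
    while ((line = file.ReadLine()) != null)
    {
        string[] fields = line.Split('\t');
        // process fields here, then write the results back to the db
    }
}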

Keep in mind that if you don't keep the table locked during processing, changes could be made to the table by other users, and those changes would be lost when you overwrite the table with your processed data. Again, you have to determine which is worse in your case.

Tab Alleman
  • I guess, as above, I could batch this into smaller chunks, but I still need to eventually iterate over every single record in my original table. Using the filesystem is a way to "cache" the data away from the db - it doesn't matter if other changes happen to the main table in the meantime. – chilluk Sep 06 '17 at 20:36
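
For completeness, a sketch of the batching idea from the comments. This assumes the source table has an indexed numeric Id column and that a plain query can replace (or parameterise) the stored proc - SourceTable and Col1 are hypothetical names, so adjust to your schema:

using System.Data.SqlClient;

const int BatchSize = 5000;
int lastId = 0;
bool gotRows = true;

while (gotRows)
{
    gotRows = false;
    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand(
        "SELECT TOP (@batch) Id, Col1 FROM SourceTable " +
        "WHERE Id > @lastId ORDER BY Id", conn))
    {
        cmd.Parameters.AddWithValue("@batch", BatchSize);
        cmd.Parameters.AddWithValue("@lastId", lastId);
        conn.Open();
        using (SqlDataReader reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                lastId = reader.GetInt32(0);
                gotRows = true;
                // heavy per-record processing here; each batch holds
                // the connection open only briefly
            }
        }
    }
}

Keyset paging on Id (rather than OFFSET) keeps each batch query cheap, and every record still gets processed exactly once.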