5

I have a for loop inside of which

First : I want to compute the SQL required to run

Second : Run the SQL asynchronously without waiting for them individually to finish in a loop

My code looks like:

for (
        int i = 0;
        i < gm.ListGroupMembershipUploadDetailsInput.GroupMembershipUploadInputList.Count;
        i++)
{
     // Compute
     SQL.Upload.UploadDetails.insertGroupMembershipRecords(
         gm.ListGroupMembershipUploadDetailsInput.GroupMembershipUploadInputList[i],max_seq_key++,max_unit_key++,
         out strSPQuery,
         out listParam);

     //Run the out SPQuery async
     Task.Run(() => rep.ExecuteStoredProcedureInputTypeAsync(strSPQuery, listParam));
}

The insertGroupMembershipRecords method in a separate DAL class looks like :

public static GroupMembershipUploadInput insertGroupMembershipRecords(GroupMembershipUploadInput gm, List<ChapterUploadFileDetailsHelper> ch, long max_seq_key, long max_unit_key, out string strSPQuery, out List<object> parameters)
{
    GroupMembershipUploadInput gmHelper = new GroupMembershipUploadInput();
    gmHelper = gm;
    int com_unit_key = -1;
    foreach(var item in com_unit_key_lst){
        if (item.nk_ecode == gm.nk_ecode)
            com_unit_key = item.unit_key;
    }

    int intNumberOfInputParameters = 42;
    List<string> listOutputParameters = new List<string> { "o_outputMessage" };
    strSPQuery = SPHelper.createSPQuery("dw_stuart_macs.strx_inst_cnst_grp_mbrshp", intNumberOfInputParameters, listOutputParameters);
    var ParamObjects = new List<object>();

    ParamObjects.Add(SPHelper.createTdParameter("i_seq_key", max_seq_key, "IN", TdType.BigInt, 10));
    ParamObjects.Add(SPHelper.createTdParameter("i_chpt_cd", "S" + gm.appl_src_cd.Substring(1), "IN", TdType.VarChar, 4));
    ParamObjects.Add(SPHelper.createTdParameter("i_nk_ecode", gm.nk_ecode, "IN", TdType.Char, 5)); 

    // rest of the method
   }

But in case of list Count of 2k which I tried,

It did not insert 2k records in DB but only 1.

Why this does not insert all the records the input list has ?

What am I missing ?

StrugglingCoder
  • 4,781
  • 16
  • 69
  • 103
  • 4
    I know it's not the answer - But I think you should look how to insert in a bulk - will be way more efficient that 2000 individual inserts. – Gilad Green Jul 07 '16 at 09:57
  • That would be great , but for now , in Teradata , I guess there is no direct support . – StrugglingCoder Jul 07 '16 at 09:58
  • try to move insertGroupMembershipRecords and the required variables into the scope of the task – nozzleman Jul 07 '16 at 10:01
  • 1
    Where are `strSPQuery` & `listParam` defined? – Enigmativity Jul 07 '16 at 10:10
  • @Enigmativity they are the out params from the Compute query. – StrugglingCoder Jul 07 '16 at 10:26
  • 2
    ADO.NET already provides asynchronous execution methods, eg. DbCommand.ExecuteAsync. Faking asynchronous execution with `Task.Run` is actually a bug. Furthermore, executing 2K inserts in parallel is *slower* than sending a single batch of 2K insert statements. Most ORMs support batching. Finally, the *fastest* way is to use BULK INSERT (through the SqlBulkCopy class) or use table-valued parameters, to call the stored procedure ONCE with a parameter that contains all 2K rows – Panagiotis Kanavos Jul 07 '16 at 10:33
  • In other words, trying to perform batch modidications using parallel execution is a smell – Panagiotis Kanavos Jul 07 '16 at 10:38
  • @StrugglingCoder - Yes, I get that. But where are they defined and what are their types? – Enigmativity Jul 07 '16 at 10:42
  • @Enigmativity out strSPQuery is of type string and listParam is of type List – StrugglingCoder Jul 07 '16 at 10:45
  • @StrugglingCoder - It's a struggle to get the full answer out of you - where are they defined? – Enigmativity Jul 07 '16 at 11:02
  • @Enigmativity .. in a separate DAL class. Please see the question updates. – StrugglingCoder Jul 07 '16 at 11:08
  • @StrugglingCoder - Could you try defining them inside the `for` loop and see if your problem goes away? – Enigmativity Jul 07 '16 at 11:13
  • I see that Teradata documentation is a bit ... challenged regarding bulk operations. If it doesn't have TVPs either, I'd try to execute statements in parallel using a *limited* set of connections to prevent performance degradation due to contention and locking. 10 concurrent connections are actually *faster* than 2000. The crudest attempt would be to use `Parallel.ForEach` with a DOP of eg 10. A somewhat better attempt would be to use `ActionBlock` with a MaxDOP of 10 ... – Panagiotis Kanavos Jul 07 '16 at 11:13
  • ... possibly with a BatchBlock in front to batch the inputs into arrays of eg 100 or 500 rows. In this case, the ActionBlock's action could open a connection, send all 100 rows then close it. This would guarantee controlled parallel execution – Panagiotis Kanavos Jul 07 '16 at 11:16
  • @PanagiotisKanavos Well , not much idea with those. Could you please help with Parallel.ForEach . I am not getting how to compute and then execute via Parallel.ForEach? – StrugglingCoder Jul 07 '16 at 11:18
  • @StrugglingCoder - Could you try defining them inside the `for` loop and see if your problem goes away? – Enigmativity Jul 07 '16 at 11:26

1 Answers1

2

Task.Run in a for loop

Even though this is not the question, the title itself is what I'm going to address. For CPU bound operations you could use Parallel.For or Parallel.ForEach, but since we are IO bound (i.e.; database calls) we should rethink this approach.

The obvious answer here is to create a list of tasks that represent the asynchronous operations and then await them using the Task.WhenAll API like this:

public async Task InvokeAllTheSqlAsync()
{
    var list = gm.ListGroupMembershipUploadDetailsInput.GroupMembershipUploadInputList;

    var tasks = Enumerable.Range(0, list.Count).Select(i =>
    {
        var value = list[i];
        string strSPQuery;
        List<SqlParameter> listParam;
        SQL.Upload.UploadDetails.insertGroupMembershipRecords(
            value, 
            max_seq_key++,
            max_unit_key++,
            out strSPQuery,
            out listParam
        );

        return rep.ExecuteStoredProcedureInputTypeAsync(strSPQuery, listParam);
    });

    await Task.WhenAll(tasks);
}
David Pine
  • 23,787
  • 10
  • 79
  • 107
  • 1
    The solution above is good, I only want to add my two cents to answer the actual question :). The problem with the loop in the original question is that it **does not wait for the SQL operations**. It only fires up the 2K tasks and goes on with the program flow - maybe the app will even end, before completing the db statements. My guess is that those threads will abort without inserting all the records. This is why the solution above is correct: you have to "await" (not wait) for all the tasks. Not wait for them on-by-one, but "await" for all of them at the end. – Miklós Tóth Oct 11 '16 at 20:08