
I am posting this partly out of interest in how the Task Parallel Library works and to spread knowledge, and also to investigate whether my "Cancellation" updates are the reason for a new issue where the user is suddenly logged out.

The project I am working on has these components:

  • Web Forms site. A website that acts as a portal for administering company vehicles. Further referred to as "Web"
  • WCF web service. A backend service on a separate machine. Further referred to as "Service"
  • Third party service. Further referred to as "3rd"

Note: I am using .NET 4.0. Therefore the newer updates to the Task Parallel Library are not available.


The issue that I was assigned to fix was that the login function was very slow and CPU intensive. This was later admitted to be a problem in the Third party service. However, I tried to optimize the login behavior as well as I could.

The login request and response don't contain particularly much data, but to gather the response data several API calls are made to the Third party service.

1. Pre changes

The Web invokes a WCF method on the Service for gathering "session data". This method would sometimes take so long that it would time out (I think the timeout was set to 1 minute).

A pseudo representation of the "GetSessionData" method:

var agreements = getAgreements(request);
foreach (var agreement in agreements)
{
    getAgreementDetails(agreement);
    var customers = getCustomersWithAgreement(agreement);
    foreach (var customer in customers)
    {
        getCustomerInfo(customer);
        getCustomerAddress(customer);
        getCustomerBranches(customer);
    }
}

var person = getPerson(request);
var accounts = getAccount(person.Id);
foreach (var account in accounts)
{
    var accountDetail = getAccountDetail(account.Id);
    foreach (var vehicle in accountDetail.Vehicles)
    {
        getCurrentMilageReport(vehicle.Id);
    }
}
return sessionData;

See gist for code snippet.

This method quickly becomes heavy the more agreements and accounts the user has.

2. Parallel.ForEach

I figured that I could replace the foreach loops with Parallel.ForEach(). This greatly improved the speed of the method for users with many agreements and accounts.

See gist for code snippet.
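Roughly, the outer loops end up looking like this (a simplified sketch based on the pseudo code above, not the exact gist contents; the helper names are the same placeholders):

var agreements = getAgreements(request);
Parallel.ForEach(agreements, agreement =>
{
    // Per-agreement calls now run concurrently instead of one agreement at a time
    getAgreementDetails(agreement);
    var customers = getCustomersWithAgreement(agreement);
    Parallel.ForEach(customers, customer =>
    {
        getCustomerInfo(customer);
        getCustomerAddress(customer);
        getCustomerBranches(customer);
    });
});

var person = getPerson(request);
var accounts = getAccount(person.Id);
Parallel.ForEach(accounts, account =>
{
    var accountDetail = getAccountDetail(account.Id);
    Parallel.ForEach(accountDetail.Vehicles, vehicle =>
    {
        getCurrentMilageReport(vehicle.Id);
    });
});

Each get* call here is still a blocking call to the Third party service; the parallelism just lets several of them be in flight at once.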

3. Cancel

Another problem we had was that when the web service's server is maxed out on CPU usage, all method calls become much slower and can result in a timeout for the user. A popular response to a timeout is to try again, so the user triggers another login attempt, which is "queued"(?) due to the high CPU usage, all while the first request has not returned yet.

We discovered that the request is still alive on the Service even if the web site times out. So we decided to implement a similar timeout on the Service side.

See gist for code snippet.

The idea is that GetSessionData(..) is invoked with a CancellationToken that triggers Cancel after about the same time as the Web timeout, so that no work is done if no one is there to show or use the results.
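Since .NET 4.0 does not have CancellationTokenSource.CancelAfter, the Service-side timeout can be wired up with a plain Timer. A minimal sketch of the idea (the 50-second value and the way the token is threaded through are illustrative assumptions, not the exact gist code):

var cts = new CancellationTokenSource();
// Cancel shortly before the Web timeout would hit (the value here is just an example)
var timer = new System.Threading.Timer(_ => cts.Cancel(), null, 50000, Timeout.Infinite);
try
{
    var options = new ParallelOptions { CancellationToken = cts.Token };
    var agreements = getAgreements(request);
    Parallel.ForEach(agreements, options, agreement =>
    {
        cts.Token.ThrowIfCancellationRequested();
        getAgreementDetails(agreement);
        // ... the rest of the per-agreement work, observing the same token ...
    });
}
catch (OperationCanceledException)
{
    // The Web side has already given up; abandon the remaining work
}
finally
{
    timer.Dispose();
}

Passing the token through ParallelOptions makes Parallel.ForEach stop scheduling new iterations once Cancel is called, while the ThrowIfCancellationRequested calls stop iterations that are already running.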

I also implemented the cancellation for the method calls to the Third party service.

Is it correct to share the same CancellationToken for all of the loops and service calls? Could there be an issue when all threads are "aborted" by throwing the cancel exception?

See gist for code snippet.

  • if getting data times out over 1 minute you must be loading way more information than you need. I load over 36 thousand records across 425 different tables and it doesn't take more than 7 seconds. Only load what's necessary; it will help on that point. Your loops can very likely be replaced with an inner join each. – Franck Aug 16 '17 at 12:54
  • @Franck The thing is that I am dependent on the Third party service which currently (pre patch) is slow when under pressure. I could refactor the website to only ask for the info when needed, but that would require a large overhaul of the current implementation. Now we are looking for a quick solution for fixing the performance issues – LazyTarget Aug 16 '17 at 13:37
  • @LazyTarget updated my answer – cassandrad Aug 17 '17 at 09:37
  • A couple of recommendations: profile your call to find the **real** perf bottlenecks. Use O(1) lookup collections - i.e. use `Dictionary` or `HashSet` to determine whether you've already seen a particular object instead of using LINQ. Limit the degree of parallelism of your `Parallel.ForEach` to avoid it creating new threads continuously when it encounters a blocking call (i.e. `lock`, `Task.Wait` or external service IO), as this will saturate your thread pool. – Kirill Shlenskiy Aug 17 '17 at 09:57

1 Answer


Is it correct to share the same CancellationToken for all of the loops and service calls? Could there be an issue when all threads are "aborted" by throwing the cancel exception?

Yes, it is correct. And yes, there could be an issue with throwing a lot of exceptions at the same time, but only in specific situations with a huge amount of parallel work.

Several hints:

  • Use one CancellationTokenSource per complete action, for example per request. Pass the same CancellationToken from that source to every asynchronous method
  • You can avoid throwing an exception and just return from the method. Later, to check that the work was done and nothing was cancelled, check IsCancellationRequested on the CancellationTokenSource
  • Check the token for cancellation inside loops on each iteration and just return if cancelled (see the sketch after this list)
  • Use threads only when there is IO work, for example when you query something from a database or make requests to other services; don't use them for CPU-bound work
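For example, a loop can simply bail out instead of throwing (a rough sketch that reuses the method names from your pseudo code; `token` is the token taken from the per-request CancellationTokenSource):

foreach (var agreement in getAgreements(request))
{
    if (token.IsCancellationRequested)
        return;                          // stop quietly, no OperationCanceledException

    getAgreementDetails(agreement);
}

// Afterwards the caller checks cts.IsCancellationRequested to decide whether
// the session data is complete or should be discarded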

Edit: I was tired at the end of the working day and suggested a bad thing. Mainly, you don't need threads for IO-bound work, for example for waiting for a response from a database or a third-party service. Use threads only for CPU computations.

Also, I reviewed your code again and found several bottlenecks:

  1. You can call GetAgreementDetail, GetFuelCards, GetServiceLevels and GetCustomers asynchronously; don't wait for each one before starting the next, run all four requests at the same time (see the sketch after this list)
  2. You can call GetAddressByCustomer and GetBranches in parallel as well
  3. I noticed that you use a mutex. I guess it is for protecting agreementDto.Customers and response.Customers on addition. If so, you can reduce the scope of the lock
  4. You can start the work with Vehicles earlier, as you know the UserId at the beginning of the method; do it in parallel too
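For point 1, something along these lines works on .NET 4.0 (a sketch only; I am guessing at your method signatures, and `token` is again the per-request token):

// Start all four third-party requests at once instead of one after another
var detailTask    = Task.Factory.StartNew(() => GetAgreementDetail(agreement), token);
var fuelCardTask  = Task.Factory.StartNew(() => GetFuelCards(agreement), token);
var levelsTask    = Task.Factory.StartNew(() => GetServiceLevels(agreement), token);
var customersTask = Task.Factory.StartNew(() => GetCustomers(agreement), token);

// Wait for all four together; this throws OperationCanceledException if the token is cancelled
Task.WaitAll(new Task[] { detailTask, fuelCardTask, levelsTask, customersTask }, token);

var customers = customersTask.Result;    // safe to read once WaitAll has returned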