25

I'm currently working on an application that relies on many different web services to get data. Since I want to modularize each service and have some dependencies between them (service 1 must run before services 2 and 3, etc.), I'm running each service in its own task.

The tasks themselves are either

  1. running actively, meaning they're sending their request to the web service and are waiting for a response or processing the response

  2. waiting (via monitor and timeout) - once any task finishes, all waiting tasks wake up and check whether their dependencies have finished (see the sketch after this list)
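
Here's a minimal sketch of that wait-and-check coordination, just to make the pattern concrete. The names (DependencyGate, MarkFinished, WaitFor, the "service1" keys) are hypothetical and not taken from the actual application:

    // A minimal sketch of the wait-and-check coordination described above.
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class DependencyGate
    {
        private readonly object _lock = new object();
        private readonly HashSet<string> _finished = new HashSet<string>();

        // Called by a task when its service has completed.
        public void MarkFinished(string service)
        {
            lock (_lock)
            {
                _finished.Add(service);
                Monitor.PulseAll(_lock); // wake every waiting task so it can re-check
            }
        }

        // Blocks until all named dependencies have finished, re-checking
        // whenever another task completes or the timeout elapses.
        public void WaitFor(params string[] dependencies)
        {
            lock (_lock)
            {
                while (!dependencies.All(d => _finished.Contains(d)))
                {
                    Monitor.Wait(_lock, TimeSpan.FromSeconds(5));
                }
            }
        }
    }

    class Example
    {
        static void Main()
        {
            var gate = new DependencyGate();

            // Service 1 runs first; services 2 and 3 wait for it.
            var t1 = Task.Run(() => { /* call service 1 */ gate.MarkFinished("service1"); });
            var t2 = Task.Run(() => { gate.WaitFor("service1"); /* call service 2 */ });
            var t3 = Task.Run(() => { gate.WaitFor("service1"); /* call service 3 */ });

            Task.WaitAll(t1, t2, t3);
        }
    }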

Now, the system is running with what I would call good performance (especially since the performance requirements are rather modest) - however, the application generates quite a number of tasks.

So, to my question: are ~200 tasks too many in this scenario? Do they generate so much overhead that a basically non-threaded approach would be better?

marc_s
Scurals
  • That probably depends on (1) the tasks that have to be done and (2) the granularity of the modules. – Willem Van Onsem Oct 13 '13 at 17:35
  • The tasks are mostly only sending requests to the web service, i.e. sending a request for Twitter feeds, with very minor processing (filtering tweets). I start a new task for each item, meaning about 1-30 tasks running "concurrently" and not waiting for dependencies --- normally about one module per web service (currently about 10-15 modules total). – Scurals Oct 13 '13 at 17:39
  • Looks to me like that will be feasible, since "running" merely means waiting for a response from the server... – Willem Van Onsem Oct 13 '13 at 17:41
  • The answer depends on what those tasks do. You said web requests - why not asynchronous? If you go asynchronous, then you don't need to worry about resources. – Sriram Sakthivel Oct 13 '13 at 18:05

1 Answer

21

The general answer is "Measure, Measure, Measure" :) If you're not experiencing any performance problems, you shouldn't start optimizing.

I'd say 200 tasks are fine, though. The beauty of tasks is their low overhead compared to "real" threads and even to the thread pool. The TaskScheduler makes sure all the hardware threads are utilized as much as possible, with the least amount of thread switching. It does this through various tricks, such as running child tasks serially, stealing work from the queues of other threads, and so on.
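
To make that intuition concrete, here's a minimal sketch of my own (not from the original scenario): 200 tasks that spend their time waiting, simulated with Task.Delay, complete on only a handful of thread-pool threads rather than 200 dedicated ones:

    // Sketch: 200 mostly-waiting tasks share a small number of thread-pool threads.
    using System;
    using System.Linq;
    using System.Threading;
    using System.Threading.Tasks;

    class Example
    {
        static async Task Main()
        {
            var tasks = Enumerable.Range(0, 200).Select(async i =>
            {
                await Task.Delay(1000);                      // stands in for waiting on a web service
                return Thread.CurrentThread.ManagedThreadId; // thread that ran the continuation
            });

            int[] threadIds = await Task.WhenAll(tasks);
            Console.WriteLine("200 tasks finished on {0} distinct threads",
                threadIds.Distinct().Count());
        }
    }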

You can also give the TaskScheduler some hints about what a specific task is going to do via TaskCreationOptions.
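
For example, here's a sketch of such a hint, where the Thread.Sleep stands in for a hypothetical long-blocking polling call:

    // Sketch: TaskCreationOptions.LongRunning hints that the task will block for
    // a long time, so the scheduler gives it a dedicated thread instead of
    // tying up a thread-pool thread.
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    class Example
    {
        static void Main()
        {
            // Stand-in for a call that blocks for a long time while polling a web service.
            var longPoll = Task.Factory.StartNew(
                () => Thread.Sleep(TimeSpan.FromSeconds(5)),
                TaskCreationOptions.LongRunning);

            longPoll.Wait();
            Console.WriteLine("long-running poll finished");
        }
    }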


If you want some numbers, check out this post; as you can see, the TPL is pretty cheap in terms of overhead:
.NET 4.0 - Performance of Task Parallel Library (TPL), by Daniel Palme

This is another interesting article on the subject:
CLR Inside Out: Using concurrency for scalability, by Joe Duffy

Theodor Zoulias
aL3891
  • The "another interesting article" is dead - do you remember what it referred to? – default Jan 12 '17 at 16:46
  • I believe the dead link referred to a September 2006 article, "CLR Inside Out: Using concurrency for scalability". Since the article is so old, MS archived it. You can try viewing online here: https://web.archive.org/web/20130608013159/http://msdn.microsoft.com/en-us/magazine/cc163552.aspx or you can download the archived CHM version here: http://download.microsoft.com/download/3/a/7/3a7fa450-1f33-41f7-9e6d-3aa95b5a6aea/MSDNMagazineSeptember2006en-us.chm (you may need to "unblock" the file before you can read the contents). –  Mar 14 '18 at 14:49
  • The link to that article seems to have been rerouted, so it works again now. – aL3891 Oct 23 '20 at 15:52