As far as I understand, PLINQ decides how many threads to open (each thread on a different core) based on the core count.

__________

  Core 1
  Core 2
  Core 3
  Core 4
___________

So if I have a PLINQ task which finds the first 1000 prime numbers, PLINQ will open a new thread on each core in order to maximize efficiency.

So here, each core will run the prime-finding logic on 1000/4 numbers.

However, I've read that blocking operations like IO should be used with WithDegreeOfParallelism, so that the CPU won't think this is a CPU-intensive operation, and it is allowed to use more threads than cores.

Questions:

1) Is this accurate? Did I understand it correctly?

2) If I set WithDegreeOfParallelism(7), it will definitely use all 4 cores, but what about the other 3 (7-4)? Where will they run? On which core(s)?

Royi Namir
  • 1
    You might want to look at WithExecutionMode too. If IO is the only factor that makes parallelism possible, then that increases the possibility that in examining the query, plinq makes the wrong call on whether it would be better of just processing it serially. Conversely though, it's also possible that if plinq decides that, it was right, so it's worth processing with and without. – Jon Hanna Aug 18 '12 at 19:04
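
To make that comment concrete, here is a minimal sketch of forcing parallel execution with `WithExecutionMode`. The `FakeIo` method is a hypothetical stand-in for a real blocking call, and the 50 ms sleep is an arbitrary assumption; whether forcing parallelism actually helps is something you would have to measure for your own query.

```csharp
using System;
using System.Linq;
using System.Threading;

static class ExecutionModeSketch
{
    // Hypothetical stand-in for a blocking IO call (e.g. a download).
    static int FakeIo(int id)
    {
        Thread.Sleep(50);
        return id * 2;
    }

    public static int[] Run(int count) =>
        Enumerable.Range(0, count)
            .AsParallel()
            .AsOrdered() // keep results in source order
            // Ask PLINQ to parallelize even if its own analysis would pick sequential execution.
            .WithExecutionMode(ParallelExecutionMode.ForceParallelism)
            .Select(id => FakeIo(id))
            .ToArray();

    static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(8)));
    }
}
```

Running the same query once with and once without the `WithExecutionMode` call, as the comment suggests, is probably the only reliable way to tell which is faster in a given case.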

1 Answer

First, .NET doesn't choose which core executes which thread; the OS does. If there is no other CPU-intensive application running on the system, you can expect that each thread will execute on a separate core. But if there is some other application, the OS might, for example, decide to run all of your threads on a single core, switching between them.

And it's even more complicated than that. A thread usually doesn't run on a single core; the OS switches it from core to core all the time. For example, have a look at the following screenshot from Task Manager showing the execution of a single-threaded CPU-intensive application.

[Image: Task Manager CPU usage screenshot]

You'll notice that the single thread executed on all of my 4 cores, and utilized approximately 25 % of each core over the few seconds it ran.

.NET has no knowledge of the CPU usage of your computer, so it assumes that the optimal number of threads doing CPU-intensive work is the same as the number of cores.

I don't know exactly how PLINQ works, but I wouldn't expect each thread to process exactly 1000/4 of the numbers in your example. If one thread has already finished its share and another one isn't done yet, it wouldn't be efficient to let the first thread stay idle.
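
If you want to see how the work was actually split on your machine, a rough sketch like the following counts how many candidates each thread tested. The class and method names are made up for illustration, and the per-thread numbers will differ between runs and machines, so I'm not showing any expected output.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;

static class PrimeShare
{
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Tests every number in [0, limit) for primality with a default PLINQ query.
    // Returns how many candidates each managed thread tested, keyed by thread id.
    public static ConcurrentDictionary<int, int> CandidatesPerThread(int limit, out int primeCount)
    {
        var perThread = new ConcurrentDictionary<int, int>();
        primeCount = Enumerable.Range(0, limit)
            .AsParallel()
            .Count(n =>
            {
                // Record which thread handled this candidate.
                perThread.AddOrUpdate(Environment.CurrentManagedThreadId, 1, (_, c) => c + 1);
                return IsPrime(n);
            });
        return perThread;
    }

    static void Main()
    {
        var perThread = CandidatesPerThread(200_000, out int primes);
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}, primes found: {primes}");
        foreach (var pair in perThread)
            Console.WriteLine($"Thread {pair.Key} tested {pair.Value} candidates");
    }
}
```

Note that this only tells you which threads did the work. It says nothing about which cores those threads ran on, because that is decided by the OS.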

And yes, with IO operations, the optimal number of threads doesn't depend on the number of cores, so you should set the degree of parallelism manually. (Don't forget that the optimal number of threads may be 1; hard disks are fastest with sequential reads, not when seeking back and forth between many files.)
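
As a rough illustration of the IO case, the sketch below times the same blocking workload with different degrees of parallelism. `FakeDownload` is a hypothetical placeholder that just sleeps for 100 ms (an arbitrary value), so this only simulates IO; real disks or networks may behave quite differently, and the thread pool may need a moment to start extra threads for the higher settings.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;

static class IoBoundSketch
{
    // Hypothetical stand-in for a blocking IO call; the thread sleeps instead of using the CPU.
    static int FakeDownload(int id)
    {
        Thread.Sleep(100);
        return id;
    }

    // Runs `count` simulated IO calls with the given degree of parallelism
    // and returns the elapsed wall-clock time.
    public static TimeSpan Time(int count, int degreeOfParallelism)
    {
        var sw = Stopwatch.StartNew();
        Enumerable.Range(0, count)
            .AsParallel()
            .WithDegreeOfParallelism(degreeOfParallelism)
            .Select(id => FakeDownload(id))
            .ToArray(); // force the query to execute
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        foreach (int dop in new[] { 1, 4, 16 })
            Console.WriteLine($"DOP {dop,2}: {Time(32, dop).TotalMilliseconds:F0} ms");
    }
}
```

Because the threads spend their time blocked rather than computing, going above the core count can still reduce the total time here, which is the reason for setting the value by hand.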

If you set WithDegreeOfParallelism(7), it will definitely use 7 threads (again, with no guarantee about the number of cores). The OS will decide how to run those 7 threads on your 4 cores. If all of those threads are CPU-intensive, it will most likely give each thread something like 4/7 ≈ 57 % of a core. If they are IO-bound, it will execute the code for a thread that just woke up (unblocked) on whichever core is available at that moment.

And WithDegreeOfParallelism() really does set the exact number of threads, not their maximum number; see Stephen Toub's "ParallelOptions.MaxDegreeOfParallelism vs PLINQ’s WithDegreeOfParallelism".
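
You can check the thread count yourself with something like the sketch below, which records the distinct managed thread ids seen by a query using `WithDegreeOfParallelism(7)`. The names and the 20 ms sleep are my own assumptions for illustration. The count you observe is an implementation detail of the runtime you are on, so I'm not asserting a particular output.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

static class SevenThreads
{
    // Returns the number of distinct managed threads that processed at least one item.
    public static int DistinctThreads(int degreeOfParallelism, int items)
    {
        var ids = new ConcurrentDictionary<int, bool>();
        Enumerable.Range(0, items)
            .AsParallel()
            .WithDegreeOfParallelism(degreeOfParallelism)
            .ForAll(_ =>
            {
                ids.TryAdd(Environment.CurrentManagedThreadId, true);
                Thread.Sleep(20); // simulated blocking work
            });
        return ids.Count;
    }

    static void Main()
    {
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
        Console.WriteLine($"Distinct threads with DOP 7: {DistinctThreads(7, 70)}");
    }
}
```

Whatever number is printed, nothing in this code (or in PLINQ) pins a thread to a core; the mapping of those threads onto your 4 cores is entirely up to the OS scheduler.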

Arithmomaniac
svick
  • Thanks, svick. So PLINQ automatically divides the data among the cores, and I can't know where the extra 3 will go, right? – Royi Namir Aug 19 '12 at 06:47
  • No, PLINQ doesn't do anything with cores, only with threads. PLINQ divides the data among *threads* not cores. Like I said, scheduling the threads to the cores is the job of the OS, not .Net. And you can't know where any of the threads go, there is no difference between the first 4 and the “extra” 3. – svick Aug 19 '12 at 10:12
  • I think I have a little problem understanding: `PLINQ doesn't do anything with cores, The OS does....` OK. But when I use PLINQ, the work **WILL** be divided among cores (let's assume execution mode = parallel), right? _PLINQ is a query execution engine that accepts any LINQ-to-Objects or LINQ-to-XML query and automatically utilizes multiple processors or cores for execution when they are available._ http://msdn.microsoft.com/en-us/magazine/cc163329.aspx – Royi Namir Aug 19 '12 at 10:34
  • That depends on the OS and other processes in the system. The OS will most likely divide the work of the threads among the cores, yeah. But it's still possible that you won't get any advantage from parallelization, if there are other CPU-intensive processes running. – svick Aug 19 '12 at 10:44
  • One thing is strange: if I specify `WithDegreeOfParallelism`, I actually tell the CPU that it is not a CPU-intensive operation, so if I tell it `WithDegreeOfParallelism(10)`, it can actually open all those 10 threads on a single core, right? – Royi Namir Aug 19 '12 at 13:00
  • You're not telling the CPU anything, you're just telling .Net that it shouldn't decide the number of threads automatically (this decision is based on the number of cores), you're telling it the number yourself. And the CPU then handles the threads PLINQ uses like any other threads. – svick Aug 19 '12 at 13:37
  • (I'm gonna ask till I understand :-)) You said: _the CPU then handles the threads PLINQ uses like any other threads...._ They're not just like any other threads; I told it to use AsParallel(), so it should effectively divide threads among cores (!). You make it sound like AsParallel is just for opening threads, no matter where. Please explain, svick :-) – Royi Namir Aug 19 '12 at 18:15
  • @RoyiNamir [let's continue in chat](http://chat.stackoverflow.com/rooms/15533/discussion-between-royi-namir-and-svick) – svick Aug 19 '12 at 19:42
  • @svick That article contradicts what the documentation for WithDegreeOfParallelism() says. Summary: "Sets the degree of parallelism to use in a query. Degree of parallelism is the maximum number of concurrently executing tasks that will be used to process the query." Parameters: source: "A ParallelQuery on which to set the limit on the degrees of parallelism." – C. Tewalt Oct 15 '14 at 06:20
  • @matrixugly I don't think that's a contradiction. The documentation specifies the *contract* of the method while the blog explains *implementation details*. For example mono's implementation should follow what the documentation says, but not necessarily what the blog says. – svick Oct 15 '14 at 08:39