I'm running an MCMC algorithm, and Microsoft R Open on Windows 7 has sped it up a lot. Now I need to run a large number of simulations with my algorithm, so I used the R snow package to parallelize my code. However, it doesn't work.

To be specific, Microsoft R Open on my PC uses 4 cores for computation, while there are 8 cores in total. So I planned to run 2 processes in parallel on my PC, since each needs 4 cores for the MKL library. But the processes don't actually run in parallel. I made all 8 cores available when setting up the parallel run. My test program needs 5 minutes to run. When I run it in parallel with a copy of itself, I expect the 2 processes to take 5 minutes as well. In fact they took 10 minutes, just as if the 2 processes had run sequentially.

The same thing happens if I open two R sessions and run the program in each of them. Normally it needs only 5 minutes, but now each of them takes 10 minutes.

So where am I going wrong? Is the problem the two layers of parallelism, one at my level and the other at the Intel MKL level?

Jiang Du
  • Don't get confused by physical and logical cores. The logical ones, as counted via `library(parallel);detectCores()`, include hyperthreading. But Microsoft R Open at startup is reporting physical cores. For instance, MRO says on my machine `Multithreaded BLAS/LAPACK libraries detected. Using 2 cores for math algorithms.`. But `detectCores()` is reporting `4`. – cryo111 Sep 12 '16 at 18:00
  • @cryo111 So I think in my case there are 4 physical cores on my PC. I just tried setMKLthreads(1) to limit MKL to 1 physical core, but it still doesn't help. Actually, setMKLthreads(1) is only 1 minute slower than setMKLthreads(4), which is OK in my case. If 1-core MKL worked correctly when I use all 4 cores for parallelization, I could also get my work done. But it didn't work; I got the same result as above. – Jiang Du Sep 12 '16 at 18:05
  • It would be easier if you provided some code with a reproducible example. Instead of the time-intensive computation, you could use `Sys.sleep(20)` or so... – cryo111 Sep 12 '16 at 18:19
  • Sadly, I can't reproduce the problem with test code. The parallelization works perfectly for the test code.... Could some C++ code I used be the problem? I wrote an R package using Rcpp, compiled it (using 4 cores when compiling), and loaded the package on every cluster node. – Jiang Du Sep 12 '16 at 18:38
  • Writing an Rcpp package and uploading to each cluster node is the right way to do it. I assume you did not use the C++ threads library in your C++ code, right? Otherwise, this might interfere with R. I would add a simple C++ function to your package and see whether it works then. Then from this simple function I would work my way up to the full C++ routine that you actually want to implement. Somewhere on this way, there might be the issue. – cryo111 Sep 12 '16 at 18:46
  • @cryo111 I just realized that setMKLthreads(1) can make MKL use only 1 core, so I can do the parallelization on my own. However, it seems that command only works for R, not for the Rcpp code I've written. I can't figure out how to add a statement to my cpp code that restricts it to one core.... – Jiang Du Sep 13 '16 at 22:58
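
The core counting discussed in these comments can be sketched as simple arithmetic. This is only an illustration of one possible explanation (oversubscription), not a diagnosis of the actual code: the helper names below are hypothetical, and the figures (4 physical cores, 4 MKL threads per process, 2 processes) are taken from the thread.

```cpp
#include <cassert>

// Hypothetical helper: total compute threads requested when `workers`
// processes each let the math library use `threads_per_worker` threads.
int total_threads(int workers, int threads_per_worker) {
    assert(workers >= 0 && threads_per_worker >= 0);
    return workers * threads_per_worker;
}

// Hypothetical helper: true when the processes together ask for more
// compute threads than there are physical cores.
bool oversubscribed(int physical_cores, int workers, int threads_per_worker) {
    assert(physical_cores > 0);
    return total_threads(workers, threads_per_worker) > physical_cores;
}

// Hypothetical helper: largest number of worker processes that fits on the
// physical cores without oversubscription (integer division rounds down).
int max_workers(int physical_cores, int threads_per_worker) {
    assert(physical_cores > 0 && threads_per_worker > 0);
    return physical_cores / threads_per_worker;
}
```

With 4 physical cores and 4 threads per process, `max_workers(4, 4)` leaves room for a single process, and two such processes request 8 threads on 4 cores, which would be consistent with the doubled run time. With 1 thread per process, `max_workers(4, 1)` allows 4 processes, but only if every layer, including the compiled C++ code, really honors the 1-thread limit.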

1 Answer

There are way too many factors at play here to figure it out without knowing certain details about your code. For example, what is the affinity mask in effect for each process? What is the thread ideal processor for threads in concurrent processes? It is possible that your processes are competing for the same cores. You can find more details by looking at the SetThreadIdealProcessor and SetProcessAffinityMask APIs. It is also possible that your code is using a shared resource protected by a critical section or other synchronization object. I would start by downloading Process Explorer from Sysinternals and looking at the thread list for each process. This will tell you how many threads are running and how many context switches each thread incurs. That will give you something to start with.
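
To make the affinity-mask idea concrete, here is a minimal, portable sketch of the bit arithmetic only. It does not call the Windows API. An affinity mask is a bit vector in which each bit stands for one logical processor, and a value like the ones computed here is what would be passed to SetProcessAffinityMask. The helper names are hypothetical, and the layout (each process gets a consecutive block of logical processors) is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mask selecting `cores_per_process` consecutive
// logical processors for the process with 0-based number `index`.
std::uint64_t affinity_mask(unsigned index, unsigned cores_per_process) {
    assert(cores_per_process >= 1 && cores_per_process <= 64);
    assert((index + 1) * cores_per_process <= 64);
    // Build a block of `cores_per_process` low bits; a full 64-bit shift
    // would be undefined behavior, so that case is handled separately.
    std::uint64_t block = (cores_per_process == 64)
                              ? ~std::uint64_t{0}
                              : ((std::uint64_t{1} << cores_per_process) - 1);
    // Move the block to the slot that belongs to this process.
    return block << (index * cores_per_process);
}

// Two processes can compete for the same logical processors only if
// their masks share at least one set bit.
bool masks_overlap(std::uint64_t a, std::uint64_t b) {
    return (a & b) != 0;
}
```

For two processes with 4 logical processors each, `affinity_mask(0, 4)` is `0x0F` and `affinity_mask(1, 4)` is `0xF0`, so the masks do not overlap. Note that with hyperthreading, adjacent logical processors may share one physical core, so disjoint masks alone do not guarantee disjoint physical cores.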

Victor Havin
  • Thanks for your reply. You've mentioned a lot of things I've never heard of before. I will look at Process Explorer to see if I can find any clues. – Jiang Du Sep 12 '16 at 17:57