
Possible Duplicate:
Parallel processing in R limited

I've written some code in R multicore, and I'm running it on a 24-core machine. In fact there are only 12 cores, but they are hyperthreaded, so it looks like there are 24.

Here's what's strange: all the threads run on the same single core! So each of them uses only a tiny share of the CPU, instead of each running on its own core and using up all the available cores.

For simplicity, I'm just running 4 threads:

mclapply(1:30, function(size) {
    # time-consuming, CPU-bound work (think "forecast.ets" et al.)
}, mc.cores = 4, mc.preschedule = FALSE)

Prior to running this, there is already an R process running on one core, using 100% of that core's capacity.

Next, I launch the "multicore process", and 4 extra threads fight for the same core!

... so they each get 12% of one core, or about 1% of the available processing power, when they should each be able to get 100% of one core. Also, the other R process now only gets 50% of the core.

OS is Ubuntu 12.04 64-bit. Hardware is Intel. R is version 2.15.2 "Trick or Treat".

Thoughts? (I know I could just use snowfall, but I have a ton of variables, and I really don't want to have to sfExport all of them!)

Edit: oh, I guess there's some global lock somewhere? But still, why would there be a conflict between two completely separate R processes? I can run two R processes in parallel just fine, with each taking 100% of a core's CPU.

Edit2: Thanks to Dirk's pointer, I rebuilt OpenBLAS, and it's looking much healthier now!

Hugh Perkins
  • Have you run "registerDC" before "doMC"? – Gong-Yi Liao Oct 29 '12 at 17:51
  • Haven't heard of either, so no, and I will go and find what those are now. – Hugh Perkins Oct 29 '12 at 17:51
  • Hmmm, I'm using the multicore package that comes with R, as the `parallel` package. There doesn't seem to be either of those two functions. Should I be better off downloading the raw `multicore` package instead of using `parallel`? – Hugh Perkins Oct 29 '12 at 17:52
  • No. Did you read the vignette that came with the package? – Gavin Simpson Oct 29 '12 at 17:56
  • No. I will go and look for a vignette now. I read the `multicore.pdf` file, and the `?multicore` output. – Hugh Perkins Oct 29 '12 at 17:58
  • `vignette("multicore")` gives "Vignette 'multicore' not found"; `vignette("parallel")` gives "Vignette 'parallel' not found". What command should I use to open the vignette? – Hugh Perkins Oct 29 '12 at 17:59
  • That should've worked (and works for me). Alternatively, Googling "R parallel package" gives it to me as well: http://stat.ethz.ch/R-manual/R-devel/library/parallel/doc/parallel.pdf – Josh O'Brien Oct 29 '12 at 18:25

1 Answer


One possible cause is a side effect of the OpenBLAS package, which sets CPU affinity so that processes stick to a single core. See "Parallel processing in R limited" for a discussion and a link to a further thread on the r-sig-hpc list, which has a fix.
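As a rough way to check whether an R session has been pinned like this, the sketch below inspects the process's CPU affinity mask. It assumes Linux and a version of R whose `parallel` package provides `mcaffinity()`; the helper name `affinity_report` is hypothetical and is not taken from the linked discussion. Widening the mask only works around the symptom and is not necessarily the fix described on the list.

```r
library(parallel)

# Hypothetical helper (not from the thread): summarise the CPU affinity of the
# current R process.
affinity_report <- function() {
  mask <- mcaffinity()           # NULL where affinity control is unsupported
  list(
    supported = !is.null(mask),
    allowed   = mask,            # CPUs this process is allowed to run on
    total     = detectCores()    # logical CPUs reported by the OS (may be NA)
  )
}

info <- affinity_report()
if (!info$supported) {
  message("CPU affinity cannot be queried on this platform")
} else if (!is.na(info$total) && length(info$allowed) < info$total) {
  message("Restricted to CPU(s): ", paste(info$allowed, collapse = ", "))
  # Assumption: the restriction is unwanted, so try to allow every CPU again.
  # try() keeps the script running if the OS refuses the request.
  try(mcaffinity(seq_len(info$total)), silent = TRUE)
} else {
  message("No affinity restriction detected")
}
```

Running this after loading whatever packages pull in the BLAS, and before calling `mclapply`, should show whether the mask has shrunk to a single CPU. Forked workers inherit the parent's mask, which would be consistent with all of them landing on one core.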

Dirk Eddelbuettel