I am trying to get some "embarrassingly parallel" code running, so I have just started to look into parallel processing. I am trying to use parLapply on a Linux machine (it works perfectly fine on my Windows machine, whereas mclapply would limit the code to Linux), but I am running into some problems.
This is what my code looks like:
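library(parallel)
cl <- makeCluster(detectCores(), type="FORK") # FORK -> PSOCK when I use Windows
clusterExport(cl, some.list.of.things) # character vector of object names
out <- parLapply(cl, some.input, some.fun) # some.input: placeholder for the list being processed
stopCluster(cl)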
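A minimal sketch of the same pattern that picks the cluster type from the operating system, so that one script can run on both machines. The squaring function, the input 1:4, and the choice of 2 workers are placeholders for illustration, not the real workload.

```r
library(parallel)

# FORK clusters are only available on Unix-alikes; fall back to PSOCK on Windows.
cluster_type <- if (.Platform$OS.type == "windows") "PSOCK" else "FORK"

# 2 workers is an arbitrary choice to keep the example small.
cl <- makeCluster(2, type = cluster_type)

# Make sure the cluster is shut down even if the computation fails.
out <- tryCatch(
  parLapply(cl, 1:4, function(x) x^2),
  finally = stopCluster(cl)
)

squares <- unlist(out)
```

With a FORK cluster the workers already see the parent's objects, so clusterExport is only needed for the PSOCK case.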
At first, I noticed that the parallel implementation is actually much slower than the sequential one. The reason seems to be that on my Linux machine, each child process inherits the CPU affinity of the parent, so they all end up sharing a single core. At least that is the conclusion I draw from the system monitor, where each of my R session processes had only about 8% CPU time and only one core was in use. See this really helpful thread here.
I ended up using the code from that thread, namely:
system(sprintf("taskset -p 0xffffffff %d", Sys.getpid()))
I should mention that I am not at all familiar with Linux basics. The machine is a university server run by other people, and I have no idea what the above code actually means or does, apart from changing "1" to "ff" (whatever "ff" stands for). Anyway, after executing it, I can see that 3 of my 8 child processes receive almost full CPU time, which is a big improvement.
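For reference, the argument to taskset -p is a hexadecimal bitmask in which each set bit allows one CPU, with the lowest bit standing for CPU 0. So 0x1 allows only the first CPU, 0xff the first 8, and 0xffffffff the first 32. A small sketch of how such a mask could be built for a given number of cores; affinity_mask is a hypothetical helper name, not part of the parallel package:

```r
# Hypothetical helper: build a hex mask with the lowest n_cores bits set.
affinity_mask <- function(n_cores) {
  stopifnot(n_cores >= 1)
  full_nibbles <- n_cores %/% 4   # each hex digit "f" covers 4 CPUs
  leftover <- n_cores %% 4        # remaining CPUs go into the leading digit
  lead <- if (leftover > 0) sprintf("%x", as.integer(2^leftover - 1)) else ""
  paste0("0x", lead, strrep("f", full_nibbles))
}

mask8 <- affinity_mask(8)
mask32 <- affinity_mask(32)

# Possible usage on Linux (not run here):
# system(sprintf("taskset -p %s %d", affinity_mask(parallel::detectCores()), Sys.getpid()))
```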
Having said that, there are 8 cores (determined by detectCores()) and 8 child processes (as seen in the system monitor), but "only" 3 child processes are working.
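One way to check this from inside R, rather than from the system monitor, might be to ask each worker for its process ID and, where the taskset command is available, for its current affinity. This is only a sketch: it assumes a 2-worker PSOCK cluster for illustration, and worker_affinity is a hypothetical helper.

```r
library(parallel)

# Hypothetical helper: reports the PID of the calling process and, if the
# taskset command is on the PATH, its current affinity as printed by taskset.
worker_affinity <- function() {
  pid <- Sys.getpid()
  has_taskset <- nzchar(Sys.which("taskset"))
  affinity <- if (has_taskset) {
    system(sprintf("taskset -p %d", pid), intern = TRUE)
  } else {
    NA_character_
  }
  list(pid = pid, affinity = affinity)
}

cl <- makeCluster(2, type = "PSOCK")

# Run the helper once on every worker, then shut the cluster down.
info <- tryCatch(clusterCall(cl, worker_affinity), finally = stopCluster(cl))

pids <- vapply(info, function(w) w$pid, integer(1))
```

If the workers turn out to be pinned to the same CPU, the taskset call shown earlier could presumably also be issued on each worker via clusterEvalQ, since it uses Sys.getpid() of whichever process runs it.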
Given that I am completely new to parallel processing, I was wondering if you could give me some guidance on how to get all 8 cores to be used. I feel like a blind person who doesn't know what he should be looking for to fix the situation. Any pointers to what I should change or what the problem might be would be highly appreciated!