I am learning about parallel computing in R, and I ran into the following in my experiments.
Briefly, in the example below, why are most of the 'user' values in t smaller than those in mc_t? My machine has 32 GB of memory and 2 CPUs with 4 cores and 8 hyper-threads in total.
system.time({
  t = lapply(1:4, function(i) {
    m = matrix(1:10^6, ncol=100)
    t = system.time({
      m %*% t(m)
    })
    return(t)
  })
})
library(multicore)
system.time({
  mc_t = mclapply(1:4, function(m) {
    m = matrix(1:10^6, ncol=100)
    t = system.time({
      m %*% t(m)
    })
    return(t)
  }, mc.cores=4)
})
> t
[[1]]
   user  system elapsed
 11.136   0.548  11.703

[[2]]
   user  system elapsed
 11.533   0.548  12.098

[[3]]
   user  system elapsed
 11.665   0.432  12.115

[[4]]
   user  system elapsed
 11.580   0.512  12.115

> mc_t
[[1]]
   user  system elapsed
 16.677   0.496  17.199

[[2]]
   user  system elapsed
 16.741   0.428  17.198

[[3]]
   user  system elapsed
 16.653   0.520  17.198

[[4]]
   user  system elapsed
 11.056   0.444  11.520
And sessionInfo():
> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-pc-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] multicore_0.1-7
To clarify: Sorry that my description may have been ambiguous. I understand that the parallel version is still quicker for the job as a whole. However, the timer sits inside the function and wraps only the calculation, so the set-up overhead of each child process in mclapply is not included. So I am still confused about why this pure calculation step (i.e., m%*%t(m)) is slower.
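For anyone who wants to reproduce the comparison, the experiment can be condensed into something like the sketch below. It is only an approximation of my setup: it assumes the parallel package (bundled with R since 2.14, and providing its own mclapply) in place of multicore, uses a smaller matrix so that it finishes quickly, and time_product and as_table are hypothetical helper names that do not appear in my original code.

```r
library(parallel)  # assumption: stands in for the multicore package used above

# Hypothetical helper: time only the matrix product, nothing else.
# n defaults to 10^5 (a 1000 x 100 matrix) to keep the run short;
# the experiment above used 10^6.
time_product <- function(i, n = 10^5) {
  m <- matrix(1:n, ncol = 100)
  system.time(m %*% t(m))
}

# Hypothetical helper: one row per task, columns user.self, sys.self, elapsed.
as_table <- function(res) {
  t(sapply(res, function(x) x[c("user.self", "sys.self", "elapsed")]))
}

# mclapply forks, so more than one core is not available on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 4L

timings_serial   <- as_table(lapply(1:4, time_product))
timings_parallel <- as_table(mclapply(1:4, time_product, mc.cores = cores))

print(timings_serial)
print(timings_parallel)
```

Because system.time is called inside time_product, each row reflects only the time spent on the product within that task; the cost of forking the child processes falls outside the measurement.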