I am using the foreach and doParallel packages on Windows, but CPU usage stays below 10% while the foreach loop runs. Here is the code, reduced to a small example:
library(doParallel)
library(foreach)
library(dplyr)
library(Matrix)

# Register a parallel backend with all but one core
cl <- detectCores() - 1
registerDoParallel(cl)

# Example data: 2000 random edges between 1.3 million IDs, split into 10 groups
n_max <- 1300000
df <- data.frame(fromID = sample(c(1:1300000), 2000, replace = TRUE),
                 toID   = sample(c(1:1300000), 2000, replace = TRUE),
                 group  = sample(c(1:10), 2000, replace = TRUE))

# Build one sparse matrix per group in parallel
As <- foreach(i = 1:10, .packages = c('dplyr', 'Matrix')) %dopar% {
  databygroup <- filter(df, group == i)
  sparseMatrix(i = databygroup$fromID, j = databygroup$toID, x = 1,
               dims = c(n_max, n_max))
}

stopImplicitCluster()
Before running the foreach loop, I checked how many workers were registered:
> cat(sprintf('%s backend is registered\n',
+ if(getDoParRegistered()) 'A' else 'No'))
A backend is registered
> cat(sprintf('Running with %d worker(s)\n', getDoParWorkers()))
Running with 35 worker(s)
> (name <- getDoParName())
[1] "doParallelSNOW"
> (ver <- getDoParVersion())
[1] "1.0.11"
> if (getDoParRegistered())
+ cat(sprintf('Currently using %s [%s]\n', name, ver))
Currently using doParallelSNOW [1.0.11]
I receive the following warning for several connections:
"In if (.Internal(exists(package, .Internal(getNamespaceRegistry()), ... : closing unused connection 70..."
After calling stopImplicitCluster(), the reported number of workers is unchanged, so I am not able to shut the workers down.
stopCluster(cl) doesn't work either:
> cat(sprintf('Running with %d worker(s)\n', getDoParWorkers()))
Running with 2 worker(s)
> (name <- getDoParName())
[1] "doParallelSNOW"
> (ver <- getDoParVersion())
[1] "1.0.11"
> if (getDoParRegistered())
+ cat(sprintf('Currently using %s [%s]\n', name, ver))
Currently using doParallelSNOW [1.0.11]
> stopCluster(cl)
> cat(sprintf('Running with %d worker(s)\n', getDoParWorkers()))
Running with 2 worker(s)
> stopCluster(cl)
Error in summary.connection(connection) : invalid connection
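For comparison, here is a minimal sketch of the full lifecycle with an explicit cluster object. It assumes 2 workers and a trivial loop body, both chosen only to keep the example small. As far as I can tell, getDoParWorkers() reports the size of whichever backend is currently registered, so it keeps returning the old count after stopCluster() until another backend (here the sequential one, via registerDoSEQ()) is registered.

```r
library(doParallel)  # also attaches foreach and parallel

# Assumption: 2 workers, chosen only to keep the example small
cl <- makeCluster(2)
registerDoParallel(cl)
workers_during <- getDoParWorkers()

# Trivial loop body, used only to confirm that the backend runs
squares <- foreach(i = 1:4, .combine = c) %dopar% i^2

# Shut the workers down, then replace the stale registration
stopCluster(cl)
registerDoSEQ()
workers_after <- getDoParWorkers()
```

With this pattern, cl is a cluster object rather than a core count, so stopCluster(cl) has real connections to close, and it should be called only once.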
I don't know why the parallelization is not working.
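One way to narrow this down is to time the same loop with %do% and %dopar%. The sketch below is a scaled-down variant of my example, not the original code: the helper build(), the 2 workers, the smaller n_max, and the fixed seed are all assumptions made so that it runs quickly. If the two timings are close, or the parallel one is slower, the tasks may simply be too small for the workers to show up as CPU load.

```r
library(doParallel)  # also attaches foreach and parallel
library(Matrix)

cl <- makeCluster(2)  # assumption: 2 workers
registerDoParallel(cl)

# Assumption: much smaller dimensions than the original example
n_max <- 1000
set.seed(1)
df <- data.frame(fromID = sample(n_max, 200, replace = TRUE),
                 toID   = sample(n_max, 200, replace = TRUE),
                 group  = sample(10, 200, replace = TRUE))

# Hypothetical helper: build the sparse matrix for one group.
# The data and size are passed as arguments so that foreach exports them.
build <- function(i, data, n) {
  d <- data[data$group == i, ]
  sparseMatrix(i = d$fromID, j = d$toID, x = 1, dims = c(n, n))
}

t_seq <- system.time(
  seq_res <- foreach(i = 1:10) %do% build(i, df, n_max)
)
t_par <- system.time(
  par_res <- foreach(i = 1:10, .packages = "Matrix") %dopar% build(i, df, n_max)
)

stopCluster(cl)
registerDoSEQ()

# Elapsed seconds for each variant
print(c(sequential = t_seq[["elapsed"]], parallel = t_par[["elapsed"]]))
```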
Thank you for your time