doParallel is an R package that acts as a “parallel backend” for the foreach package. It provides the mechanism needed to execute foreach loops in parallel.
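As a rough illustration of the description above, a minimal sketch of registering the backend and running a loop might look like the following. The worker count of 2 and the variable names `cl` and `squares` are arbitrary choices for this example, and it assumes the foreach and doParallel packages are installed.

```r
library(foreach)
library(doParallel)

# Start a small local cluster; 2 workers is an arbitrary choice for illustration.
cl <- makeCluster(2)

# Tell foreach to use this cluster as its parallel backend.
registerDoParallel(cl)

# Each iteration is sent to a worker; .combine = c collects the
# results into a single vector, in iteration order by default.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

# Release the worker processes once the parallel work is done.
stopCluster(cl)

print(squares)
```

Without a registered backend, `%dopar%` falls back to sequential execution with a warning, which is why the `registerDoParallel()` call comes before the loop.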
Questions tagged [doparallel]
453 questions
6
votes
1 answer
What triggers "Ancestor must be an environment" error?
I am running a parallelized calculation using foreach to work on many time series simultaneously. Among those calculations (within a function called compute_slope()), I do something like this:
lBd <- floor(TMax^delta) # lower bound
uBd <- …

AlbertRapp
- 408
- 2
- 9
6
votes
1 answer
Parallel Computation for Create_Matrix 'RTextTools' package
I am creating a DocumentTermMatrix using create_matrix() from RTextTools, and then creating a container and model based on it. This is for extremely large datasets.
I do this for each category (factor level), so for each category it has to run matrix,…

Prasanna Nandakumar
- 4,295
- 34
- 63
6
votes
1 answer
Do I have to registerDoParallel() and stopCluster() every time I want to use foreach() in R?
I read that you have to call stopCluster() after running the parallel function foreach() in R. However, I can get away with calling registerDoParallel() and then running foreach() as many times as I want without ever using stopCluster(). So do I need stopCluster() or…

stavro
- 103
- 1
- 5
6
votes
0 answers
Run several R functions in parallel
I have a dataset with a few numeric columns and over 100 million rows as a data.table object. I would like to do group operations on some of the columns based on other columns. For example, count the unique elements of column "a" for each category in…

hm6
- 340
- 2
- 13
6
votes
2 answers
doParallel (package) foreach does not work for big iterations in R
I'm running the following code (extracted from doParallel's vignettes) on a Linux PC with 4 physical and 8 logical cores.
Running the code with iter=1e+6 or less, everything is fine, and I can see from CPU usage that all…

989
- 12,579
- 5
- 31
- 53
6
votes
1 answer
RPostgreSQL connections are expired as soon as they are initiated with doParallel clusterEvalQ
I'm trying to set up a parallel task where each worker will need to make database queries. I'm trying to set up each worker with a connection as seen in this question, but each time I try it returns for however…

Dean MacGregor
- 11,847
- 9
- 34
- 72
5
votes
0 answers
R: parallelized reading of xml-files with xml2, doParallel and foreach
Currently I'm working on a small R project to read some information out of Word files. Since those are zipped XML files under the hood, I thought that this task would be quite easy with R. My script basically works, but I wanted to increase its…

david
- 51
- 1
5
votes
3 answers
foreach loop becomes inactive for large iterations in R
I have an input CSV file with 4500 rows. Each row has a unique ID, and for each row I have to read some data, do some calculations, and write the output to a CSV file, so that I end up with 4500 CSV files in my output directory. An individual output…

89_Simple
- 3,393
- 3
- 39
- 94
5
votes
1 answer
Why is this parallel computing code only using 1 CPU?
I am using the foreach and parallel libraries to perform parallel computation, but for some reason it only uses 1 CPU at a time while running (I checked using 'top' in Bash on a Linux terminal).
The server has 48 cores, and I've tried:
Using 24, 12…

Hart Radev
- 361
- 1
- 10
5
votes
1 answer
Very high CPU usage by Windows Defender when using doParallel's foreach in R
I have a Threadripper 1950X based workstation with 16 cores and 32 threads and plenty of memory. Running 64-bit R 3.6.0 (patched) on Windows 10, I frequently run parallel code in R using the doParallel library and the foreach command, frequently…

bshor
- 4,859
- 8
- 24
- 33
5
votes
1 answer
foreach doparallel on GPU
I have this code for writing my results in parallel. I am using foreach and doParallel libraries in R.
output_location='/home/Desktop/pp/'
library(foreach)
library(doParallel)
library(data.table)
no_cores <- detectCores()
…

9113303
- 852
- 1
- 16
- 30
5
votes
2 answers
How to export many variables and functions from global environment to foreach loop?
How can I export the global environment at the beginning of each parallel simulation in foreach? The following code is part of a function that is called to run the simulations.
num.cores <- detectCores()-1
cluztrr <- makeCluster(num.cores)
…

Actuary_Greg
- 63
- 1
- 6
5
votes
1 answer
How many cores is optimal in parallel processing?
Say I have an 8-core CPU. Using doParallel in R, when I register makeCluster(x), what is the ideal number of cores, x, to use?
Is it as many cores as possible? Or would using 7 cores be slower than using 6 cores? Are there any rules around this?

milkmotel
- 402
- 4
- 13
5
votes
3 answers
R: how to split dataframe in foreach %dopar%
This is a very simple example.
df = c("already ","miss you","haters","she's cool")
df = data.frame(df)
library(doParallel)
cl = makeCluster(4)
registerDoParallel(cl)
foreach(i = df[1:4,1], .combine = rbind, .packages='tm') %dopar%…

M.T.
- 51
- 1
- 3
5
votes
1 answer
R doParallel foreach worker timeout error and never returns
This is a very detailed follow-up to the question described here: Previous Question
Using Ubuntu Server 14.04 LTS 64-bit Amazon Machine Image launched on a c4.8xlarge (36 cores) with R version 3.2.3.
Consider the following…

user1325068
- 51
- 4