Questions tagged [doparallel]

doParallel is an R package that acts as a “parallel backend” for the foreach package. It provides the mechanism needed to execute foreach loops in parallel.
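
As a rough illustration of what the tag description means, the sketch below registers a doParallel backend and runs a small foreach loop on it. The worker count of 2 and the variable names `cl` and `squares` are arbitrary choices for this example, not recommendations.

```r
library(foreach)
library(doParallel)  # also attaches the 'parallel' package, which provides makeCluster()

# Start a small cluster; 2 workers is an arbitrary choice for illustration.
cl <- makeCluster(2)

# Tell foreach to use this cluster for %dopar% loops.
registerDoParallel(cl)

# Each iteration is sent to a worker; .combine = c collects the results into a vector.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

# Release the worker processes once the parallel work is finished.
stopCluster(cl)

print(squares)
```

Without a registered backend, `%dopar%` falls back to sequential execution with a warning, which is why the `registerDoParallel()` call comes before the loop.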

453 questions
6
votes
1 answer

What triggers "Ancestor must be an environment" error?

I am running a parallelized calculation using foreach to work on a lot of time series simultaneously. Among those calculations (within a function called compute_slope()) I do something like this: lBd <- floor(TMax^delta) # lower bound uBd <- …
AlbertRapp
  • 408
  • 2
  • 9
6
votes
1 answer

Parallel Computation for Create_Matrix 'RTextTools' package

I am creating a DocumentTermMatrix using create_matrix() from RTextTools and then creating a container and model based on it. This is for extremely large datasets. I do this for each category (factor level), so for each category it has to run matrix,…
6
votes
1 answer

Do I have to registerDoParallel() and stopCluster() every time I want to use foreach() in R?

I read that you have to call stopCluster() after running the parallel function foreach() in R. However, I can get away with calling registerDoParallel() and then running foreach() as many times as I want without ever using stopCluster(). So do I need stopCluster() or…
stavro
  • 103
  • 1
  • 5
6
votes
0 answers

Run several R functions in parallel

I have a dataset with a few numeric columns and over 100 million rows, stored as a data.table object. I would like to do group operations on some of the columns based on other columns. For example, count the unique elements of column "a" for each category in…
hm6
  • 340
  • 2
  • 13
6
votes
2 answers

doParallel (package) foreach does not work for big iterations in R

I'm running the following code (extracted from doParallel's vignettes) on a Linux PC with 4 physical and 8 logical cores. Running the code with iter=1e+6 or less, everything is fine and I can see from CPU usage that all…
989
  • 12,579
  • 5
  • 31
  • 53
6
votes
1 answer

RPostgreSQL connections are expired as soon as they are initiated with doParallel clusterEvalQ

I'm trying to set up a parallel task where each worker will need to make database queries. I'm trying to set up each worker with a connection as seen in this question but each time I try it returns for however…
Dean MacGregor
  • 11,847
  • 9
  • 34
  • 72
5
votes
0 answers

R: parallelized reading of xml-files with xml2, doParallel and foreach

Currently I'm working on a little R project to read some information out of Word files. Since those are zipped XML files under the hood, I thought that this task would be quite easy with R. My script basically works, but I wanted to increase its…
david
  • 51
  • 1
5
votes
3 answers

foreach loop becomes inactive for large iterations in R

I have an input csv file with 4500 rows. Each row has a unique ID and for each row, I have to read some data, do some calculation, and write the output in a csv file so that I have 4500 csv files written in my output directory. An individual output…
89_Simple
  • 3,393
  • 3
  • 39
  • 94
5
votes
1 answer

Why is this parallel computing code only using 1 CPU?

I am using the foreach and parallel libraries to perform parallel computation, but for some reason, while running, it only uses 1 CPU at a time (I checked using 'top' in Bash on a Linux terminal). The server has 48 cores, and I've tried: Using 24, 12…
Hart Radev
  • 361
  • 1
  • 10
5
votes
1 answer

Very high CPU usage by Windows Defender when using doParallel's foreach in R

I have a Threadripper 1950X based workstation with 16 cores and 32 threads and plenty of memory. Running 64-bit R 3.6.0 (patched) on Windows 10, I frequently run parallel code in R using the doParallel library and the foreach command, frequently…
bshor
  • 4,859
  • 8
  • 24
  • 33
5
votes
1 answer

foreach doparallel on GPU

I have this code for writing my results in parallel. I am using foreach and doParallel libraries in R. output_location='/home/Desktop/pp/' library(foreach) library(doParallel) library(data.table) no_cores <- detectCores() …
9113303
  • 852
  • 1
  • 16
  • 30
5
votes
2 answers

How to export many variables and functions from global environment to foreach loop?

How can I export the global environment for the beginning of each parallel simulation in foreach? The following code is part of a function that is called to run the simulations. num.cores <- detectCores()-1 cluztrr <- makeCluster(num.cores) …
5
votes
1 answer

How many cores is optimal in parallel processing?

Say I have an 8 core CPU. Using doParallel in R, when I register makeCluster(x), what is the ideal number of cores, x, to use? Is it as many cores as possible? Or would using 7 cores be slower than using 6 cores? Are there any rules around this?
milkmotel
  • 402
  • 4
  • 13
5
votes
3 answers

R: how to split dataframe in foreach %dopar%

This is a very simple example. df = c("already ","miss you","haters","she's cool") df = data.frame(df) library(doParallel) cl = makeCluster(4) registerDoParallel(cl) foreach(i = df[1:4,1], .combine = rbind, .packages='tm') %dopar%…
M.T.
  • 51
  • 1
  • 3
5
votes
1 answer

R doParallel foreach worker timeout error and never returns

The following question is a very detailed question related to the question described here. Previous Question Using Ubuntu Server 14.04 LTS 64-bit Amazon Machine Image launched on a c4.8xlarge (36 cores) with R version 3.2.3. Consider the following…