Questions tagged [parallel-foreach]

This is a library in R that allows for easy parallel processing.

212 questions
68
votes
1 answer

run a for loop in parallel in R

I have a for loop that is something like this: for (i=1:150000) { tempMatrix = {} tempMatrix = functionThatDoesSomething() #calling a function finalMatrix = cbind(finalMatrix, tempMatrix) } Could you tell me how to make this parallel ? I…
kay
  • 1,851
  • 3
  • 13
  • 14
38
votes
2 answers

"un-register" a doParallel cluster

If I run foreach... %dopar% without registering a cluster, foreach raises a warning, and executes the code sequentially: library("doParallel") foreach(i=1:3) %dopar% sqrt(i) Yields: Warning message: executing %dopar% sequentially: no parallel…
Zach
  • 29,791
  • 35
  • 142
  • 201
14
votes
3 answers

How can I speed up the training of my random forest?

I'm trying to train several random forests (for regression) to have them compete and see which feature selection and which parameters give the best model. However the trainings seem to take an insane amount of time, and I'm wondering if I'm doing…
13
votes
1 answer

Why is R for loop 10 times slower than when using foreach?

This is really blowing my mind. The basic loop takes like 8 seconds on my computer: system.time({ x <- 0 for (p in 1:2) { for (i in 1:500) { for (j in 1:5000) { x <- x + i * j } } } }) x Whereas if I use foreach…
Tomas
  • 57,621
  • 49
  • 238
  • 373
12
votes
3 answers

Is it possible to get a progress bar with foreach and a "multicore-kind" of backend

While using "multicore" parallelism using foreach and the doMC backend (I use doMC as at the time I looked into it other package did not allow logging from the I would like to get a progress bar, using the progress package, but any progress (that…
statquant
  • 13,672
  • 21
  • 91
  • 162
9
votes
2 answers

What is the best practice for making functions in my R package parallelizable?

I have developed an R package that contains embarrassingly parallel functions. I would like to implement parallelization for these functions in a way that is transparent to the user, regardless of his/her OS (at least ideally). I have looked around…
C8H10N4O2
  • 18,312
  • 8
  • 98
  • 134
9
votes
1 answer

Understanding parallel TSQL connections

I managed to create parallel connections in R to a TSQL server using the below code: SQL_retrieve <- function(x){ con <- odbcDriverConnect( 'driver={SQL…
Evan Larson
  • 181
  • 1
  • 10
9
votes
1 answer

Parallelization doesn't work with the foreach package

Using the foreach package, I was expecting the following line to run in about 10 seconds system.time(foreach (i=1:5, .combine='c') %do% {Sys.sleep(2);i}) user system elapsed 0.053 0.011 10.012 and the following line to run in about 2…
Remi.b
  • 17,389
  • 28
  • 87
  • 168
8
votes
2 answers

R foreach: from single-machine to cluster

The following (simplified) script works fine on the master node of a unix cluster (4 virtual cores). library(foreach) library(doParallel) nc = detectCores() cl = makeCluster(nc) registerDoParallel(cl) foreach(i = 1:nrow(data_frame_1), .packages =…
Antoine
  • 1,649
  • 4
  • 23
  • 50
7
votes
1 answer

How to switch programmatically between %do% and %dopar% in foreach?

By changing %dopar% to %do% when using foreach, I can run the code sequentially. How can I do this programmatically? E.g. I want the following but with only ONE foreach statement: library(doParallel) library(foreach) registerDoParallel(cores =…
katsumi
  • 154
  • 8
6
votes
2 answers

doParallel (package) foreach does not work for big iterations in R

I'm running the following code (extracted from doParallel's vignettes) on a PC (OS Linux) with 4 and 8 physical and logical cores, respectively. Running the code with iter=1e+6 or less, everything is fine and I can see from CPU usage that all…
989
  • 12,579
  • 5
  • 31
  • 53
6
votes
2 answers

R parallel: rbind parallely into separate data.frames

The code below produces different results on Windows and Ubuntu platforms. I understand it is because of the different methods of handling parallel processing. Summarizing: I cannot insert / rbind data in parallel on Linux (mclapply, mcmapply) while…
jangorecki
  • 16,384
  • 4
  • 79
  • 160
6
votes
3 answers

parallel k-means in R

I am trying to understand how to parallelize some of my code using R. So, in the following example I want to use k-means to cluster data using 2,3,4,5,6 centers, while using 20 iterations. Here is the code:…
hema
  • 725
  • 1
  • 8
  • 20
5
votes
1 answer

Why is this parallel computing code only using 1 CPU?

I am using the foreach and parallel libraries to perform parallel computation, but for some reason, while running, it only uses 1 CPU at a time (I checked using 'top' in Bash on a Linux terminal). The server has 48 cores, and I've tried: Using 24, 12…
Hart Radev
  • 361
  • 1
  • 10
5
votes
1 answer

Very high CPU usage by Windows Defender when using doParallel's foreach in R

I have a Threadripper 1950X based workstation with 16 cores and 32 threads and plenty of memory. Running 64-bit R 3.6.0 (patched) on Windows 10, I frequently run parallel code in R using the doParallel library and the foreach command, frequently…
bshor
  • 4,859
  • 8
  • 24
  • 33