Questions tagged [doparallel]

R package that is a “parallel backend” for the foreach package. It provides a mechanism needed to execute foreach loops in parallel.
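
As a minimal sketch of what that mechanism looks like in practice (the worker count of 2 and the variable names `cl` and `squares` are arbitrary choices for illustration, not taken from any question below), a cluster is created, registered as the backend, and then used by `%dopar%`:

```r
library(doParallel)  # also attaches foreach, iterators, and parallel

# Start two worker processes; the count is an assumption for this sketch.
cl <- makeCluster(2)

# Register the cluster so that %dopar% dispatches iterations to it.
registerDoParallel(cl)

# Each iteration runs on a worker; .combine = c collects results into a vector.
squares <- foreach(i = 1:4, .combine = c) %dopar% {
  i^2
}

# Shut the workers down once the loop is finished.
stopCluster(cl)
```

Without a registered backend, `%dopar%` falls back to sequential execution and issues a warning, which is why the `registerDoParallel()` call matters.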

453 questions
2
votes
3 answers

What is the fastest way to perform an exhaustive search in R?

I am implementing a version of the Very Large Scale Relieff algorithm detailed here. Simply put, Very Large Scale Relieff splits the set of features N into several random subsets Ns, where Ns << N. Then it calculates the Relieff weights for the…
fednem
  • 95
  • 2
  • 13
2
votes
0 answers

How do you use more available cores when using doParallel to tune models in tidymodels?

I'm tuning some random forest models using ranger in tidymodels. I have a fairly large dataset with many columns. As a result, I set up a DigitalOcean droplet for tuning/training using instructions from Danny Foster's article: R on Digital Ocean.…
Mutuelinvestor
  • 3,384
  • 10
  • 44
  • 75
2
votes
0 answers

How to make shiny progress bar work when using foreach

I am using the foreach package inside my Shiny app to run operations in parallel. I don't know how to make the regular Shiny progress bar work. Below is a simple example that illustrates my…
Mark Perez
  • 177
  • 7
2
votes
0 answers

When to use OpenMP vs doParallel?

For a for loop in R that is parallelized with foreach and doParallel, is there any advantage to porting the code to Rcpp and using OpenMP to do the parallelization instead? Assuming that the code itself runs equally fast in R and C, are there any…
zdebruine
  • 3,687
  • 6
  • 31
  • 50
2
votes
1 answer

How to split a dataframe for parallel processing and then recombine the results?

I'm looking to split up a dataframe for parallel processing in order to speed up the processing time. What I have so far (broken code): library(tidyverse) library(iterators) library(doParallel) library(foreach) data_split <- split(iris,…
SCDCE
  • 1,603
  • 1
  • 15
  • 28
2
votes
0 answers

Redirect ggplot2 prints to a single PDF file inside parLapply

I would like to know if there is a way to get a single PDF file containing all the plots that are generated inside parLapply. I tried using the outfile option in the makeCluster function, but it gave me a PDF that I could not open. Regards, Juan
2
votes
1 answer

"object not found" in foreach loop

I am running vector autoregression models in R using the vars library, and I want to use the foreach function to run the models in parallel, but it yields the error: Error in { : task 1 failed - "object 'exogen.train' not found". The code runs fine…
T. J.
  • 90
  • 1
  • 6
2
votes
1 answer

What is foreach %dopar% actually doing when applied to a dataframe as in df[i,]?

I think I've completely misunderstood how foreach parallel operations work. In the following example, is foreach running 7 independent threads of foo(DF[i,]) for different values of i, which leapfrog each other to get the next available row?…
D3SL
  • 117
  • 8
2
votes
0 answers

foreach memory usage in R

I am trying the following code which includes a foreach loop to compute the normalized columns of a matrix A: library(doParallel) library(tictoc) A <- matrix(1.0, 5000, 1000) cl <- makeCluster(2) registerDoParallel(cl) gcinfo(TRUE) tic() res1 <-…
armando
  • 1,360
  • 2
  • 13
  • 30
2
votes
2 answers

How to get same results using loop and parallel in R?

I am testing the influence of training data on classification accuracy, using the iris data as an example. I noticed that I get the best accuracy at iteration 33. I would like to use the training set (iristrain) from that iteration for further analysis. I…
Rick_H
  • 77
  • 6
2
votes
1 answer

Processing Large Data Sets in R

I have a data set of ~5 million rows of businesses with contact information (ID (int), Email (text), BusinessPhone (text), WorkPhone (text), CellPhone (text)); over 3 million of these rows contain duplicate data. But the dupes aren't exact dupes - for…
2
votes
1 answer

Can I Use Only One RODBC Connection in Foreach using doParallel in R?

I know that I can open an SQL Server connection in each worker, but that opens multiple connections to the server at the same time. The database administrators at my work are saying that I am using too many system resources by having multiple…
James Marquez
  • 365
  • 4
  • 12
2
votes
1 answer

Nested maximisation in parallel with the need of using global variables in R

I have R code with two nested optimisations. There's an outer and an inner function. The outer function passes certain parameters along to the inner function, which performs an optimisation on another set of parameters. These parameters are then…
Andrew
  • 678
  • 2
  • 9
  • 19
2
votes
0 answers

BTYDplus: scheduled cores 1, 2, 8 did not deliver results, all values of the jobs will be affected

I'm using functions from the BTYDplus package. Sometimes, quite randomly, I encounter the error below: Error in mcmc.list(lapply(draws, function(draw) draw$level_1[[i]])) : Arguments must be mcmc objects. In addition: Warning message: In…
Afiq Johari
  • 1,372
  • 1
  • 15
  • 28
2
votes
1 answer

Why is using %dopar% with foreach causing R to not recognize a package?

I was trying to get my code to run in parallel in R by using the doParallel package with the foreach package. I am also using the sf package to manipulate shp files. I made sure all my code worked in the foreach loop just using %do% so if there was…
Will-i-am
  • 73
  • 6