Questions tagged [rparallel]

This tag refers to the `parallel` package, which is maintained by the R Core Team and provides support for parallel computation in R.
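
As a rough illustration of what the package offers, here is a minimal sketch that starts a small socket cluster and applies a function across its workers. The function `square` and the choice of 2 workers are assumptions made for this example, not something taken from the questions listed here.

```r
library(parallel)

# Hypothetical worker function for illustration: square a number.
square <- function(x) x^2

# Start a socket cluster; 2 workers is an arbitrary choice for the sketch.
cl <- makeCluster(2)

# Apply `square` to each element of 1:4, distributing the work across workers.
res <- parLapply(cl, 1:4, square)

# Release the worker processes once the computation is done.
stopCluster(cl)

print(unlist(res))
```

Socket clusters created with `makeCluster()` work on Windows as well as Unix-like systems, whereas the fork-based `mclapply()` only runs in parallel on Unix-like systems.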

72 questions
1
vote
1 answer

parallel data.table -- what's the correct syntax

Following up some data.table parallelism (1) (2) (3) I'm trying to figure it out. What's wrong with this syntax? library(data.table) set.seed(1234) dt <- data.table(id= factor(sample(1L:10000L, size= 1e6, replace= TRUE)), val= rnorm(n= 1e6),…
alexwhitworth
  • 4,839
  • 5
  • 32
  • 59
1
vote
1 answer

How to understand the master and worker processes---R package "parallel"?

While trying to understand the documentation of the R package parallel, I ran into this question when reading some lines of code on page 8 of the package's documentation. I have copied the code below. Please note that mc is just equal to…
Zhenning
  • 11
  • 1
1
vote
2 answers

unrelated nested foreach with an outer %dopar% and an inner %do%

I am running tasks locally in parallel using %dopar% from the foreach package, with the doSNOW package creating the cluster (running this on a Windows machine at the moment). I have done this many times before and it works fine until I place an…
crogg01
  • 2,446
  • 15
  • 35
0
votes
0 answers

How to run R commands in parallel with controlled memory usage?

I have a large object (~50GB) that I wish to break into roughly 10 non-disjoint subsets to perform analysis on concurrently. The issue is that I have had numerous bad experiences with parallelism in R whereby large memory objects from the…
shians
  • 955
  • 1
  • 6
  • 21
0
votes
0 answers

Why does mclapply cause memory spike and slow execution with renderUI?

I want to parallelize making plotOutputs and plots using multiple observers. When I use lapply or mclapply with mc.cores = 1 the plots are re-generated quickly, but when mc.cores is increased, the first set of plots is created upon the first click of…
matt
  • 318
  • 3
  • 11
0
votes
1 answer

How to use imported names from box modules inside parallel code?

Here is a minimal example showing the issue: mod.r: #' @export run_sqrt <- function (x) { sqrt(x) } mwe.r box::use( ./mod[...], parallel, dp = doParallel, foreach[foreach, `%dopar%`], ) cl <-…
Ferroao
  • 3,042
  • 28
  • 53
0
votes
1 answer

R: Compare each element with all the other elements below in a list using foreach, parallel and doParallel

Aim: I'm trying to compare each element in a list with all the other elements below it using Levenshtein distance from this package stringsim to find text that is similar. Obstacle: The problem is that due to the time and space complexity, it will…
0
votes
1 answer

R optimParallel uses huge amounts of RAM

On my (large) server (Windows with 255GB RAM) my optimParallel script is running out of memory and then crashes with Error in serialize(data, node$con) : error writing to connection. While I would understand if the data was huge and each node would…
Puki Luki
  • 573
  • 1
  • 4
  • 13
0
votes
1 answer

Tracking User Sessions across a website in R

I am looking for help with creating and tracking user sessions and activities within sessions using R. At a high level I have a column of user IDs and a column of timestamps. For each user ID I want to calculate the time difference between…
S.Markind
  • 27
  • 7
0
votes
1 answer

betareg not using multithreading on CentOS

Model Fitting Runs Single-Threaded on CentOS: I am fitting a mixture of Beta regressions model with the betamix function from the betareg package. I originally developed the code on Mac OS X, but am now running it at scale on an HPC…
merv
  • 67,214
  • 13
  • 180
  • 245
0
votes
1 answer

Unable to call create_cluster in multidplyr

I am able to load all the packages and see the number of cores available, but I am getting Error in create_cluster(4) : could not find function "create_cluster" library(multidplyr) library(dplyr) library(parallel) numCores…
James
  • 35
  • 5
0
votes
0 answers

Parallel package for windows 10 in R

I have a dataset that I'm trying to parse in R. The data is from HMDB and the dataset is named Serum Metabolites (an XML file). The XML file contains about 25K metabolite nodes, each of which I want to parse into sub-nodes. I have code that parses…
TaL
  • 173
  • 2
  • 15
0
votes
1 answer

R occupying virtual Memory completely

I rewrote my program many times to not hit any memory limits. It again takes up full VIRT which does not make any sense to me. I do not save any objects. I write to disk each time I am done with a calculation. The code (simplified) looks like …
kn1g
  • 358
  • 3
  • 16
0
votes
1 answer

parSapplyLB with missing arguments

Suppose fun is a function with 3 arguments (x, y, z) and y or z needs to be specified, but not both. fun <- function(x, y, z) { if (missing(y)) { x^2 } else { x^5 } } Now assume this function gets called within another function…
shani
  • 217
  • 1
  • 8
0
votes
1 answer

How to parallelise C++ code when using Rcpp?

I have an R script which compiles C++ code via sourceCpp("prog.cpp") and then calls the function go that is exported from prog.cpp. This C++ code then makes quite a few calls back to R and (after quite a long time) finally returns the…
user4385532