Questions tagged [mclapply]

mclapply is a parallelized version of lapply. It returns a list of the same length as X, each element of which is the result of applying FUN to the corresponding element of X.
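The description above can be illustrated with a minimal sketch. The worker function `square`, the input `X`, and the choice of two cores are assumptions made for this example only; on Windows, where forking is unavailable, `mc.cores` has to be 1.

```r
library(parallel)

# Hypothetical worker function, used only for illustration.
square <- function(x) x^2

X <- 1:8

# mclapply relies on forking, which is not available on Windows,
# so fall back to a single core there (assumed core count otherwise: 2).
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_result   <- lapply(X, square)
parallel_result <- mclapply(X, square, mc.cores = n_cores)

# The result is a list of the same length as X, in input order.
stopifnot(length(parallel_result) == length(X))
stopifnot(identical(serial_result, parallel_result))
```

For a task this small the forking overhead will likely outweigh any gain; the sketch only shows that the call shape and the returned list mirror lapply.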

136 questions
3
votes
2 answers

Increasing mc.cores beyond the number of logical cores

Playing around with the R function parallel::mclapply, I found that the argument mc.cores can be set higher than the number of logical cores (as reported by parallel::detectCores), resulting in a speedup greater than the number of logical cores. Here's a…
3
votes
1 answer

foreach very slow with large number of values

I'm trying to use foreach to do parallel computations. It works fine if there are a small number of values to iterate over, but at some point it becomes incredibly slow. Here's a simple…
Rob Richmond
  • 855
  • 6
  • 19
3
votes
0 answers

Environment when running function using mclapply

I have a function like the following: fxn <- function(X) { data <- replicate(10, rnorm(10000)) clusters <- kmeans(data, X) write.csv(clusters$cluster, paste0("kmeans", X, ".csv"))} I want to use mclapply to iterate it in parallel. list…
Jack Arnestad
  • 1,845
  • 13
  • 26
3
votes
2 answers

weird segfault in R when using mclapply in Linux

I have encountered this weird segfault error and have zero clue how to solve it. I was running some Markov Chain Monte Carlo algorithm (a sequential algorithm that approximates a distribution). I parallelize each single iteration of this algorithm.…
Bayesric
  • 329
  • 3
  • 13
3
votes
1 answer

R - get worker name when running in parallel

I am running a function in parallel. In order to get progress updates on the state of the work, I would like one but only one worker to report periodically on its progress. My natural thought for how to do this would be to have the function that…
Michael Ohlrogge
  • 10,559
  • 5
  • 48
  • 76
3
votes
1 answer

mclapply: all scheduled cores encountered errors in user code

The following is my code. I am trying to get the list of all the files (~20000) that end with .idat and read each file using the function illuminaio::readIDAT. library(illuminaio) library(parallel) library(data.table) # number of cores to use ncores =…
Komal Rathi
  • 4,164
  • 13
  • 60
  • 98
3
votes
1 answer

Fast ANOVA computation in R

I have a dataframe with the following dimensions: dim(b) [1] 974 433685 The columns represent variables that I want to run ANOVAs on (i.e., I want to run 433,685 ANOVAs). Sample size is 974. The last column is the 'group' variable. I've come…
Chad Johnson
  • 179
  • 1
  • 5
3
votes
2 answers

When using mclapply, each single core is slower than its unparallelized version

I am learning about parallel computing in R, and I found this happening in my experiments. Briefly, in the following example, why are most values of 'user' in t smaller than those in mc_t? My machine has 32GB memory, 2 CPUs with 4 cores and 8 hyper…
TomHall
  • 286
  • 3
  • 15
3
votes
2 answers

Unwanted bold-face while putting multiple ggplot charts in the same file

I don't know if you have seen an unwanted bold-face font like the one in the picture below: As you see, the third line is bold-faced, while the others are not. This happens to me when I try to use ggplot() with lapply() or especially mclapply(), to make the same…
Ali
  • 9,440
  • 12
  • 62
  • 92
3
votes
1 answer

R Checking for duplicates is painfully slow, even with mclapply

I've got some data involving repeated sales for a bunch of cars with unique Ids. A car can be sold more than once. Some of the Ids are erroneous, however, so I'm checking, for each Id, if the size is recorded as the same over multiple sales. If it…
N. McA.
  • 4,796
  • 4
  • 35
  • 60
3
votes
2 answers

Semi-global variable to mclapply

In a function, I need to run mclapply for each item in a list and it should also use a semi-global variable var.1. I don't want to add var.1 to every list-item as it would take too much memory. Here is code that illustrates the…
Chris
  • 2,256
  • 1
  • 19
  • 41
3
votes
0 answers

R/sqldf/mclapply, How can I use sqldf and mclapply together?

Hi, I am trying to use sqldf to fetch data from my database. Since sqldf always loads tcltk, I cannot use the mclapply function. How can I work around that? Thanks. Here is an example. options(gsubfn.engine =…
user1589
  • 151
  • 1
  • 5
3
votes
1 answer

What is the ideal format to store large results generated by R?

I simulate reasonably sized datasets (10-20 MB) through a large number of parameter combinations (20-40k). Each dataset x parameter set is pushed through mclapply and the result is a list where each item contains output data (as list item 1) and…
Maiasaura
  • 32,226
  • 27
  • 104
  • 108
2
votes
1 answer

Is there any simple task that allows me to understand whether my embarrassingly parallel program works fine?

I am employing an embarrassingly parallel routine using mclapply() of the parallel package in R in order to simulate independent paths of a stochastic process. I am surprised that I could not get any speed gain over the non-parallel program,…
Mr Frog
  • 296
  • 2
  • 16
2
votes
2 answers

Apply function in matrix elements of a list in R

I have a list of elements in R as follows: set.seed(123) A <- matrix(rnorm(20 * 20, mean = 0, sd = 1), 20, 20) B <- matrix(rnorm(20 * 20, mean = 0, sd = 1), 20, 20) C <- matrix(rnorm(20 * 20, mean = 0, sd = 1), 20, 20) D <- matrix(rnorm(20 * 20,…
nickolakis
  • 621
  • 3
  • 7