
I am trying to design a parallelization analysis. I want to run a given function for X iterations (on the order of thousands), and then repeat those iterations under each of S specific conditions (scenarios).

Let's say I have N nodes available, each with X processors.

With these constraints, would it be possible to design the best parallelization strategy/script? For example:

Thinking of the function as two nested loops:

    for (s in 1:8) {
      print(paste("scenario", s))
      for (i in 1:100) {
        print(paste("iteration", i))
      }
    }

I would like to distribute the scenarios across nodes and the iterations across cores. However, the parallelization tutorials I have found do not show how to do that. Is this possible?

Waldi
Agus camacho
    Some examples: [1](https://psu-psychology.github.io/r-bootcamp-2018/talks/parallel_r.html#multi-node_parallelism), [2](https://redmine.pik-potsdam.de/projects/r-tutorial/wiki/Parallelize_R_jobs_on_the_cluster), [3](https://stackoverflow.com/questions/47757035/single-r-script-on-multiple-nodes), [4](https://www.smart-stats.org/wiki/parallel-computing-cluster-using-r). Edit: And [the review paper by Dirk Eddelbuettel](https://arxiv.org/pdf/1912.11144.pdf) is pretty sweet too. – slamballais May 29 '21 at 10:07
    (Disclaimer: I'm the author) The **future** framework supports nested parallelism with a different parallel backend at each layer. There are many parallel backends to choose from. All parallelization is done at the R level, i.e. if you're after multi-_threading_, you need to turn to native code. See https://future.futureverse.org/articles/future-3-topologies.html for an example of nested parallelization. In your particular example, you'd use `plan(list(layer1=tweak(cluster, workers=nodes), layer2=multisession))` per Section 'An ad-hoc compute cluster'. – HenrikB May 30 '21 at 20:44
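
A minimal sketch of the nested plan suggested in the comment above, applied to the two loops from the question. It assumes the **future** and **future.apply** packages are installed. `run_iteration()` is a hypothetical placeholder for the real function, and the host names in the commented-out lines are made up. So that it runs on a single machine, both layers use `multisession` here; on a real cluster the outer layer would be swapped for `tweak(cluster, workers = nodes)` as the comment describes.

```r
library(future)
library(future.apply)

# Hypothetical stand-in for the real per-iteration work
run_iteration <- function(s, i) {
  s * 1000 + i
}

# On a real cluster (host names are assumptions, replace with your own):
# nodes <- c("node1", "node2")
# plan(list(tweak(cluster, workers = nodes), multisession))

# Local stand-in: 2 outer workers (scenarios) x 2 inner workers (iterations).
# I() marks the inner worker count as intentional, so future does not
# treat the nested layer as accidental over-parallelization.
plan(list(
  tweak(multisession, workers = 2),
  tweak(multisession, workers = I(2))
))

scenarios  <- 1:8
iterations <- 1:100

# Outer layer: one future per scenario.
# Inner layer: iterations of that scenario are spread over the inner workers.
results <- future_lapply(scenarios, function(s) {
  future_sapply(iterations, function(i) run_iteration(s, i))
})

# Shut down the workers and return to sequential evaluation
plan(sequential)
```

`results` is a list with one element per scenario, each holding the values for that scenario's iterations. If the real function draws random numbers, passing `future.seed = TRUE` to both calls is probably what you want for reproducible, statistically sound streams.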
