how to best design a multinode/multicore computing process

Question

I am trying to design a parallelized analysis. With it, I wish to run a given function for X iterations (on the order of thousands), and I want to repeat those iterations under S specific conditions (scenarios).

Let's say I have N nodes available, each with X processors.

With these constraints, would it be possible to design the best parallelization strategy/script? For example:

Thinking of the function as two nested loops:

    for (s in 1:8) {
      print(paste("scenario", s))
      for (i in 1:100) {
        print(paste("iteration", i))
      }
    }

I would like to distribute the scenarios across nodes and the iterations within each scenario across cores. However, the parallelization tutorials I have found do not show how to do that. Would this be possible?

Some examples: [1](https://psu-psychology.github.io/r-bootcamp-2018/talks/parallel_r.html#multi-node_parallelism), [2](https://redmine.pik-potsdam.de/projects/r-tutorial/wiki/Parallelize_R_jobs_on_the_cluster), [3](https://stackoverflow.com/questions/47757035/single-r-script-on-multiple-nodes), [4](https://www.smart-stats.org/wiki/parallel-computing-cluster-using-r). Edit: And [the review paper by Dirk Eddelbuettel](https://arxiv.org/pdf/1912.11144.pdf) is pretty sweet too. — slamballais, May 29 '21 at 10:07
(Disclaimer: I'm the author) The **future** framework supports nested parallelism with a different parallel backend at each layer. There are many parallel backends to choose from. All parallelization is done at the R level, i.e. if you're after multi-_threading_, you need to turn to native code. See https://future.futureverse.org/articles/future-3-topologies.html for an example of nested parallelization. In your particular example, you'd use `plan(list(layer1=tweak(cluster, workers=nodes), layer2=multisession))` per Section 'An ad-hoc compute cluster'. — HenrikB, May 30 '21 at 20:44
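
The nested setup described in the comment above could look roughly like the sketch below. It assumes the **future** and **future.apply** packages are installed. `run_iteration()` and `run_scenarios()` are hypothetical names standing in for the real work, and the hostnames in the commented-out plan are placeholders. So that the sketch runs on a single machine, both layers use `multisession` here. The `I(2)` wrapper on the inner layer is, as far as I understand recent versions of **future**, the way to request an explicit worker count inside a nested layer, so treat that detail as an assumption to check against the topologies article.

```r
library(future)
library(future.apply)

# Hypothetical stand-in for the real per-iteration computation.
run_iteration <- function(s, i) s * i

# Outer future_lapply(): one task per scenario (first layer of the plan).
# Inner future_lapply(): the iterations of that scenario (second layer).
run_scenarios <- function(scenarios, iterations) {
  future_lapply(scenarios, function(s) {
    inner <- future_lapply(iterations, function(i) run_iteration(s, i))
    sum(unlist(inner))
  })
}

# Local stand-in topology: 2 outer workers, each starting 2 inner workers.
plan(list(
  tweak(multisession, workers = 2),
  tweak(multisession, workers = I(2))
))

# On a real cluster, the plan from the comment would replace the one above:
# nodes <- c("n1", "n2")  # placeholder hostnames
# plan(list(tweak(cluster, workers = nodes), multisession))

totals <- run_scenarios(1:8, 1:100)

# Shut down the background workers once done.
plan(sequential)
```

Only the `plan()` call changes between the laptop and the cluster; the two nested `future_lapply()` calls stay the same. If the iterations draw random numbers, passing `future.seed = TRUE` to `future_lapply()` is the documented way to get parallel-safe random streams.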
