
I am using R for reproducible scientific machine learning and hyperparameter optimization. I found that alternative BLAS implementations such as OpenBLAS, ATLAS, or MKL can speed up this costly optimization. However, the results differ slightly with each BLAS: even when the optimization is forced onto a single thread, the results deviate from those of R's default BLAS.
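The deviation described above is consistent with floating-point addition not being associative: an optimized BLAS may block or reorder the sums inside a dot product, so the same inputs can round differently even on one thread. A minimal base-R sketch with made-up values (not taken from the actual experiment):

```r
# Illustrative values only, chosen so that rounding is visible in double precision.
big <- 1e16

# The same four terms, added in two different orders.
order_a <- ((big + 1) - big) + 1   # big + 1 rounds back to big, so the first 1 is lost
order_b <- ((big - big) + 1) + 1   # the cancellation happens first, so nothing is lost

print(c(order_a = order_a, order_b = order_b))
print(identical(order_a, order_b))
```

Real BLAS differences are usually far smaller (last-bit level), but they arise from the same mechanism.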

So I want to try using Docker to contain the experiment. I have multiple questions.

  1. Is it better to compile from source than to use binaries?

  2. If I compile from source, will I get the same configuration as the Debian binaries?

  3. Since the results differ for each BLAS, is it a good idea to use ReproBLAS, a tool from Berkeley, with R?

  4. When R is compiled with `--with-blas=-lopenblas`, is OpenBLAS single-threaded or multithreaded?
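For question 4, the `--with-blas=-lopenblas` flag only selects which library R links against; whether that library runs threaded depends on how the OpenBLAS build itself was configured. One way to check from inside a running session is sketched below. It assumes R >= 3.4.0 (where `sessionInfo()` reports the BLAS and LAPACK paths) and, optionally, the CRAN package RhpcBLASctl; neither is mentioned in the question.

```r
# Report which BLAS/LAPACK shared libraries this R session actually loaded.
si <- sessionInfo()
print(si$BLAS)
print(si$LAPACK)

# Optional: query and limit BLAS threads, if RhpcBLASctl happens to be installed.
if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  print(RhpcBLASctl::blas_get_num_procs())  # threads the loaded BLAS may use
  RhpcBLASctl::blas_set_num_threads(1)      # request single-threaded BLAS
  print(RhpcBLASctl::blas_get_num_procs())
}
```

Recording this output inside the Docker image alongside the results documents exactly which BLAS produced them. OpenBLAS also reads the `OPENBLAS_NUM_THREADS` environment variable, which has to be set before R starts.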

conradkleinespel
  • Do you really need results to be completely reproducible down to `x` significant digits? If your results can be reproduced to a reasonable level of precision and the conclusions are the same, I would think that would be good enough in a lot of cases. Then as long as you can explain the source of any differences (which I assume are fairly minor), people should be OK with that. – Marius Jul 06 '17 at 06:49
  • The problem is not reproducibility to x digits as such, but that those digits determine which hyperparameter values get selected. For example, with random forest, OpenBLAS leads to 6 and 975, whereas the reference (basic) BLAS leads to 6 and 1000, for interaction depth and number of trees respectively. – aminevsaziz Jul 06 '17 at 06:56
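The effect described in the last comment, where a tiny numerical difference flips the selected hyperparameters, can be reproduced in miniature. In the sketch below the fold errors and the names `ntree_975` and `ntree_1000` are hypothetical; only the two tree counts come from the comment. Both candidates have the same error mathematically, but accumulating it in a different order changes the last bit and therefore the winner.

```r
# Hypothetical per-fold errors; both candidates sum the same three numbers.
folds <- c(0.1, 0.2, 0.3)
err_975  <- (folds[1] + folds[2]) + folds[3]  # rounds to slightly above 0.6
err_1000 <- (folds[3] + folds[2]) + folds[1]  # rounds to 0.6

errs <- c(ntree_975 = err_975, ntree_1000 = err_1000)
print(names(which.min(errs)))  # the strict minimum is decided by a last-bit difference

# One possible mitigation (an assumption, not something the thread proposes):
# treat errors within a tolerance as ties and break ties by a fixed rule,
# here the first candidate in the grid.
tol <- 1e-8
best <- names(errs)[which(errs <= min(errs) + tol)[1]]
print(best)
```

With a strict `which.min`, the choice depends on the summation order; with the tolerance rule, both orders select the same candidate.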

0 Answers