
I am training some deep learning models using PyTorch, which also involves the use of NumPy. Since the randomisation is pseudo-random rather than truly random, why aren't the numbers (accuracy, etc.) the same across different runs?

I mean, even if I do not set a random seed, there should be some default seed according to which my code runs and gives the same results across different runs. Is there something more to it?


1 Answer


I don't think the truly-random vs. pseudo-random distinction is relevant here. There is no fixed default seed: when you don't set one, the generator is seeded from a varying source such as the current time or OS entropy, so each run starts from a different state and produces different numbers. That is why you should set a seed explicitly.
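As a minimal sketch of both behaviours with plain NumPy (the seed value 42 is an arbitrary choice):

```python
import numpy as np

# Without an explicit seed, the generator is seeded from a varying
# source (e.g. OS entropy), so this prints different numbers on
# every run.
print(np.random.rand(3))

# With a fixed seed, the same sequence is reproduced on every run.
np.random.seed(42)
print(np.random.rand(3))
```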

If you involve PyTorch and CUDA, things get a little more complicated. Here is an article talking about randomness and reproducibility.

In short, you need to set seeds for NumPy and PyTorch and also configure the backend to use deterministic operations.
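A sketch of what that could look like; the helper name `set_seed` and the seed value are my own choices, not a fixed recipe:

```python
import random
import numpy as np
import torch

def set_seed(seed: int = 0) -> None:
    # Seed Python's, NumPy's, and PyTorch's generators
    # (CPU and all CUDA devices).
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)

    # Force cuDNN to pick deterministic kernels and disable its
    # autotuner, which can otherwise select different algorithms
    # (and hence slightly different floating-point results) per run.
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(0)
```

On newer PyTorch versions you can additionally call `torch.use_deterministic_algorithms(True)`, which raises an error whenever an operation has no deterministic implementation. Even with all of this, a few CUDA operations remain non-deterministic, so bitwise-identical runs are not always achievable on the GPU.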

pietz
Thanks @pietz. I did not know that time and date affect the randomness. I have come across the article you linked; it gives me more clarity. So it's basically hard to make my algorithm exactly deterministic, but I can stop various non-deterministic elements from creeping in. Anyway, this is something to ponder in a different discussion, but I have no idea how non-determinism arises in a computer system. The weirdest source could be the temperature of the GPU, which could be a source of almost true randomness. Thanks anyway. – Megh Nov 23 '20 at 18:18