My network includes 'torch.nn.MaxPool3d', which, according to the PyTorch docs (version 1.7: https://pytorch.org/docs/stable/generated/torch.set_deterministic.html#torch.set_deterministic), throws a RuntimeError when the cuDNN deterministic flag is on. However, when I inserted 'torch.backends.cudnn.deterministic=True' at the beginning of my code, no RuntimeError was raised. Why doesn't that line throw a RuntimeError? And does it guarantee that my training process is computed deterministically?

chungseok

1 Answer

torch.backends.cudnn.deterministic=True only affects cuDNN convolution operations, and nothing else. So no, it will not guarantee that your training process is deterministic, since you're also using torch.nn.MaxPool3d, whose backward function is nondeterministic on CUDA.

torch.set_deterministic(), on the other hand, affects all the normally-nondeterministic operations listed here (note that set_deterministic has been renamed to use_deterministic_algorithms in 1.8): https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html?highlight=use_deterministic#torch.use_deterministic_algorithms

As the documentation states, some of the listed operations don't have a deterministic implementation. So if torch.use_deterministic_algorithms(True) is set, they will throw an error.
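To see the difference between the two settings, here is a minimal sketch, assuming PyTorch 1.8 or later (where torch.use_deterministic_algorithms exists). The helper name `try_maxpool3d_backward` is made up for illustration. On CPU the backward pass should simply run; on a CUDA device the docs list MaxPool3d backward among the operations expected to raise, though the exact list can change between releases.

```python
import torch


def try_maxpool3d_backward(device: str) -> str:
    """Run MaxPool3d forward + backward; return "ok" or the RuntimeError text."""
    pool = torch.nn.MaxPool3d(kernel_size=2)
    x = torch.randn(1, 1, 4, 4, 4, device=device, requires_grad=True)
    try:
        pool(x).sum().backward()
    except RuntimeError as err:
        return f"RuntimeError: {err}"
    return "ok"


# Always test on CPU; add CUDA only if a GPU is actually available.
devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])

# 1) cuDNN flag only: it constrains cuDNN convolution algorithms and
#    does not make MaxPool3d raise.
torch.backends.cudnn.deterministic = True
for device in devices:
    print("cudnn.deterministic only:", device, try_maxpool3d_backward(device))

# 2) Global switch: operations without a deterministic implementation
#    are expected to raise a RuntimeError instead of running silently.
torch.use_deterministic_algorithms(True)
for device in devices:
    print("use_deterministic_algorithms:", device, try_maxpool3d_backward(device))

# Restore the default so later code is unaffected.
torch.use_deterministic_algorithms(False)
```

If the second loop reports a RuntimeError for "cuda" while the first reports "ok", that matches the behavior described in the question: the cuDNN flag alone never checks non-convolution operations.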

If you need to use an operation like torch.nn.MaxPool3d that has no deterministic implementation, then, at the moment, there is no way for your training process to be deterministic unless you write a custom deterministic implementation yourself. Alternatively, you could open a GitHub issue requesting a deterministic implementation: https://github.com/pytorch/pytorch/issues

In addition, you might want to check out this page: https://pytorch.org/docs/stable/notes/randomness.html
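As a rough illustration of the seeding side of that page, here is a small sketch. The function name `seed_everything` is my own, not a PyTorch API, and it only fixes the random number generators and the cuDNN settings; it does not make nondeterministic operations such as MaxPool3d backward on CUDA deterministic.

```python
import random

import torch


def seed_everything(seed: int) -> None:
    """Seed Python's and PyTorch's RNGs and pin cuDNN to deterministic choices."""
    random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU generator and the CUDA generators
    # Disable cuDNN autotuning, which may pick different algorithms per run.
    torch.backends.cudnn.benchmark = False
    # Restrict cuDNN to deterministic convolution algorithms.
    torch.backends.cudnn.deterministic = True


# Re-seeding with the same value should reproduce the same random draw.
seed_everything(0)
first = torch.rand(3)
seed_everything(0)
second = torch.rand(3)
print(torch.equal(first, second))
```

If you also use NumPy or DataLoader workers, those need their own seeding, as the linked page describes.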

Kurt Mohler
  • 1
    this is gold: `As the documentation states, some of the listed operations don't have a deterministic implementation. So if torch.use_deterministic_algorithms(True) is set, they will throw an error.` Thanks I was confused about the errors I saw at one point. – Charlie Parker Jul 30 '21 at 15:29
  • 1
    if you're looking for the first line of the answer in the docs: https://pytorch.org/docs/stable/backends.html#torch.backends.cudnn.deterministic "A bool that, if True, causes cuDNN to only use deterministic convolution algorithms." – user1881282 Jan 18 '22 at 09:18