My network includes 'torch.nn.MaxPool3d', which throws a RuntimeError when the cudnn deterministic flag is on, according to the PyTorch docs (version 1.7 - https://pytorch.org/docs/stable/generated/torch.set_deterministic.html#torch.set_deterministic). However, when I inserted the code 'torch.backends.cudnn.deterministic=True' at the beginning of my code, there was no RuntimeError. Why doesn't that code throw a RuntimeError? And does it guarantee that my training process is computed deterministically?
1 Answer
`torch.backends.cudnn.deterministic=True` only applies to CUDA convolution operations, and nothing else. Therefore, no, it will not guarantee that your training process is deterministic, since you're also using `torch.nn.MaxPool3d`, whose backward function is nondeterministic for CUDA.
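As a minimal sketch of why no error appeared in the question, the snippet below sets only the cuDNN flag and runs a `MaxPool3d` forward and backward pass. The tensor shape and kernel size are arbitrary choices for illustration, and the code falls back to the CPU when no GPU is present.

```python
import torch

# Only the cuDNN flag is set; torch.use_deterministic_algorithms is left untouched.
torch.backends.cudnn.deterministic = True

device = "cuda" if torch.cuda.is_available() else "cpu"
pool = torch.nn.MaxPool3d(kernel_size=2)
x = torch.randn(1, 1, 4, 4, 4, device=device, requires_grad=True)

# Forward and backward both complete without a RuntimeError: the cuDNN flag
# does not turn on any determinism check for pooling operations.
pool(x).sum().backward()
print(x.grad.shape)
```

The pass finishing silently says nothing about whether the backward computation was deterministic; it only shows that this flag performs no such check.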
`torch.set_deterministic()`, on the other hand, affects all the normally-nondeterministic operations listed here (note that `set_deterministic` has been renamed to `use_deterministic_algorithms` in 1.8): https://pytorch.org/docs/stable/generated/torch.use_deterministic_algorithms.html?highlight=use_deterministic#torch.use_deterministic_algorithms
As the documentation states, some of the listed operations don't have a deterministic implementation. So if `torch.use_deterministic_algorithms(True)` is set, they will throw an error.
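To see that behavior directly, here is a small sketch assuming PyTorch 1.8 or later. The helper name `maxpool3d_backward_status` is made up for this example. It turns deterministic mode on, runs a `MaxPool3d` backward pass, and reports whether a RuntimeError was raised.

```python
import torch


def maxpool3d_backward_status(device: str) -> str:
    """Run MaxPool3d forward/backward in deterministic mode and report the outcome."""
    previous = torch.are_deterministic_algorithms_enabled()
    torch.use_deterministic_algorithms(True)
    try:
        x = torch.randn(1, 1, 4, 4, 4, device=device, requires_grad=True)
        torch.nn.MaxPool3d(kernel_size=2)(x).sum().backward()
        return "ok"
    except RuntimeError as err:
        # Raised when an operation has no deterministic implementation.
        return f"RuntimeError: {err}"
    finally:
        # Restore the global setting so the rest of the program is unaffected.
        torch.use_deterministic_algorithms(previous)


device = "cuda" if torch.cuda.is_available() else "cpu"
status = maxpool3d_backward_status(device)
print(device, status)
```

On the CPU the pass should complete normally. On a CUDA device, the documentation linked above lists `MaxPool3d` with a gradient-requiring tensor among the operations that raise, though the exact list may differ between PyTorch versions.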
If you need to use nondeterministic operations like `torch.nn.MaxPool3d`, then, at the moment, there is no way for your training process to be deterministic, unless you write a custom deterministic implementation yourself. Or you could open a GitHub issue requesting a deterministic implementation: https://github.com/pytorch/pytorch/issues
In addition, you might want to check out this page: https://pytorch.org/docs/stable/notes/randomness.html
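That page also covers seeding the random number generators, which is a separate concern from nondeterministic kernels. A rough sketch follows; `seed_everything` is a hypothetical helper name, not a PyTorch function, and NumPy seeding is left out here to keep the example dependency-free.

```python
import random

import torch


def seed_everything(seed: int) -> None:
    """Seed Python's and PyTorch's random number generators."""
    random.seed(seed)
    torch.manual_seed(seed)  # seeds the CPU generator and the CUDA generators


seed_everything(0)
first = torch.randn(3)
seed_everything(0)
second = torch.randn(3)
print(torch.equal(first, second))
```

Seeding makes random draws repeatable, but it does not make an operation such as the CUDA backward pass of `MaxPool3d` deterministic.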

- this is gold: `As the documentation states, some of the listed operations don't have a deterministic implementation. So if torch.use_deterministic_algorithms(True) is set, they will throw an error.` Thanks I was confused about the errors I saw at one point. – Charlie Parker Jul 30 '21 at 15:29
- if you're looking for the first line of the answer in the docs: https://pytorch.org/docs/stable/backends.html#torch.backends.cudnn.deterministic "A bool that, if True, causes cuDNN to only use deterministic convolution algorithms." – user1881282 Jan 18 '22 at 09:18