
The PyTorch documentation says that when using cuDNN as the backend for a convolution, one has to set two options to make the implementation deterministic: torch.backends.cudnn.deterministic = True and torch.backends.cudnn.benchmark = False. Is this because of the way weights are initialized?

Take a look at the Reproducibility section in the [cuDNN docs](https://docs.nvidia.com/deeplearning/sdk/cudnn-developer-guide/index.html#reproducibility). – Berriel Jul 01 '20 at 01:35

1 Answer


No, this has nothing to do with weight initialization. By default, cuDNN may pick nondeterministic algorithms for operations such as convolutions (some use atomic additions, whose floating-point accumulation order varies between runs). When torch.backends.cudnn.deterministic is set to True, cuDNN restricts itself to deterministic algorithms, meaning that given the same input and parameters, the output will always be bitwise identical. This is useful when you need reproducible results, such as when debugging or when comparing different model architectures.

However, deterministic algorithms can come at a cost in performance, as some of the optimizations that make cuDNN fast are not compatible with determinism; setting torch.backends.cudnn.deterministic to True may therefore slow down training. The second flag, torch.backends.cudnn.benchmark = False, disables cuDNN's autotuner, which otherwise benchmarks several algorithms for each input shape and picks the fastest one; since the winner can differ between runs, the autotuner is another source of run-to-run variation.
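A minimal sketch of a reproducible setup, assuming PyTorch is installed and a CUDA GPU is available (the flags and functions below are the standard torch APIs; the seed value and model shapes are arbitrary choices for illustration):

```python
import torch

# Seed the RNGs that drive weight initialization, dropout, shuffling, etc.
torch.manual_seed(0)

# Restrict cuDNN to deterministic algorithms.
torch.backends.cudnn.deterministic = True

# Disable the autotuner so the same algorithm is chosen every run.
torch.backends.cudnn.benchmark = False

# With the flags above set before any CUDA work, two identically seeded
# runs of a convolution (forward and backward) should produce bitwise
# identical weights, outputs, and gradients.
if torch.cuda.is_available():
    conv = torch.nn.Conv2d(3, 8, kernel_size=3).cuda()
    x = torch.randn(1, 3, 32, 32, device="cuda")
    conv(x).sum().backward()
```

Note that these two flags only cover cuDNN; other CUDA kernels can still be nondeterministic, which is why newer PyTorch versions also offer torch.use_deterministic_algorithms(True) as a stricter, framework-wide switch.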