
I'm running a script with a fixed seed, and results reproduce on consecutive runs, but running the same script with the same seed produces different output a few days later. I'm only getting short-term reproducibility, which is strange. For reproducibility my script already includes the following statements:

import random

import numpy as np
import torch

torch.backends.cudnn.benchmark = False     # stop cuDNN from autotuning kernels per input size
torch.backends.cudnn.deterministic = True  # force deterministic cuDNN kernels
torch.use_deterministic_algorithms(True)

random.seed(args.seed)        # Python RNG
np.random.seed(args.seed)     # NumPy RNG
torch.manual_seed(args.seed)  # PyTorch CPU RNG

I also checked the sequence of instance ids produced by the RandomSampler for the train DataLoader, and it is identical across runs. I have also set num_workers=0 in the DataLoader, as in the sketch below. What could be causing the output to change?
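
For reference, here is a minimal sketch of the DataLoader setup described above, with the sampler's RNG pinned through an explicit generator (the dataset and batch size are illustrative, not from my actual script):

import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(100).float())
g = torch.Generator()
g.manual_seed(0)  # args.seed in the snippets above
# shuffle=True uses a RandomSampler driven by this generator, so the
# sequence of instance ids is fixed for a fixed seed
loader = DataLoader(dataset, batch_size=8, shuffle=True, num_workers=0, generator=g)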

2 Answers


PyTorch is not fully deterministic out of the box. Even with a fixed seed, some PyTorch operations can behave differently from one run to another and diverge from previous results. This comes from nondeterministic algorithm choices, CUDA kernel behavior, and backward-pass optimizations.

This is a good read: https://pytorch.org/docs/stable/notes/randomness.html

The above page lists which operations are nondeterministic. Disabling their use is generally discouraged, since the deterministic alternatives are usually slower, but it can be done with:

torch.use_deterministic_algorithms(mode=True)

Note that this also restricts which operations can be used: an op that has no deterministic implementation will raise a RuntimeError instead of running.
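
Here is a minimal sketch of that setup, assuming a reasonably recent PyTorch build; the environment variable is only needed for deterministic cuBLAS on CUDA 10.2+, as the notes page above explains:

import os

# must be set before cuBLAS is used; required for determinism on CUDA >= 10.2
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

# raise a RuntimeError whenever an op without a deterministic
# implementation runs, instead of letting it silently diverge
torch.use_deterministic_algorithms(True)

# PyTorch >= 1.11 can warn instead of erroring, which helps locate
# the offending ops without killing the run:
# torch.use_deterministic_algorithms(True, warn_only=True)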

A Kareem

torch.cuda.manual_seed(args.seed)      # seeds the current GPU
torch.cuda.manual_seed_all(args.seed)  # seeds all GPUs, for multi-GPU runs

Try adding these to your current reproducibility settings.
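
For convenience, here is a sketch that folds these into the settings from the question; seed_everything is just an illustrative helper name, not a PyTorch API:

import random

import numpy as np
import torch

def seed_everything(seed: int) -> None:
    random.seed(seed)                       # Python RNG
    np.random.seed(seed)                    # NumPy RNG
    torch.manual_seed(seed)                 # PyTorch CPU RNG
    torch.cuda.manual_seed(seed)            # current GPU
    torch.cuda.manual_seed_all(seed)        # all GPUs, for multi-GPU runs
    torch.backends.cudnn.benchmark = False  # disable cuDNN autotuning
    torch.backends.cudnn.deterministic = True

seed_everything(42)  # args.seed in the question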

starriet