
I call the same model on the same input twice in a row and don't get the same result. The model has nn.GRU layers, so I suspect it has some internal state that should be reset before the second run?

How do I reset the RNN hidden state so it is the same as if the model had just been loaded?

UPDATE:

Some context:

I'm trying to run the model from here:

https://github.com/erogol/WaveRNN/blob/master/models/wavernn.py#L93

I'm calling generate:

https://github.com/erogol/WaveRNN/blob/master/models/wavernn.py#L148

Here it actually has some code that uses PyTorch's random generator:

https://github.com/erogol/WaveRNN/blob/master/models/wavernn.py#L200

https://github.com/erogol/WaveRNN/blob/master/utils/distribution.py#L110

https://github.com/erogol/WaveRNN/blob/master/utils/distribution.py#L129

I have placed the following (I'm running the code on CPU):

torch.manual_seed(0)
torch.cuda.manual_seed_all(0)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
np.random.seed(0)

in

https://github.com/erogol/WaveRNN/blob/master/utils/distribution.py

after all imports.

I have checked the GRU weights between runs and they are the same:

https://github.com/erogol/WaveRNN/blob/master/models/wavernn.py#L153

I have also checked the logits and the sample between runs: the logits are the same, but the samples are not. So @Andrew Naguib seems to have been right about random seeding, but I'm not sure where the code that fixes the random seed should be placed:

https://github.com/erogol/WaveRNN/blob/master/models/wavernn.py#L200
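To illustrate the diagnosis: sampling draws from the global RNG, so even with identical logits, two consecutive calls produce different samples unless the RNG state is reset in between (a minimal illustration, not the repo's code):

import torch

logits = torch.tensor([[2.0, 1.0, 0.5]])
probs = torch.softmax(logits, dim=-1)

torch.manual_seed(0)
a = torch.multinomial(probs, num_samples=1)
b = torch.multinomial(probs, num_samples=1)  # RNG has advanced: may differ from a

torch.manual_seed(0)
c = torch.multinomial(probs, num_samples=1)  # RNG state restored: always equals a
print(a.item(), b.item(), c.item())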

UPDATE 2:

I have placed the seed initialization inside generate and now the results are consistent:

https://github.com/erogol/WaveRNN/blob/master/models/wavernn.py#L148
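In other words, the change amounts to something like this (a sketch only; the actual generate signature in the repo is different and has changed since):

def generate(self, mels):
    # Reset all relevant RNGs at the start of every call, so each
    # call consumes an identical stream of random numbers.
    torch.manual_seed(0)
    torch.cuda.manual_seed_all(0)
    np.random.seed(0)
    # ... rest of the generation loop unchanged ...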

mrgloom
  • I don't understand what you mean by 'I have placed seed init inside generate'. Can you explain that please? – mpourreza Jan 28 '20 at 01:14
  • @mpourreza Inside the `generate` function; it seems the master code has changed and the links are no longer valid. – mrgloom Jan 28 '20 at 08:09
  • Does this answer your question? [Training PyTorch models on different machines leads to different results](https://stackoverflow.com/questions/67511658/training-pytorch-models-on-different-machines-leads-to-different-results) – iacob May 18 '21 at 07:41

2 Answers


I believe this is very likely related to random seeding. To ensure reproducible results (as stated in the PyTorch notes on randomness), you have to seed torch like this:

import torch
torch.manual_seed(0)

You also need to configure the cuDNN backend:

torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

If you're using numpy, you could also do:

import numpy as np
np.random.seed(0)

However, the docs warn you:

Deterministic mode can have a performance impact, depending on your model.


A snippet I regularly use, and which has worked very well for reproducing results, is:

# imports
import random
import numpy as np
import torch
from torch.backends import cudnn
# ...

# Set random seed
if args.random_seed is not None:
    # Seed every pseudorandom number generator involved
    # (Python, NumPy, and PyTorch) to ensure reproducible results.
    random.seed(args.random_seed)
    np.random.seed(args.random_seed)
    torch.manual_seed(args.random_seed)
    # https://pytorch.org/docs/master/notes/randomness.html#cudnn
    if not args.cpu_only:
        torch.cuda.manual_seed(args.random_seed)
        cudnn.deterministic = True
        cudnn.benchmark = False
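For completeness, this snippet assumes an args namespace along these lines (the random_seed and cpu_only flags are this script's own conventions, not a library API):

import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--random_seed', type=int, default=None)
parser.add_argument('--cpu_only', action='store_true')
args = parser.parse_args()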
ndrwnaguib

If your model defines an init_hidden() method, you can call model.init_hidden() to reset the RNN hidden state:

def init_hidden(self):
    # Reinitialize the hidden state to zeros
    # (Variable is deprecated; a plain tensor works since PyTorch 0.4)
    return torch.zeros(self.num_layers, self.batch_size, self.hidden_size)

So, before calling the same model on the same data the next time, you can call model.init_hidden() to reset the hidden state to its initial values (a GRU has only a hidden state; LSTMs additionally have a cell state).

This clears out the history, in other words the hidden state the model accumulated while running on the data the first time; the learned weights are untouched.
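For concreteness, a minimal sketch of this pattern (assuming a small GRU wrapper; the names and sizes are made up):

import torch
import torch.nn as nn

class GRUModel(nn.Module):
    def __init__(self, input_size, hidden_size, num_layers):
        super().__init__()
        self.num_layers = num_layers
        self.hidden_size = hidden_size
        self.gru = nn.GRU(input_size, hidden_size, num_layers)

    def init_hidden(self, batch_size):
        # A fresh all-zero hidden state, as if the model were just loaded
        return torch.zeros(self.num_layers, batch_size, self.hidden_size)

    def forward(self, x, hidden):
        return self.gru(x, hidden)

model = GRUModel(input_size=10, hidden_size=20, num_layers=2)
x = torch.randn(5, 3, 10)                  # (seq_len, batch, input_size)
hidden = model.init_hidden(batch_size=3)   # reset before each independent run
output, hidden = model(x, hidden)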

Anubhav Singh