In order to access a model's parameters in pytorch, I saw two methods:
using state_dict
and using parameters()
I wonder what's the difference, or if one is good practice and the other is bad practice.
Thanks
parameters() only gives the module parameters, i.e. weights and biases. It returns an iterator over the module's parameters.
You can check the list of the parameters as follows:
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name)
On the other hand, state_dict returns a dictionary containing the whole state of the module. If you check its source code, you'll see it collects not just the parameters but also buffers, etc. Both parameters and persistent buffers (e.g. running averages) are included, and the keys are the corresponding parameter and buffer names.
You can check all the keys that state_dict contains using:
model.state_dict().keys()
For example, in state_dict you'll find entries like bn1.running_mean and bn1.running_var, which are not present in .parameters().
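To see this concretely, here's a minimal sketch; the small Sequential model below is just an illustration, not part of the original answer:

import torch.nn as nn

# A tiny model with a BatchNorm layer, used only to illustrate the difference
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))

# named_parameters() yields only weights and biases
print([name for name, _ in model.named_parameters()])
# ['0.weight', '0.bias', '1.weight', '1.bias']

# state_dict() additionally contains persistent buffers
print(list(model.state_dict().keys()))
# ['0.weight', '0.bias', '1.weight', '1.bias',
#  '1.running_mean', '1.running_var', '1.num_batches_tracked']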
If you only want to access parameters, you can simply use .parameters(), while for purposes like saving and loading the model, as in transfer learning, you'll need to save the state_dict, not just the parameters.
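For instance, a typical save/load cycle looks roughly like this (the file name and the model are placeholders, reusing the sketch above):

import torch

# Save only the learned state, not the whole model object
torch.save(model.state_dict(), 'checkpoint.pth')

# Later: rebuild the same architecture, then load the saved state
model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8))
model.load_state_dict(torch.load('checkpoint.pth'))
model.eval()  # put BatchNorm/Dropout layers in inference mode before evaluating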
Besides the differences in @kHarshit's answer, the requires_grad attribute of the trainable tensors yielded by net.parameters() is True, while it is False for the tensors in net.state_dict(), because state_dict() returns detached copies of the tensors by default.
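A quick way to verify this, reusing the same placeholder model from above:

# Tensors from parameters() are the live leaf tensors and track gradients
print(all(p.requires_grad for p in model.parameters()))           # True

# state_dict() returns detached copies by default, so requires_grad is False
print(any(t.requires_grad for t in model.state_dict().values()))  # False

# keep_vars=True returns the original tensors instead of detached copies
print(model.state_dict(keep_vars=True)['0.weight'].requires_grad)  # True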