
I create a CNN model with global_model = CNNMnist(args=args), move it to the device, and set it to train mode. I then train my local models, collect the local_weights, and average them to get the updated global_model. When I iterate over global_model.parameters(), item.grad is None for every parameter. When I do the same thing for the local models, I get the expected output. What am I doing wrong?

global_model.to(device)
global_model.train()
...................
global_weights = average_weights(local_weights)
global_model.load_state_dict(global_weights)
last_update = []
for item in global_model.parameters():
    last_update.append(copy.deepcopy(item.grad))
    print(item.grad)

Output: None None None None None None None None

Any help would be appreciated.

Samiur Rahman

1 Answer


You are looking at parameters whose values you loaded from a state_dict; gradients are not stored there. Try printing .grad right after your call to backward() and before zero_grad().
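A minimal sketch of that behaviour, assuming a small nn.Linear as a hypothetical stand-in for CNNMnist (which is not shown in the question) and random input data. It only illustrates that load_state_dict() leaves .grad untouched and that backward() is what populates it:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for CNNMnist; any nn.Module behaves the same way here.
model = nn.Linear(4, 2)

# A state_dict holds parameter and buffer values only, never gradients.
state = {k: v.clone() for k, v in model.state_dict().items()}
model.load_state_dict(state)
grads_after_load = [p.grad for p in model.parameters()]  # every entry is None

# Gradients appear only once backward() has run through this model.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss = model(torch.randn(8, 4)).sum()
loss.backward()

# Copy the gradients here, before zero_grad() resets them.
grads_after_backward = [p.grad.clone() for p in model.parameters()]

optimizer.step()
optimizer.zero_grad()
```

In the question's loop, the global model never goes through backward(), so there is nothing for item.grad to hold.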

Shai
  • Thank you for your reply, that works. I do have another question, though: `backward()` is only called in the local_model updates. The global_model is updated by averaging the updated weights I got from the local models. How do I get the gradients of the global_model if I never call `backward()` on global_model? – Samiur Rahman May 25 '21 at 10:00
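One possible approach to the follow-up, offered as a sketch rather than an answer from the thread: since the global model never sees backward(), a true gradient does not exist for it, but the change in its weights across one averaging round can serve as a substitute (often called a pseudo-gradient). The example below assumes an nn.Linear in place of CNNMnist and simulates the result of average_weights(local_weights) with a fixed shift, because the real function is not shown:

```python
import copy
import torch
import torch.nn as nn

# Hypothetical stand-in for the global CNNMnist model.
global_model = nn.Linear(4, 2)

# Snapshot the global weights before the averaging round.
old_state = copy.deepcopy(global_model.state_dict())

# Assumption: stands in for average_weights(local_weights); here every
# weight is simply shifted by -0.1 to simulate one round of local training.
new_state = {k: v - 0.1 * torch.ones_like(v) for k, v in old_state.items()}
global_model.load_state_dict(new_state)

# Pseudo-gradient: old weights minus new weights, one tensor per parameter.
last_update = [
    old_state[name] - param.detach()
    for name, param in global_model.named_parameters()
]
```

Each entry of last_update has the same shape as the matching parameter, so it can be stored the same way the question stores item.grad.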