I am trying to implement Bayes by Backprop. While computing the variational posterior, the resulting accuracy changes drastically depending on whether I pass the parameter itself or parameter.data as input.
self.w_post = Normal(self.w_mu.data, torch.log(1+torch.exp(self.w_rho)))
self.b_post = Normal(self.b_mu.data, torch.log(1+torch.exp(self.b_rho)))
self.log_post = self.w_post.log_prob(self.w).sum() + self.b_post.log_prob(self.b).sum()
This block works, while the following one doesn't:
self.w_post = Normal(self.w_mu, torch.log(1+torch.exp(self.w_rho)))
self.b_post = Normal(self.b_mu, torch.log(1+torch.exp(self.b_rho)))
self.log_post = self.w_post.log_prob(self.w).sum() + self.b_post.log_prob(self.b).sum()
Since w_post and b_post aren't themselves parameters, why does this affect my results? This snippet of code lives in the forward function of a custom-defined linear layer.
Also, the value of log_posterior does not change across epochs. Could it have something to do with the seed?
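For reference, here is a minimal sketch of how such a layer's forward pass is often written. The names w_mu, w_rho, b_mu, b_rho, w_post, b_post and log_post match the snippets above; the class name BayesianLinear, the rho initialisation, the standard-normal prior, and the rsample-based weight draw are assumptions for illustration, not taken from the original post. The key point is that the posterior is built from the parameters directly (no .data), so gradients can flow back to w_mu and w_rho through log_post.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class BayesianLinear(nn.Module):
    """Minimal sketch of a Bayes-by-Backprop linear layer (assumed structure)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Variational posterior parameters for weights and biases.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))
        # Fixed standard-normal prior (an assumption; the prior varies by implementation).
        self.prior = Normal(0.0, 1.0)

    def forward(self, x):
        # Softplus of rho gives a positive standard deviation.
        w_sigma = torch.log1p(torch.exp(self.w_rho))
        b_sigma = torch.log1p(torch.exp(self.b_rho))
        # Build the posterior from the parameters themselves (no .data),
        # so log_post stays connected to w_mu / w_rho in the autograd graph.
        self.w_post = Normal(self.w_mu, w_sigma)
        self.b_post = Normal(self.b_mu, b_sigma)
        # Reparameterised sample of the weights (rsample keeps the graph).
        self.w = self.w_post.rsample()
        self.b = self.b_post.rsample()
        self.log_post = self.w_post.log_prob(self.w).sum() + self.b_post.log_prob(self.b).sum()
        self.log_prior = self.prior.log_prob(self.w).sum() + self.prior.log_prob(self.b).sum()
        return nn.functional.linear(x, self.w, self.b)
```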