I am trying to implement Bayes by Backprop. While computing the variational posterior, the resulting accuracy changes drastically depending on whether I pass the parameter itself or parameter.data as input.
self.w_post = Normal(self.w_mu.data, torch.log(1+torch.exp(self.w_rho)))
self.b_post = Normal(self.b_mu.data, torch.log(1+torch.exp(self.b_rho)))
self.log_post = self.w_post.log_prob(self.w).sum() + self.b_post.log_prob(self.b).sum()
This block works, while the following one doesn't:
self.w_post = Normal(self.w_mu, torch.log(1+torch.exp(self.w_rho)))
self.b_post = Normal(self.b_mu, torch.log(1+torch.exp(self.b_rho)))
self.log_post = self.w_post.log_prob(self.w).sum() + self.b_post.log_prob(self.b).sum()
Since w_post and b_post aren't themselves parameters, why does this affect my results? This snippet of code lives in the forward function of a custom-defined linear layer.
Also, the value of log_posterior does not change across epochs. Could it have something to do with the seed?
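For reference, here is a minimal sketch of how such a layer's forward pass is often written. The names w_mu, w_rho, b_mu, b_rho, w_post, b_post and log_post match the snippets above; the class name BayesianLinear, the rho initialisation, the standard-normal prior, and the rsample-based weight draw are assumptions for illustration, not taken from the original post. The key point is that the posterior is built from the parameters directly (no .data), so gradients can flow back to w_mu and w_rho through log_post.

```python
import torch
import torch.nn as nn
from torch.distributions import Normal

class BayesianLinear(nn.Module):
    """Minimal sketch of a Bayes-by-Backprop linear layer (assumed structure)."""
    def __init__(self, in_features, out_features):
        super().__init__()
        # Variational posterior parameters for weights and biases.
        self.w_mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.w_rho = nn.Parameter(torch.full((out_features, in_features), -3.0))
        self.b_mu = nn.Parameter(torch.zeros(out_features))
        self.b_rho = nn.Parameter(torch.full((out_features,), -3.0))
        # Fixed standard-normal prior (an assumption; the prior varies by implementation).
        self.prior = Normal(0.0, 1.0)

    def forward(self, x):
        # Softplus of rho gives a positive standard deviation.
        w_sigma = torch.log1p(torch.exp(self.w_rho))
        b_sigma = torch.log1p(torch.exp(self.b_rho))
        # Build the posterior from the parameters themselves (no .data),
        # so log_post stays connected to w_mu / w_rho in the autograd graph.
        self.w_post = Normal(self.w_mu, w_sigma)
        self.b_post = Normal(self.b_mu, b_sigma)
        # Reparameterised sample of the weights (rsample keeps the graph).
        self.w = self.w_post.rsample()
        self.b = self.b_post.rsample()
        self.log_post = self.w_post.log_prob(self.w).sum() + self.b_post.log_prob(self.b).sum()
        self.log_prior = self.prior.log_prob(self.w).sum() + self.prior.log_prob(self.b).sum()
        return nn.functional.linear(x, self.w, self.b)
```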