tf.distributions provides several distributions. My network should predict the parameters of a probability density function (a policy, in my case), and the loss then depends on those parameters. I am asking about the Beta distribution in particular, since that is the one I intend to use. E.g.:
loss = tf.distributions.Beta(concentration1=concentration1, concentration0=concentration0).prob(some_value) / tf.distributions.Beta(concentration1=given_concentration1, concentration0=given_concentration0).prob(some_value) * advantage
trainstep = tf.train.AdamOptimizer().minimize(loss)
Here concentration1 and concentration0 are the outputs of a network that I want to optimize (assume the other parameters are given for the sake of this question). When I call session.run(trainstep), will this backpropagate into the network? I can't find any resources that say either way.
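To make the quantity concrete, here is a minimal plain-Python sketch of the loss I mean, with no TensorFlow involved. The helper names (beta_pdf, surrogate_loss, finite_diff_grad) are made up for illustration, and I am assuming the usual convention that concentration1 is the exponent parameter of x and concentration0 that of (1 - x). It only illustrates that the loss varies smoothly with the two concentrations, which is what I would hope the optimizer can differentiate through; it says nothing about what TensorFlow actually does internally.

```python
import math


def beta_pdf(x, concentration1, concentration0):
    """Density of a Beta distribution at x in (0, 1).

    Assumed convention: concentration1 pairs with x, concentration0 with (1 - x).
    """
    log_norm = (math.lgamma(concentration1 + concentration0)
                - math.lgamma(concentration1)
                - math.lgamma(concentration0))
    log_unnorm = ((concentration1 - 1.0) * math.log(x)
                  + (concentration0 - 1.0) * math.log(1.0 - x))
    return math.exp(log_norm + log_unnorm)


def surrogate_loss(c1, c0, given_c1, given_c0, x, advantage):
    """Ratio of the new density to the given density, scaled by the advantage."""
    return beta_pdf(x, c1, c0) / beta_pdf(x, given_c1, given_c0) * advantage


def finite_diff_grad(c1, c0, given_c1, given_c0, x, advantage, eps=1e-6):
    """Central-difference estimate of d(loss)/d(c1) and d(loss)/d(c0)."""
    d_c1 = (surrogate_loss(c1 + eps, c0, given_c1, given_c0, x, advantage)
            - surrogate_loss(c1 - eps, c0, given_c1, given_c0, x, advantage)) / (2 * eps)
    d_c0 = (surrogate_loss(c1, c0 + eps, given_c1, given_c0, x, advantage)
            - surrogate_loss(c1, c0 - eps, given_c1, given_c0, x, advantage)) / (2 * eps)
    return d_c1, d_c0


if __name__ == "__main__":
    # Hypothetical values standing in for the network outputs and the given parameters.
    print(surrogate_loss(2.0, 2.0, 2.0, 2.0, 0.5, 1.0))
    print(finite_diff_grad(2.0, 2.0, 2.0, 2.0, 0.5, 1.0))
```

If the numerical derivatives are nonzero, as I would expect here, the loss depends on the concentrations in a differentiable way, so the remaining question is only whether tf.distributions.Beta exposes that dependence to the optimizer.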