Posterior predictive distribution

In Bayesian statistics, the posterior predictive distribution is the distribution of possible unobserved values conditional on the observed values.

Given a set of $N$ i.i.d. observations $\mathbf{X} = \{x_{1}, \dots, x_{N}\}$, a new value $\tilde{x}$ will be drawn from a distribution that depends on a parameter $\theta \in \Theta$, where $\Theta$ is the parameter space:

$$p(\tilde{x} \mid \theta)$$

It may seem tempting to plug in a single best estimate $\hat{\theta}$ for $\theta$, but this ignores the uncertainty about $\theta$. Because a source of uncertainty is ignored, the resulting predictive distribution will be too narrow. Put another way, extreme values of $\tilde{x}$ will be assigned a lower probability than they would be if the uncertainty in the parameters, as given by their posterior distribution, were accounted for.

A posterior predictive distribution accounts for uncertainty about $\theta$. The posterior distribution of possible $\theta$ values depends on $\mathbf{X}$:

$$p(\theta \mid \mathbf{X})$$

The posterior predictive distribution of $\tilde{x}$ given $\mathbf{X}$ is then calculated by marginalizing the distribution of $\tilde{x}$ given $\theta$ over the posterior distribution of $\theta$ given $\mathbf{X}$:

$$p(\tilde{x} \mid \mathbf{X}) = \int_{\Theta} p(\tilde{x} \mid \theta)\, p(\theta \mid \mathbf{X})\, \mathrm{d}\theta$$
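When the integral has no convenient closed form, it is commonly approximated by simulation: draw $\theta$ from the posterior, then draw $\tilde{x}$ from the distribution given that $\theta$. The following is a minimal sketch of this two-step sampling, not a definitive implementation. The model is an assumption made only for illustration: a normal likelihood with known standard deviation `sigma` and a conjugate normal prior on $\theta$. The function names, the prior settings, and the data values are all hypothetical.

```python
import random
import statistics


def normal_posterior(xs, sigma, prior_mean, prior_sd):
    """Posterior mean and variance of theta under the assumed model:
    theta ~ Normal(prior_mean, prior_sd^2), x_i | theta ~ Normal(theta, sigma^2).
    """
    n = len(xs)
    precision = 1.0 / prior_sd**2 + n / sigma**2
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_sd**2 + sum(xs) / sigma**2)
    return post_mean, post_var


def sample_posterior_predictive(xs, sigma, prior_mean, prior_sd, n_draws, rng):
    """Monte Carlo draws from p(x_tilde | X) by sampling theta, then x_tilde."""
    post_mean, post_var = normal_posterior(xs, sigma, prior_mean, prior_sd)
    post_sd = post_var ** 0.5
    draws = []
    for _ in range(n_draws):
        theta = rng.gauss(post_mean, post_sd)  # theta ~ p(theta | X)
        draws.append(rng.gauss(theta, sigma))  # x_tilde ~ p(x_tilde | theta)
    return draws


# Hypothetical data and prior settings, chosen only for illustration.
rng = random.Random(0)
x_obs = [4.8, 5.1, 5.6, 4.9, 5.3]
draws = sample_posterior_predictive(
    x_obs, sigma=1.0, prior_mean=0.0, prior_sd=10.0, n_draws=100_000, rng=rng
)

# Sample mean and variance of the simulated posterior predictive draws.
print(statistics.fmean(draws), statistics.pvariance(draws))
```

Under this assumed model the sample variance of the draws should be close to `sigma**2` plus the posterior variance of $\theta$, because each draw carries both the sampling noise and the remaining uncertainty about the parameter.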

Because it accounts for uncertainty about $\theta$, the posterior predictive distribution will in general be wider than a predictive distribution that plugs in a single best estimate for $\theta$.
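The widening can be made explicit in a simple conjugate case. As an illustration only, assume a normal likelihood with known variance $\sigma^2$ and a normal prior $\theta \sim \mathcal{N}(\mu_0, \tau_0^2)$; the symbols $\sigma^2$, $\mu_0$, $\tau_0^2$, $\mu_N$ and $\tau_N^2$ are introduced here for the example. Taking the posterior mean $\mu_N$ as the plug-in estimate, the two predictive distributions are:

```latex
\begin{aligned}
\theta \mid \mathbf{X} &\sim \mathcal{N}(\mu_N, \tau_N^2),
\qquad \tau_N^2 = \left(\frac{1}{\tau_0^2} + \frac{N}{\sigma^2}\right)^{-1},
\qquad \mu_N = \tau_N^2 \left(\frac{\mu_0}{\tau_0^2} + \frac{1}{\sigma^2}\sum_{i=1}^{N} x_i\right) \\
\tilde{x} \mid \hat{\theta} = \mu_N &\sim \mathcal{N}(\mu_N, \sigma^2)
\qquad \text{(plug-in)} \\
\tilde{x} \mid \mathbf{X} &\sim \mathcal{N}(\mu_N, \sigma^2 + \tau_N^2)
\qquad \text{(posterior predictive)}
\end{aligned}
```

Both distributions share the same mean, but the posterior predictive variance exceeds the plug-in variance by $\tau_N^2$, the posterior variance of $\theta$. Since $\tau_N^2 \leq \sigma^2 / N$, this extra term shrinks toward zero as $N$ grows, so in this example the two distributions agree in the limit of many observations.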

This article is issued from Wikipedia. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.