I am studying zero-inflated temporal count data. I have built a Stan model that handles the zero inflation with an if statement in the model block, as advised in the Stan Reference Guide, e.g.:
    model {
      for (n in 1:N) {
        if (y[n] == 0)
          target += log_sum_exp(bernoulli_lpmf(1 | theta),
                                bernoulli_lpmf(0 | theta)
                                  + poisson_lpmf(y[n] | lambda));
        else
          target += bernoulli_lpmf(0 | theta)
                      + poisson_lpmf(y[n] | lambda);
      }
    }
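
To make the marginalisation concrete, here is a minimal plain-Python sketch of the same log-likelihood. It is not Stan or pymc3 code. The function names (log_sum_exp, poisson_lpmf, zip_lpmf, zip_log_likelihood) are ones I made up to mirror the Stan block, and it assumes theta is the probability of a structural zero with 0 < theta < 1 and lam > 0.

```python
import math


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def poisson_lpmf(y, lam):
    """Log probability mass of Poisson(lam) at the count y."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)


def zip_lpmf(y, theta, lam):
    """Zero-inflated Poisson log-pmf with the discrete indicator summed out.

    Assumption: theta is the probability of a structural zero.
    """
    if y == 0:
        # A zero is either structural or a Poisson draw that happened to be 0.
        return log_sum_exp(math.log(theta),
                           math.log1p(-theta) + poisson_lpmf(0, lam))
    # A positive count can only come from the Poisson component.
    return math.log1p(-theta) + poisson_lpmf(y, lam)


def zip_log_likelihood(ys, theta, lam):
    """Sum of the marginal log-pmf over all observations."""
    return sum(zip_lpmf(y, theta, lam) for y in ys)
```

For y = 0 this reduces to log(theta + (1 - theta) * exp(-lam)), which is the quantity the log_sum_exp branch in the Stan model accumulates.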
This if statement is necessary because Stan samples with NUTS, which cannot handle discrete parameters, so the discrete indicator is marginalised out instead of being sampled. I do not have much experience with pymc3, but my understanding is that it can use a Gibbs update step to sample the discrete Bernoulli indicator and then, conditional on that indicator, perform a Metropolis or NUTS update for the parameters that enter the Poisson likelihood.
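
The alternating scheme I have in mind can be sketched in plain Python as follows. This is only an illustration of the idea, not how pymc3 implements its step methods. The names sample_indicators, conditional_log_density and metropolis_step_lam are hypothetical, theta is held fixed, a flat prior on lam is assumed, and a random-walk Metropolis step on log(lam) stands in for the NUTS update.

```python
import math
import random


def sample_indicators(ys, theta, lam, rng):
    """Gibbs step: draw z[n] = 1 (structural zero) or 0 (Poisson) per observation."""
    # Posterior probability that an observed zero is structural.
    p_structural = theta / (theta + (1.0 - theta) * math.exp(-lam))
    # Positive counts can only come from the Poisson component, so z = 0 there.
    return [int(rng.random() < p_structural) if y == 0 else 0 for y in ys]


def conditional_log_density(ys, zs, theta, lam):
    """Log p(y, z | theta, lam): the target for the continuous update given z."""
    total = 0.0
    for y, z in zip(ys, zs):
        if z == 1:
            total += math.log(theta)
        else:
            total += (math.log1p(-theta)
                      + y * math.log(lam) - lam - math.lgamma(y + 1))
    return total


def metropolis_step_lam(ys, zs, theta, lam, rng, step=0.2):
    """Random-walk Metropolis on log(lam), assuming a flat prior on lam."""
    proposal = lam * math.exp(step * rng.gauss(0.0, 1.0))
    # log(proposal / lam) is the Jacobian term for proposing on the log scale.
    log_ratio = (conditional_log_density(ys, zs, theta, proposal)
                 - conditional_log_density(ys, zs, theta, lam)
                 + math.log(proposal) - math.log(lam))
    accept = rng.random() < math.exp(min(0.0, log_ratio))
    return proposal if accept else lam


# One sweep on made-up data: Gibbs for the indicators, then Metropolis for lam.
rng = random.Random(0)
ys = [0, 0, 3, 1, 0, 2]
zs = sample_indicators(ys, theta=0.3, lam=2.0, rng=rng)
lam = metropolis_step_lam(ys, zs, theta=0.3, lam=2.0, rng=rng)
```

Conditional on z, the observations flagged as structural zeros drop out of the Poisson term, so the update for lam only sees the remaining counts.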
My question is: can pymc3 be used in this way, sampling the discrete zero-inflation indicator directly while updating the continuous parameters with NUTS, and if so, how? If it can, is the performance significantly better than the Stan implementation above, which marginalises out the discrete random variable? Further, if pymc3 can only support a Gibbs + Metropolis update, is the change away from NUTS worth considering?