Evaluating log-likelihood of unseen data in rstan

Question

I understand I can calculate the log likelihood of each sample during sampling, e.g.

...

model {

  for (i in 1:N) {
    (y[i] - 1) ~ bernoulli(p[i, 2]);
  }

}

generated quantities {

  vector[N] log_lik;
  for (i in 1:N){
    log_lik[i] = bernoulli_lpmf((y[i] - 1) | p[i, 2]);
  }

}

After fitting, I can then extract log likelihood using the loo package:

log_lik_m <- extract_log_lik(stan_fit)

But I want to evaluate log likelihood of unseen data. This is possible in brms:

ll <- log_lik(fit_star, newdata = new_df)

But I would like to do this with rstan, since I can't easily define my model in brms (I am assuming).

For reference, I am trying to use Estimated LFO-CV to evaluate and compare my time-series model. (e.g. https://github.com/paul-buerkner/LFO-CV-paper/blob/master/sim_functions.R#L186)

(https://mc-stan.org/loo/articles/loo2-lfo.html)

One method is described in this post: https://medium.com/@alex.pavlakis/making-predictions-from-stan-models-in-r-3e349dfac1ed: You can write another Stan program with only `data` and `generated quantities` blocks, where `data` contains the already estimated posterior distributions. — dipetkov, Mar 27 '19 at 18:45

score 0 · Accepted Answer · answered Mar 27 '19 at 23:14

Thanks to the link from @dipetkov, I solved this myself. I didn't use the exact methods in the link, but came up with an alternative. You can call stan functions from R to get it to compute log likelihood for your model, even with unseen data (and its very fast!).

First, I put everything in my transformed parameters block into a function in stan's functions block. Then, I created a second function that wraps the first function, and evaluates the log likelihood for given observations and provided parameter estimates (I then removed my generated_quantities block). rstan has a function expose_stan_functions which adds all functions in the stan functions block to the R environment.

You can then call the log likelihood function you made to evaluate your model with any observations (previously seen or unseen), along with a set of parameter estimates.

Would be better if you could attach your code. – Amin Shn Feb 15 '22 at 09:04 — Amin Shn, Feb 15 '22 at 09:04

Evaluating log-likelihood of unseen data in rstan

1 Answers1