
I'm trying to implement a hierarchical mixture model in Stan that describes how performance on a task changes over time. In the model (see code below), there are three subject-level parameters that are assumed to be drawn from a mixture of two normals (`dperf_int`, `dperf_sd`, and `sf`). To implement these, I've simply assumed that the prior on each of these subject-level parameters is a mixture. There are also two parameters (`perf1_int` and `perf1_sd`) that are estimated at the group level only. For these, I've assumed that the observation itself is drawn from a mixture of normals.

data {
  int<lower=0> Nsubj; //Number of subjects (60)
  int<lower=0> Nobs;  //Number of observations per subject (10)
  matrix[Nsubj, Nobs] perf; //Matrix of performance observations
}

parameters {
  vector<lower=0,upper=1>[Nsubj] mixing_prop; // mixing proportion for each subject

  ordered[2] perf1_int; //mean performance at time 1 (estimated at group level only)
  vector<lower=0>[2] perf1_sd; //sd of performance at time 1 (estimated at group level only)

  ordered[2] dperf_int_mean; //mean of performance change intercept for each mixture
  vector<lower=0>[2] dperf_int_sd; //sd of performance change intercept for each mixture

  vector<lower=0>[2] dperf_sd_mean; //mean of performance change sd
  vector<lower=0>[2] dperf_sd_sd;   //sd of performance change sd

  vector[2] sf_mean; //mean self-feedback for each mixture
  vector<lower=0>[2] sf_sd; //sd of self-feedback for each mixture

  //subject level parameters
  vector[Nsubj] dperf_int;
  vector<lower=0>[Nsubj] dperf_sd;
  vector[Nsubj] sf;
}

model {
  real perf_change;

  for (i in 1:Nsubj) {
    vector[4] increments;

    increments[1] = log_mix(mixing_prop[i],
              normal_lpdf(dperf_int[i] | dperf_int_mean[1], dperf_int_sd[1]),
              normal_lpdf(dperf_int[i] | dperf_int_mean[2], dperf_int_sd[2]));

    increments[2] = log_mix(mixing_prop[i],
              normal_lpdf(dperf_sd[i] | dperf_sd_mean[1], dperf_sd_sd[1]),
              normal_lpdf(dperf_sd[i] | dperf_sd_mean[2], dperf_sd_sd[2]));

    increments[3] = log_mix(mixing_prop[i],
              normal_lpdf(sf[i] | sf_mean[1], sf_sd[1]),
              normal_lpdf(sf[i] | sf_mean[2], sf_sd[2]));

    increments[4] = log_mix(mixing_prop[i],
              normal_lpdf(perf[i, 1] | perf1_int[1], perf1_sd[1]),
              normal_lpdf(perf[i, 1] | perf1_int[2], perf1_sd[2]));

    target += log_sum_exp(increments);

    for (j in 2:Nobs) {
      perf_change = dperf_int[i] + perf[i, j - 1] * sf[i];
      perf[i, j] ~ normal(perf[i, j - 1] + perf_change, dperf_sd[i]);
    }
  }
}
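As a sanity check on the mixture arithmetic above: Stan's `log_mix(theta, lp1, lp2)` is just `log(theta * exp(lp1) + (1 - theta) * exp(lp2))`, evaluated stably on the log scale. A quick numerical sketch (in Python with SciPy, purely illustrative; the function name `log_mix` here is my own re-implementation, not a library call):

```python
import math
from scipy.special import logsumexp

def log_mix(theta, lp1, lp2):
    # Stan's log_mix: log(theta * exp(lp1) + (1 - theta) * exp(lp2)),
    # computed via a weighted log-sum-exp for numerical stability
    return logsumexp([lp1, lp2], b=[theta, 1 - theta])

# Compare against the direct (unstable) computation on arbitrary log densities
theta, lp1, lp2 = 0.3, -1.2, -0.7
direct = math.log(theta * math.exp(lp1) + (1 - theta) * math.exp(lp2))
assert abs(log_mix(theta, lp1, lp2) - direct) < 1e-12
```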

I'm pretty sure I've implemented the mixture of the group-level parameters correctly, as I've just followed what was in the Stan manual. However, I wanted to check that I've implemented the hierarchical mixture components of the model correctly. The model is quite inefficient: I've run 4 chains at 4000 iterations and many of the parameters have n_eff < 10. So I suspect I'm missing something.

Here is the code used to run the model:

library(tidyverse)
library(rstan)

#load data
load(file="wide_data.RData")

#prepare data for stan
model_data = list(
  Nsubj = dim(wide_data)[1],
  Nobs = 10,
  perf = as.matrix(select(wide_data, perf1:perf10))
)


fit = stan(file="stan_univariate_hierarchical_mixture.stan",
           data=model_data,cores=4,
           thin=10,
           warmup=1000,
           chains=4,
           iter=4000)

...and the data is here: https://www.dropbox.com/s/eqzw1lou6uba8i3/wide_data.RData?dl=0

Any help would be much appreciated!

Tim
  • I'd recommend Michael's case study on problems with fitting mixture models. Can you fit the model without the mixture prior? – Bob Carpenter Apr 10 '18 at 18:47
  • Thanks for the quick reply Bob, and for the referral. Michael's case study was quite helpful. The model didn't converge well without the priors; after increasing max_treedepth to 15 and adapt_delta to 0.99, it converged quite nicely. However, when I added the mixture on the priors back in, convergence was quite poor again. Of the four chains, three mix quite nicely, but the fourth is getting stuck pretty far out from the others (see traceplot here: https://www.dropbox.com/s/3qdwg2n9conhjq5/Rplot.png?dl=0). Just a note: I used a thinning interval of 10, so there were 1000 iter per chain. – Tim Apr 11 '18 at 03:21
  • Correction to the above: there were 4000 iterations per chain. – Tim Apr 11 '18 at 23:04
  • Are you ordering or something to identify the chains? Do their posteriors overlap? To deal with the tails, you can initialize closer to where you want to go or also add even stronger priors. – Bob Carpenter Apr 12 '18 at 04:56
  • Hi Bob, sorry for the delay. I am ordering the chains, and there doesn't appear to be any real overlap in the posteriors. I've now gotten the model to converge. However, although there is no overlap in posteriors, the posterior on the mixing_prop parameter is basically uniform (even when I set quite informative priors that differ for the two mixtures). Is this an indication that a two-mixture model is likely a poor representation of the data, and that a one-mixture (i.e., a standard hierarchical) model would be better? – Tim Apr 23 '18 at 05:44
  • Is there just supposed to be one mixture component responsible for each of the four outcomes? If so, you want to put them all inside the `log_sum_exp`. Then there's only one `target +=`. – Bob Carpenter Apr 24 '18 at 14:57
  • Yep, there is only supposed to be one mixture component for the four outcomes. I've updated my code in the initial post with my modifications based on your suggestion. As can be seen, I've specified a four-element-long vector `increments`, and assigned the result of each `log_mix()` argument into each element of the vector. Then I've incremented the lp based on the result of `log_sum_exp(increments)`. Is this the best way to do it? It seems to make my code run extremely slow, and at least one chain seems to always get stuck. – Tim Apr 29 '18 at 11:22
  • If you look at traceplots, what's getting stuck? Are there divergent transitions at points where things get stuck? – Bob Carpenter May 01 '18 at 01:02

0 Answers