Stan (pystan) creating a vector of ints to assign to a categorical distribution

Question

This is my first time using Stan. I am starting with an example from a Coursera class that uses JAGS, which I am trying to reimplement in Stan. The class is free to audit and I have linked the relevant lecture.

Background on the model:

(I don't see how to use LaTeX so I'll try to format this as well as possible):
The model is a mixture model where the input is a vector of real numbers (y in the code below), each belonging to one of 2 potential classes. They assume that these classes are Z_1 ~ N(mu_1, stdev) and Z_2 ~ N(mu_2, stdev) (it is a mixture of 2 Normals, with different mu's but the same stdev). They assume a uniform Dirichlet prior with alpha=1 on the probability of being in each class.

Each data point is drawn from a categorical distribution based on the probability vector and they increment the likelihood with the corresponding Normal likelihood.

My code

model_code_coursera = """
data {
    int<lower=1> N;         //Num Samples
    int<lower=1> K;         //Num classes
    real y[N];              //Input which we are trying to assign to each class
} parameters {
    simplex[K] probs;       //Prob of being in each class
    vector[K] mu;           //Center of each class
    real<lower=0> prec;
    vector<lower=0, upper=K>[N] z; //Array of classes ****(option 1)****
    int<lower=0, upper=K>[N] z;  // Array of classes ****(option 2)****
} model {
    probs ~ dirichlet(rep_vector(1,K));
    prec ~ gamma(1/2, 2/2);
    for (k in 1:K)
        mu[k] ~ normal(-1+2*(k-1), 1/100);
    for (n in 1:N) {
        z[n] ~ categorical(probs);
        y[n] ~ normal(mu[z[n]], prec);
    }
}
"""

When I use option 1 I get the error,

No matches for: 
  real ~ categorical(vector)
Available argument signatures for categorical:
  int ~ categorical(vector)

which makes sense as the output of a categorical is an int.

However, when I do what seems natural and define z to be an array of ints, as I see in Stan cheat sheets, I get a different error (option 2):

    10:     int<lower=0, upper=K>[N] z;  // Array of classes
                                ^
    11: } model {
  -------------------------------------------------

PARSER EXPECTED: <identifier>

score -1 · Answer 1 · answered May 05 '20 at 16:38

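Stan does not directly support integer parameters. The parser error in option 2 comes from the declaration syntax (in this version of Stan an integer array is written int z[N], with the size after the name), but fixing the syntax would not help, because integer declarations are not allowed in the parameters block. To implement a normal mixture model of the variety described in the question, the discrete parameters must be marginalized out. There's an explanation of how to do this in the Stan user's guide chapter on mixture models. The upside is that mixing is much better without discrete parameters and inferences in expectation are more precise.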
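
As a rough sketch of what marginalizing looks like here (not a drop-in translation of the course's JAGS code), the model below drops z and adds, for each data point, the log of the sum over classes of probs[k] * Normal(y[n] | mu[k], sigma). The name model_code_marginalized, the sigma transform, and the Python helper functions are my own additions for illustration. I am also assuming that prec and the 1/100 in the question were meant as JAGS-style precisions; Stan's normal takes a standard deviation, and 1/2 or 1/100 written with integers is integer division in Stan and evaluates to 0.

```python
import math

# Marginalized Stan program, kept as a string in the same style as the question.
# Uses the older array syntax (real y[N]) to match the question's Stan version;
# recent Stan releases write this as array[N] real y.
model_code_marginalized = """
data {
    int<lower=1> N;         // Num samples
    int<lower=1> K;         // Num classes
    real y[N];              // Observations
} parameters {
    simplex[K] probs;       // Prob of being in each class
    vector[K] mu;           // Center of each class (ordered[K] may help with label switching)
    real<lower=0> prec;     // Shared precision (assumed JAGS-style)
} transformed parameters {
    real<lower=0> sigma = 1 / sqrt(prec);   // Stan's normal wants a standard deviation
} model {
    probs ~ dirichlet(rep_vector(1, K));
    prec ~ gamma(0.5, 1.0);                 // real literals avoid integer division
    for (k in 1:K)
        mu[k] ~ normal(-1 + 2 * (k - 1), 10);   // assumed: precision 1/100 means sd 10
    for (n in 1:N) {
        vector[K] lp = log(probs);          // log prior weight of each class
        for (k in 1:K)
            lp[k] += normal_lpdf(y[n] | mu[k], sigma);
        target += log_sum_exp(lp);          // sum over classes, so no z is needed
    }
}
"""

# Plain-Python mirror of the per-observation computation, for checking by hand.
def normal_lpdf(y, mu, sigma):
    """Log density of Normal(mu, sigma) at y."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((y - mu) / sigma) ** 2

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def class_log_terms(y_n, probs, mu, sigma):
    """log(probs[k]) + log Normal(y_n | mu[k], sigma) for each class k."""
    return [math.log(p) + normal_lpdf(y_n, m, sigma) for p, m in zip(probs, mu)]

def marginal_loglik(y, probs, mu, sigma):
    """Log likelihood with the class indicators summed out."""
    return sum(log_sum_exp(class_log_terms(y_n, probs, mu, sigma)) for y_n in y)

def responsibilities(y_n, probs, mu, sigma):
    """Probability of each class for y_n, given fixed parameter values."""
    lp = class_log_terms(y_n, probs, mu, sigma)
    total = log_sum_exp(lp)
    return [math.exp(v - total) for v in lp]
```

The class memberships are not lost by doing this: for any posterior draw of probs, mu and sigma, the normalized per-class terms (responsibilities above) give the probability that a point belongs to each class, and the same quantity can be computed inside Stan in a generated quantities block.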
