
In R, I am running Bayesian MCMC inference for data from a mixture of Gamma distributions, using JAGS. The model file gmd.bug is as follows:

model {
for (i in 1:N) {
y[i] ~ dsum(p*one, (1-p)*two)
}
one ~ dgamma(alpha1, beta1)
two ~ dgamma(alpha2, beta2)
alpha1 ~ dunif(0, 10)
beta1 ~ dunif(0, 10)
alpha2 ~ dunif(0, 10)
beta2 ~ dunif(0, 10)
p ~ dunif(0, 1)
}

Then, this is the inference phase:

gmd.jags = jags.model("gmd.bug",
data = list(N = NROW(a), y=unlist(a)),
inits = inits, n.chains = 3, n.adapt = 1000)

Here is the error that puzzled me:

Error in jags.model("gmd.bug", data = list(N = NROW(a), y = unlist(a)),  : 
Error in node y[1]
Node inconsistent with parents

Anyone know what needs to be fixed here?

lilkaskitc

1 Answer


Answer to OP's original question

When you write y[i] ~ dsum(p*dgamma(alpha1, beta1), (1-p)*dgamma(alpha2, beta2)), dgamma(alpha1, beta1) and dgamma(alpha2, beta2) need to be indexed by [i], as in

gamma1[i] ~ dgamma(alpha1, beta1)
gamma2[i] ~ dgamma(alpha2, beta2)

Answer to OP's second question (after edits)

This is the crux of your problem. But fixing it raises additional difficulties, because in order to ensure that y[i] is consistent with its parents at initialization, you need to make sure that at initialization it is strictly true that y[i] == p*gamma1[i] + (1-p)*gamma2[i]. If you let JAGS handle the initialization automatically, it will initialize from the priors, without understanding the constraint on initial values imposed by dsum, and you will get the same error. One strategy is to initialize both gamma1 and gamma2 at y, which satisfies the constraint for any p, since p*y[i] + (1-p)*y[i] == y[i].

The following code works for me (but of course you'll want to run many more iterations):

# Data simulation:
library(rjags)
N=200
alpha1 <- 3
beta1 <- 3
alpha2 <- 5
beta2 <- 1
p <- .7

y <- vector(mode="numeric", length=N)
for(i in 1:N){
  y[i] <- p*rgamma(1,alpha1,beta1) + (1-p)*rgamma(1,alpha2,beta2)
}

# JAGS model
sink("mymodel.txt")
cat("model{
    for (i in 1:N) {
      gamma1[i] ~ dgamma(alpha1, beta1)
      gamma2[i] ~ dgamma(alpha2, beta2)
      pg1[i] <- p*gamma1[i]
      pg2[i] <- (1-p)*gamma2[i]
      y[i] ~ dsum(pg1[i], pg2[i])
    }
    alpha1 ~ dunif(0, 10)
    beta1 ~ dunif(0, 10)
    alpha2 ~ dunif(0, 10)
    beta2 ~ dunif(0, 10)
    p ~ dunif(0, 1)
    }", fill=TRUE)
sink()
jags.data <- list(N=N, y=y)

inits <- function(){list(gamma1=y, gamma2=y)}

params <- c("alpha1", "beta1", "alpha2", "beta2", "p")

nc <- 5
n.adapt <- 200
n.burn <- 200
n.iter <- 1000
thin <- 10
mymodel <- jags.model('mymodel.txt', data = jags.data, inits=inits, n.chains=nc, n.adapt=n.adapt)
update(mymodel, n.burn)
mymodel_samples <- coda.samples(mymodel,params,n.iter=n.iter, thin=thin)
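Not part of the original answer, but a quick way to check whether the chains agree (relevant to the convergence issue discussed in the comments below) is to run the standard coda diagnostics on the samples (rjags loads coda automatically); a minimal sketch, assuming the mymodel_samples object from above:

# Posterior summaries pooled across chains
summary(mymodel_samples)

# Gelman-Rubin diagnostic: potential scale reduction factors near 1
# suggest the chains agree on each parameter
gelman.diag(mymodel_samples)

# Effective sample size for each monitored parameter
effectiveSize(mymodel_samples)

# Traceplots and densities to inspect mixing visually
plot(mymodel_samples)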
Jacob Socolar
  • I just found out that there is an issue: gamma1 and gamma2 are almost identical in the posterior, and the estimate of p is not converging. I have tried several scenarios and this seems to be a recurring problem. – lilkaskitc Dec 06 '15 at 18:18
  • It makes sense that gamma1 and gamma2 are identical if p cannot converge (because if the model can't distinguish between `p=P` and `p=1-P` then it obviously won't be able to learn that gamma1 and gamma2 are different). There are two possible sources of the problem. One is that the mixing is poor, in which case you could try running the model for much longer and see if p eventually converges. The other is that you don't have enough data to estimate the model, in which case you could try simulating a much larger dataset (more elements in y) and see if the model is estimable with more data. – Jacob Socolar Dec 06 '15 at 18:23
  • Thank you. But y here is a dataset given to me (200 obs, not enough), and I ran 10000 iterations. It is pretty funny to see the five MCMC chains each suggesting a different p. Anyway, I have learnt a lot from you. – lilkaskitc Dec 06 '15 at 19:54
  • Ah! If all chains are suggesting different p's, then this indicates that the problem is with the mixing, not the estimability of the model! Try running for (much) longer, or alternatively you could look into fitting the model in Stan instead of JAGS. Good luck! – Jacob Socolar Dec 06 '15 at 21:18
  • Given some additional details on your model, it may be possible to choose initial values that actually give it a good start (this is fine, since the final MCMC results should be independent of wherever the chains were initialized). – Gmichael Jul 22 '21 at 11:34
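A minimal sketch of the initial-values idea raised in the last comment (the names start_p, inits_list and mymodel2 are illustrative, not from the original thread): because gamma1 and gamma2 are initialized at y, the dsum constraint holds for any p, so each chain can be started from a different p to make disagreement between chains easy to spot.

# Hypothetical per-chain starting values; gamma1 = gamma2 = y keeps
# y[i] == p*gamma1[i] + (1-p)*gamma2[i] true regardless of p
start_p <- c(0.1, 0.3, 0.5, 0.7, 0.9)
inits_list <- lapply(start_p, function(pp) list(gamma1 = y, gamma2 = y, p = pp))

mymodel2 <- jags.model('mymodel.txt', data = jags.data, inits = inits_list,
                       n.chains = length(start_p), n.adapt = n.adapt)
update(mymodel2, n.burn)
mymodel2_samples <- coda.samples(mymodel2, params, n.iter = n.iter, thin = thin)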