
I'm very new to this topic/posting on a discussion board, so I apologize in advance if something is unclear.

I'm interested in performing stochastic search variable selection (SSVS) in JAGS. I've seen code online where people perform SSVS (e.g. http://www4.stat.ncsu.edu/~reich/ABA/code/SSVS, which I've copied below), but my understanding is that to perform this method I need to use a spike-and-slab prior in JAGS. The spike can be either a point mass or a distribution with a very narrow variance. In most of the code I've looked at, only one continuous distribution is defined (in the code below, a normal prior on gamma, with beta = gamma * delta), so I believe they're assuming a point mass for the spike.

So my questions are:

1) Can someone explain why the code below is using the SSVS method? For example, how do we know it isn't using GVS, another method that uses a Gibbs sampler?

2) Is this a point mass on the spike?

3) If I wanted to use simulated data to test whether the Gibbs sampler is correctly drawing from the spike/slab, how would I go about doing that? Would I code for a spike and a slab and what would I be looking for in the posterior to see that it is drawing correctly?

model_string <- "model{
  # Likelihood
  for(i in 1:n){
    Y[i] ~ dpois(lambda[i])
    log(lambda[i]) <- log(N[i]) + alpha + inprod(beta[], X[i,])
  }
  # Priors
  for(j in 1:p){
    gamma[j] ~ dnorm(0, tau)
    delta[j] ~ dbern(prob)
    beta[j] <- gamma[j]*delta[j]
  }
  prob ~ dunif(0, 1)
  tau ~ dgamma(0.1, 0.1)
  alpha ~ dnorm(0, 0.1)
}"

I've also asked on the JAGS help page too: https://sourceforge.net/p/mcmc-jags/discussion/610036/thread/a44343e0/#ab47

Tammy

1 Answer


I am also (trying to) work on some Bayesian variable selection stuff in JAGS. I am by no means an expert on this topic, but maybe if we chat about this more we can learn together. Here is my interpretation of the variable selection within this code:

model_string <- "model{
  # Likelihood
  for(i in 1:n){
    Y[i] ~ dpois(lambda[i])
    log(lambda[i]) <- log(N[i]) + alpha + inprod(beta[], X[i,])
  }

  # Priors
  for(j in 1:p){
    gamma[j] ~ dnorm(0, tau)
    delta[j] ~ dbern(prob)       # delta has a Bernoulli prior, so it can only be 1 (included) or 0 (not included)
    beta[j] <- gamma[j]*delta[j] # delta is the inclusion indicator: beta is zeroed out whenever delta = 0
  }
  prob ~ dunif(0, 1)             # uninformative prior on the probability of a variable being included in the model
  tau ~ dgamma(0.1, 0.1)
  alpha ~ dnorm(0, 0.1)
}"

I have added comments to the variable-selection sections of the model above. The code looks really similar to the Kuo & Mallick method of Bayesian variable selection. I am currently having trouble tuning the spike-and-slab method so the estimates mix properly instead of "getting stuck" on either 0 or 1.
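To see concretely why the gamma*delta construction gives an exact point mass at zero, here is a quick prior-predictive sketch (plain numpy rather than JAGS, with hypothetical values for prob and tau):

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws = 100_000
prob = 0.5  # hypothetical inclusion probability
tau = 1.0   # hypothetical slab precision (JAGS's dnorm is parameterised by precision)

# Kuo & Mallick construction: beta = gamma * delta
gamma = rng.normal(0.0, 1.0 / np.sqrt(tau), size=n_draws)  # continuous slab draws
delta = rng.binomial(1, prob, size=n_draws)                # 0/1 inclusion indicator
beta = gamma * delta

# Whenever delta = 0, beta is *exactly* zero: a true point-mass spike
print(np.mean(beta == 0.0))  # close to 1 - prob, i.e. about 0.5
```

Under a continuous spike (a second, very narrow normal) none of the prior draws would be exactly zero, which is one way to tell the two constructions apart.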

So my priors are set up more like:

 beta ~ dnorm(0, tau)
 tau <- (100*(1-gamma)) + (0.001*gamma)  # precision switches between a narrow spike (100) and a wide slab (0.001)
 gamma ~ dbern(0.5)                      # gamma is the inclusion indicator
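Note that in this parameterisation there is no point mass: beta is a mixture of two normals. Since JAGS's dnorm takes a precision, tau = 100 is the narrow spike (sd 0.1) and tau = 0.001 is the wide slab (sd ≈ 31.6). A numpy sketch of the same prior (hypothetical settings, not JAGS itself):

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 100_000

gamma = rng.binomial(1, 0.5, size=n_draws)  # 1 = included (slab), 0 = excluded (spike)
tau = 100.0 * (1 - gamma) + 0.001 * gamma   # precision: 100 -> sd 0.1, 0.001 -> sd ~31.6
beta = rng.normal(0.0, 1.0 / np.sqrt(tau))  # continuous spike-and-slab mixture

# Unlike the point-mass version, essentially no draw is exactly zero;
# the spike draws are just tightly clustered around it
print(np.mean(beta == 0.0))
print(np.mean(np.abs(beta) < 0.3))  # roughly half the draws sit in the narrow spike
```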

I have found this paper helps to explain the differences between different variable selection methods (It gets into GVS vs SSVS):

O’Hara, R.B. & Sillanpää, M.J. (2009). A review of Bayesian variable selection methods: What, how and which. Bayesian Anal., 4, 85–118

Or this blog post: https://darrenjw.wordpress.com/2012/11/20/getting-started-with-bayesian-variable-selection-using-jags-and-rjags/

If there was no SSVS on the beta prior, the prior would look more like this:

# Priors
for(j in 1:p){
  beta[j] ~ dnorm(0, 0.01) # just a normally distributed prior on beta (or whatever distribution suits your model)
}
tau ~ dgamma(0.1, 0.1)
alpha ~ dnorm(0, 0.1)
CarlaBirdy
  • I would love to chat with you more on this--I'll reach out when I'm back from traveling! Thanks – Tammy Apr 16 '18 at 16:48
  • @CarlyBirdy, Thanks for all of your help. 1) Here's the link to the code: [link](https://www4.stat.ncsu.edu/~reich/ABA/code/SSVS). 2) I'm a little unsure of some of the lines in your code `beta~ dnorm(0,tau) tau <-(100*(1-gamma))+(0.001*(gamma))`. Correct me if I'm wrong, but I was under the impression that the indicator shouldn't be in the sd of dnorm(mean, sd) as you have it. I think indicators usually have their own probability distribution vs. mixing the probability distribution of betas + indicators together. – Tammy Apr 30 '18 at 16:52