I am running MCMCglmm regressions to test for an interaction between two traits whilst accounting for phylogenetic influence. Both of my fixed effects are discrete (they are binary 0 or 1), here is a sample of my data:
Genus CP Stripes
genus1 1 0 genus2 1 0
genus3 0 0 genus4 0 0 genus5 0 0 genus6 0 1 genus7 0 0 genus8 0 1 genus9 0 1 genus10 1 0 genus11 1 0
genus12 1 1 genus13 0 1 genus14 1 0 genus15 1 0 genus16 0 1
The issue is that when I run this model:
CP_stripes <- MCMCglmm(CP ~ Stripes,random=~Genus,family=c("categorical"),
ginverse=list(Genus=inv.phylo$Ainv),prior=prior,data=excel,nitt=5100000,burnin=100000,thin=500)
I get an output saying that the interaction between CP and Stripes is significant:
DIC: 27.77905
G-structure: ~Genus
post.mean l-95% CI u-95% CI eff.samp
Genus 3829 70.7 9986 9.833
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 1 1 1 0
Location effects: CP ~ Stripes
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) -14.607 -60.224 26.799 973.05 0.4414
Stripes1 -13.823 -32.246 1.334 87.95 0.0366 *
However if I run the exact same model again I get different outputs each time, often where the significance of the interaction changes:
DIC: 10.88278
G-structure: ~Genus
post.mean l-95% CI u-95% CI eff.samp
Genus 25244 121 43680 9.437
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 1 1 1 0
Location effects: CP ~ Stripes
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) -41.728 -150.449 69.152 796.4 0.4310
Stripes1 -32.583 -75.742 3.186 92.6 0.0752 .
DIC: 22.78624
G-structure: ~Genus
post.mean l-95% CI u-95% CI eff.samp
Genus 9801 27.2 32166 2.09
R-structure: ~units
post.mean l-95% CI u-95% CI eff.samp
units 1 1 1 0
Location effects: CP ~ Stripes
post.mean l-95% CI u-95% CI eff.samp pMCMC
(Intercept) -21.89 -102.78 45.25 361.5 0.4502
Stripes1 -23.06 -63.94 1.32 26.1 0.0332 *
I'm really concerned that the models are changing this much on repeated runnings, and I'm wondering if I can trust any of my other models (which all also use a binary dependent and independent variable). Can anyone tell me why I'm getting such different outputs each time please? And is there anything I can do to ensure that when I run a model, I can trust this output will stay the same if the model is run again?
Here is one of the plots to go with the first output above, as you can see the chain has not converged very well:
I have tried messing around with the priors, but I fundamentally do not understand them so it's difficult to know if what I'm doing is 'fixing' anything. I've tried to read as much of the documentation around this as I can but am finding it really difficult to get head around a lot of the language used.
This is the prior I am currently using:
prior<-list(G=list(G1=list(V=1,nu=1,alpha.mu=0,alpha.V=1000)),R=list(V=1,nu=0.002, fix=1))
Are there any obvious problems with it?