I have two different versions of a state-space model in JAGS. Both yield equivalent inferences, but they differ wildly in run time and convergence behavior. The only difference between them is how the process error is expressed. This construction appears a few times in the model, but I'll include just one line for simplicity:
## (1) residual parameterization
rate[i] <- rate[i-1] + eps[i]
eps[i] ~ dnorm(0, tau_eps)
vs.
## (2) mean parameterization
rate[i] ~ dnorm(rate[i-1], tau_eps)
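For context, here is a stripped-down sketch of the kind of model I mean, with the process-error line written both ways. The data (N, y), the observation precision tau_obs, and the priors are just placeholders, not my actual model; everything other than the process-error line is identical between the two versions.

## minimal random-walk state-space sketch (placeholder names, not my real model)
model {
  rate[1] ~ dnorm(0, 0.001)              ## vague prior on the initial state
  for (i in 2:N) {
    ## version (1): residual parameterization
    rate[i] <- rate[i-1] + eps[i]
    eps[i] ~ dnorm(0, tau_eps)
    ## version (2): mean parameterization -- replaces the two lines above
    ## rate[i] ~ dnorm(rate[i-1], tau_eps)
  }
  for (i in 1:N) {
    y[i] ~ dnorm(rate[i], tau_obs)        ## observation model
  }
  tau_eps ~ dgamma(0.001, 0.001)          ## vague priors on the precisions
  tau_obs ~ dgamma(0.001, 0.001)
}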
Version 2 of the model runs much faster per iteration (about one tenth of the time!) but is much slower to converge: it takes roughly ten times as many iterations to do so, so the two models end up needing about the same total run time. I also notice that version 1 seems much more stable with respect to the latent states (rate[i], etc.) at each time step, whereas version 2 seems much more stable with respect to the model hyperparameters (tau_eps, etc.) when compared at similar run times (20k iterations for version 1 and 200k iterations for version 2). I should note that I'm basing my assessment of convergence/stability mostly on trace plots, but those assessments are corroborated by the Rhat and n.eff values.
What's going on? In my mind, the two statements above are mathematically equivalent (and indeed, both give equivalent inferences), so why should their behavior in JAGS be so different? Are the two statements less equivalent than I thought?