I've found this same question posted everywhere, and I can't seem to get any of the solutions to fit to my data, and I'm wondering if I'm trying to fit my data to a model that's just too complex.
I'm trying to fit my data to a multinomial logistic regression model from the MCMCglmm package. I've looked at many different documents, tutorials, and the MCMCglmm manual itself, primarily Florian Jaeger's tutorial which is excellent and thorough. I get lost, however, in his selection of the values for both the G-structure and R-structure of the prior, and I keep getting this error message
Error in priorformat(if (NOpriorG) { :
V is the wrong dimension for some prior$G/prior$R elements
In particular, I'm not sure what the n
values should be given my data in both, but the somewhat opaque error message is suggesting that it's an issue with V
Here's a (relevant) subset of my data:
CG_imm locuteur enquete loc_age loc_sexe left liquid right articulation_C1 voice_C1 NC_C1 NC_right voice_right right2 pos logfreq realization
abordable 44ajs1 Nantes 79 M bl l p stop V Vstop stop NV NVstop adj NA 2
admettre 91adb1 Brunoy 54 M tR R E stop N NVstop mid-vowel V mid-vowel verb 6.52209279 0
adorable 91aal1 Brunoy 27 F bl l break stop V Vstop break break weak break adj NA 0
agréable 92aaf2 PC 55 F bl l break stop V Vstop break break strong break adj 7.95191138 0
agréable 21abm1 Dijon 31 M bl l k stop V Vstop stop NV NVstop adj 7.95191138 1
agréable 75ccr2 Paris NA F bl l break stop V Vstop break break break adj 7.95191138 0
agréable 69ajl1 Lyon 52 M bl l break stop V Vstop break break weak break adj 7.95191138 0
Alexandre 91asl1 Brunoy 64 F dR R break stop V Vstop break break weak break noun NA 0
From this dataset, I'm trying to predict realization
a variable at three-levels with many different predictors. Here's one of the models I've attempted:
k <- length(levels(df$realization))
I <- diag(k-1)
J <- matrix(rep(1, (k-1)^2), c(k-1, k-1))
prior1<-list(
R = list(fix=1, V=0.5 * (I + J), n = 2
),
G = list(
G1 = list(V = diag(4), n = 4),
G2 = list(V = diag(8), n = 8),
G3 = list(V = diag(4), n = 4),
G4 = list(V = diag(4), n = 4),
G5 = list(V = diag(14), n = 14),
G6 = list(V = 3, n = 3),
G7 = list(V = 3, n = 3),
G8 = list(V = diag(6), n = 6)))
m <- MCMCglmm(realization ~ -1 + trait + NC_C1*liquid + right2 + loc_age + logfreq + pos,
random = ~ us(trait):CG_imm + us(NC_C1*liquid):CG_imm +
us(trait):locuteur + us(trait):enquete + us(right2):CG_imm +
us(loc_age):locuteur + us(log_freq):CG_imm + us(pos):CG_imm,
rcov = ~ us(trait):units,
prior = prior1,
burnin = 15000,
nitt = 40000,
family = "categorical",
data = df)
In my full dataset, the predictor variables I've selected as random effects have the following levels:
> length(levels(ol_north$CG_imm))
[1] 181
> length(levels(ol_north$enquete))
[1] 13
> length(levels(ol_north$locuteur))
[1] 129
All of the predictors that are fixed effects are categorical except for loc_age
and log_freq
The categorical, fixed predictors are at the following levels:
> length(levels(ol_north$NC_C1))
[1] 4
> length(levels(ol_north$liquid))
[1] 2
> length(levels(ol_north$right2))
[1] 14
> length(levels(ol_north$pos))
[1] 6
I've toyed with the n
values in both the G-structure lists and R-structure, and I've adjusted the values in the diag()
argument in the G-structure to no avail. With such a complex model, I'm not sure where exactly my errors are occurring. I've simplified the model to this, and do get it to converge but with a warning:
m <- MCMCglmm(realization ~ -1 + trait + NC_C1*liquid + right2,
random = ~ us(trait):CG_imm + us(NC_C1*liquid):CG_imm,
rcov = ~ us(trait):units,
prior = list(
R = list(fix=1, V=0.5 * (I + J), n = 2),
G = list(
G1 = list(V = diag(2), n = 22),
G2 = list(V = diag(8), n = 8))),
burnin = 15000,
nitt = 40000,
family = "categorical",
data = df)
Warning message:
In MCMCglmm(realization ~ -1 + trait + NC_C1 * liquid + right2, :
some fixed effects are not estimable and have been removed. Use singular.ok=TRUE to sample these effects, but use an informative prior!
Thank you so much for any help in advance!