I found that function varIdent
has an undesired behavior which is dependent on data order. In detail, the reference category depends on data order, see the next example
set.seed(123)
library(tidyverse)
library(nlme)
#>
#> Attaching package: 'nlme'
#> The following object is masked from 'package:dplyr':
#>
#> collapse
n_smpl<-10
z0<-data.frame(sex=sample(c("F","M"),n_smpl,prob = c(0.6,0.4),replace = T) %>% factor)
summary(z0)
#> sex
#> F:6
#> M:4
X<-model.matrix(~sex, z0)
sd_e<-1.3
e<-rnorm(nrow(z0),0,sd_e)
my_bts<-c(10,1.4)
z0$y<-(X %*% my_bts+e)%>% drop
m0<-gls(model=y~sex,weights=varIdent(form=~1|sex),data=z0[order(z0$sex),])
m1<-gls(model=y~sex,weights=varIdent(form=~1|sex),data=z0[order(z0$sex,decreasing =T),])
summary(m0)
#> Generalized least squares fit by REML
#> Model: y ~ sex
#> Data: z0[order(z0$sex), ]
#> AIC BIC logLik
#> 35.82722 36.14498 -13.91361
#>
#> Variance function:
#> Structure: Different standard deviations per stratum
#> Formula: ~1 | sex
#> Parameter estimates:
#> F M
#> 1.0000000 0.5305269
#>
#> Coefficients:
#> Value Std.Error t-value p-value
#> (Intercept) 10.344227 0.5847687 17.689434 0.0000
#> sexM 0.967754 0.6973690 1.387721 0.2026
#>
#> Correlation:
#> (Intr)
#> sexM -0.839
#>
#> Standardized residuals:
#> Min Q1 Med Q3 Max
#> -1.38845834 -0.72023259 -0.02681155 0.85333098 1.31623645
#>
#> Residual standard error: 1.432385
#> Degrees of freedom: 10 total; 8 residual
summary(m1)
#> Generalized least squares fit by REML
#> Model: y ~ sex
#> Data: z0[order(z0$sex, decreasing = T), ]
#> AIC BIC logLik
#> 35.82722 36.14498 -13.91361
#>
#> Variance function:
#> Structure: Different standard deviations per stratum
#> Formula: ~1 | sex
#> Parameter estimates:
#> M F
#> 1.000000 1.884919
#>
#> Coefficients:
#> Value Std.Error t-value p-value
#> (Intercept) 10.344227 0.5847687 17.689434 0.0000
#> sexM 0.967754 0.6973690 1.387721 0.2026
#>
#> Correlation:
#> (Intr)
#> sexM -0.839
#>
#> Standardized residuals:
#> Min Q1 Med Q3 Max
#> -1.38845834 -0.72023259 -0.02681155 0.85333098 1.31623645
#>
#> Residual standard error: 0.7599187
#> Degrees of freedom: 10 total; 8 residual
Created on 2023-04-04 by the reprex package (v2.0.1)
Look at Variance function
outputs for m0
and m1
models, the reference category is different in each model. This does not happens in the mean structure component of the model.