0

I am running a multinomial analysis with vglm(). It all works, but then I try to follow the instructions from the following website (https://rcompanion.org/handbook/H_08.html) to do a pairwise test, because emmeans cannot handle pairwise for vglm models. The lrtest() part gives me the following error: Error in lrtest.default(model) : 'list' object cannot be coerced to type 'double'

I cannot figure out what is wrong, I even copy and pasted the exact code that the website used (see below) and get the same error with their own code and dataset. Any ideas?

Their code and suggestion for doing pairwise testing with vglm() is the only pairwise testing option I found for vglm() anywhere on the web.

Here is the code along with all the expected output and extra details from their website (it is simpler than mine but gets same error anyways).

Input = ("
County       Sex     Result  Count
Bloom        Female  Pass     9
Bloom        Female  Fail     5
Bloom        Male    Pass     7
Bloom        Male    Fail    17
Cobblestone  Female  Pass    11
Cobblestone  Female  Fail     4
Cobblestone  Male    Pass     9
Cobblestone  Male    Fail    21
Dougal       Female  Pass     9
Dougal       Female  Fail     7
Dougal       Male    Pass    19
Dougal       Male    Fail     9
Heimlich     Female  Pass    15
Heimlich     Female  Fail     8
Heimlich     Male    Pass    14
Heimlich     Male    Fail    17
")

Data = read.table(textConnection(Input),header=TRUE)


### Order factors otherwise R will alphabetize them

Data$County = factor(Data$County,
                     levels=unique(Data$County))

Data$Sex    = factor(Data$Sex,
                     levels=unique(Data$Sex))

Data$Result = factor(Data$Result,
                     levels=unique(Data$Result))


###  Check the data frame

library(psych)

headTail(Data)

str(Data)

summary(Data)


### Remove unnecessary objects

rm(Input)


Multinomial regression


library(VGAM)

model = vglm(Result ~ Sex + County + Sex:County,
             family=multinomial(refLevel=1),
             weights = Count,
             data = Data)


summary(model)


library(car)

Anova(model,
      type="II",
      test="Chisq")```

Analysis of Deviance Table (Type II tests)

Response: Result Df Chisq Pr(>Chisq)
Sex 1 6.7132 0.00957 ** County 3 4.1947 0.24120
Sex:County 3 7.1376 0.06764 .

library(rcompanion)

nagelkerke(model)

$Pseudo.R.squared.for.model.vs.null Pseudo.R.squared McFadden 0.0797857 Cox and Snell (ML) 0.7136520 Nagelkerke (Cragg and Uhler) 0.7136520

$Likelihood.ratio.test Df.diff LogLik.diff Chisq p.value 7 -10.004 20.009 0.0055508

library(lmtest)

lrtest(model)

Likelihood ratio test

Model 1: Result ~ Sex + County + Sex:County Model 2: Result ~ 1

#Df LogLik Df Chisq Pr(>Chisq)
1 8 -115.39
2 15 -125.39 7 20.009 0.005551 **

Post-hoc analysis

At the time of writing, the lsmeans package cannot be used with vglm models.

One option for post-hoc analysis would be to conduct analyses on reduced models, including only two levels of a factor. For example, if the variable County x Sex term had been significant, the following code could be used to create a reduced dataset with only Bloom–Female and Bloom–Male, and analyze this data with vglm.

Data.b        = Data[Data$County=="Bloom" &
                     (Data$Sex=="Female"| Data$Sex=="Male") , ]

Data.b$County = factor(Data.b$County)
Data.b$Sex    = factor(Data.b$Sex)


summary(Data.b)

County Sex Result Count
Bloom:4 Female:2 Pass:2 Min. : 5.0 Male :2 Fail:2 1st Qu.: 6.5 Median : 8.0 Mean : 9.5 3rd Qu.:11.0 Max. :17.0

library(VGAM)

model.b = vglm(Result ~ Sex,
                family=multinomial(refLevel=1),
                weights = Count,
                data = Data.b)

lrtest(model.b)

Likelihood ratio test

#Df LogLik Df Chisq Pr(>Chisq) 1 2 -23.612
2 3 -25.864 1 4.5041 0.03381 *

Summary table of results

Comparison p-value Bloom–Female - Bloom–Male 0.034 Cobblestone–Female - Cobblestone–Male 0.0052 Dougal–Female - Dougal–Male 0.44 Heimlich–Female - Heimlich–Male 0.14

p.value = c(0.034, 0.0052, 0.44, 0.14)

p.adj = p.adjust(p.value,
                 method = "fdr")


p.adj = signif(p.adj,
               2)

p.adj

[1] 0.068 0.021 0.440 0.190

Comparison p-value p.adj Bloom–Female - Bloom–Male 0.034 0.068 Cobblestone–Female - Cobblestone–Male 0.0052 0.021 Dougal–Female - Dougal–Male 0.44 0.44 Heimlich–Female - Heimlich–Male 0.14 0.19

catcool7
  • 71
  • 1
  • 7

2 Answers2

1

It looks to me like qdrq() can be used. As I commented, you can't use the lazy interface, you have to give all the specific needed parameters:

> library(emmeans)
> RG = qdrg(formula(model), Data, coef(model), vcov(model), link = "log")
> RG
'emmGrid' object with variables:
    Sex = Female, Male
    County = Bloom, Cobblestone, Dougal, Heimlich
Transformation: “log” 

> emmeans(RG, consec ~ Sex | County)
$emmeans
County = Bloom:
 Sex    emmean    SE  df asymp.LCL asymp.UCL
 Female -0.588 0.558 Inf  -1.68100    0.5054
 Male    0.887 0.449 Inf   0.00711    1.7675

County = Cobblestone:
 Sex    emmean    SE  df asymp.LCL asymp.UCL
 Female -1.012 0.584 Inf  -2.15597    0.1328
 Male    0.847 0.398 Inf   0.06643    1.6282

County = Dougal:
 Sex    emmean    SE  df asymp.LCL asymp.UCL
 Female -0.251 0.504 Inf  -1.23904    0.7364
 Male   -0.747 0.405 Inf  -1.54032    0.0459

County = Heimlich:
 Sex    emmean    SE  df asymp.LCL asymp.UCL
 Female -0.629 0.438 Inf  -1.48668    0.2295
 Male    0.194 0.361 Inf  -0.51320    0.9015

Results are given on the log (not the response) scale. 
Confidence level used: 0.95 

$contrasts
County = Bloom:
 contrast      estimate    SE  df z.ratio p.value
 Male - Female    1.475 0.716 Inf   2.060  0.0394

County = Cobblestone:
 contrast      estimate    SE  df z.ratio p.value
 Male - Female    1.859 0.707 Inf   2.630  0.0085

County = Dougal:
 contrast      estimate    SE  df z.ratio p.value
 Male - Female   -0.496 0.646 Inf  -0.767  0.4429

County = Heimlich:
 contrast      estimate    SE  df z.ratio p.value
 Male - Female    0.823 0.567 Inf   1.450  0.1470

Results are given on the log (not the response) scale. 

If I understand this model correctly, the response is the log of the ratio of the 2nd multinomial response to the 1st. So what we see above is estimated differences of logs and setimated differences of those differences. If run with type = "response" you would get estimated ratios, and ratios of those ratios.

Russ Lenth
  • 5,922
  • 2
  • 13
  • 21
  • Thanks Russ, now I have another issue. Your emmeans summary displays LCL and UCL information, as well as z.ratio and p.value. However mine displays HPD information and no p.value. I tried different ways of adjusting the code but have gotten no where. Any ideas? I put my output as comment below due to character limit – catcool7 Apr 29 '22 at 06:07
  • Here is part of my output I am getting but I cannot post it all due to character limit: $emmeans County = Bloom: Sex emmean lower.HPD upper.HPD Female -5.55e-17 -0.311 0.311 Male -8.33e-17 -0.202 0.202 ........... $contrasts County = Bloom: contrast estimate lower.HPD upper.HPD Male - Female 0.00e+00 -0.513 0.513 ......................... Point estimate displayed: median Results are given on the log (not the response) scale. HPD interval probability: 0.95 – catcool7 Apr 29 '22 at 06:10
  • Hmmm, that means your model is way, way different than the one shown. If is fitted using MCMC methods, and you must have given the posterior sample info to qdrg as well. Or did you regrid it with an Nsim argument? Maybe start over with a new RG – Russ Lenth Apr 29 '22 at 12:57
  • You can add `frequentist = TRUE` and get the confidence limits, etc.; but that does not answer what happened that turned this into a Bayesian model. – Russ Lenth Apr 29 '22 at 14:03
  • Thanks for your helpful reply Russ. Ok, adding `frequentist = TRUE` did indeed revert back to LCL, zratio and pvalues. Now I am having a new issue. Up until now, I was using the example dataset I included in this question and it finally works. However, I just now tried applying it to my own data and for some reason receive the following error: `Error in ref_grid(result, ...) : Something went wrong: Non-conformable elements in reference grid.` I will post a new "answer" to show you my dataset I am using. – catcool7 May 05 '22 at 05:06
  • This could happen if you have a rank deficiency, and it isn't handled in the conventional way by `vglm`. Look at what you provided for `coef` and `vcov` and see if *all* the names agree. And look at the model summary and messages when fitting the model for any indications of singularity. It *might* help to install **emmeans** from [GitHub](https://github.com/rvlenth/emmeans) as I did do some work in `qdrg()` to make it more robust in such situations. – Russ Lenth May 07 '22 at 21:45
  • Thanks, I checked for rank deficiency and singularity but that was not the fix. I found what is triggering the error, but haven't solved how to fix the issue. For my data, I have 4 levels within the 'result' column, compared to just 2 levels for the example dataset. When I remove 2 levels in my dataset to see what happens, then it works without conformable element issues. It doesn't matter which levels I remove, or what the count of each level is if I mess around with the count. Any ideas? I haven't found a fix. – catcool7 May 12 '22 at 01:45
  • Don't use an example dataset! You should almost always use the same dataset that was used to fit the model. If not, it has to have the same factors with the same numbers of levels. If you want to exclude some levels, use an `at` argument (see `? ref_grid`) – Russ Lenth May 12 '22 at 02:10
  • If you're talking about a multivariate response, `coef` should be a matrix with that many columns. – Russ Lenth May 12 '22 at 02:16
1

Probably something changed in either the VGAM package or the lmtest package since that was written.

But the following will work for a likelihood ratio test for vglm models:

VGAM::lrtest(model)

VGAM::lrtest(model, model2)
Sal Mangiafico
  • 440
  • 3
  • 8