
I am analyzing percentage data with glmer, and I have read that the Gamma family should be suitable for this kind of data. I have checked my data and there are no values below 0, but I still get an error saying I have non-positive values.

  > summary(total_F$p.prcnt)
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
       0.00   50.00   75.00   68.56  100.00  100.00 

Here is the code I used:

F_par1 <- glmer(p.prcnt ~ b.element + distance + b.element*distance + year + sampling.round + (1 | LS1),
                family = Gamma,
                data = total_F)

What are my options? I tried to modify my data in Excel to be proportional and use binomial, but I would prefer to use Gamma, if possible.

Ben Bolker
Sisi

3 Answers

1

Zeros also count as non-positive. Here is an example using the cbpp dataset that comes with lme4.

library(lme4)

table(cbpp$incidence)
#  0  1  2  3  4  5  8 11 12 
# 22 13  9  5  1  2  2  1  1 

As we can see, there are zeros in the dependent variable,

glmer(incidence ~ period + (1 | herd), data=cbpp, family=Gamma)
# Error in eval(family$initialize, rho) : 
#   non-positive values not allowed for the 'Gamma' family

which leads to the error. There is no error if we exclude the zeros:

glmer(incidence ~ period + (1 | herd), data=cbpp[cbpp$incidence != 0, ], family=Gamma)
# Generalized linear mixed model fit by maximum likelihood (Laplace Approximation) ['glmerMod']
#  Family: Gamma  ( inverse )
# Formula: incidence ~ period + (1 | herd)
#    Data: cbpp[cbpp$incidence != 0, ]
#      AIC      BIC   logLik deviance df.resid 
# 121.5891 130.7473 -54.7946 109.5891       28 
# Random effects:
#   Groups   Name        Std.Dev.
#     herd     (Intercept) 0.1714  
#     Residual             0.5186  
# Number of obs: 34, groups:  herd, 15
# Fixed Effects:
# (Intercept)      period2      period3      period4  
#     0.3891       0.2251       0.1487       0.4620  
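
The same behaviour can be reproduced outside lme4, since the message is raised by the Gamma family object itself. A minimal sketch, assuming a made-up vector `y` (illustrative values, not the asker's data) and base R's `glm`:

```r
# Hypothetical percentages that include a zero (illustrative only)
y <- c(0, 50, 75, 100)

# Count the responses the Gamma family will reject (anything <= 0)
n_nonpos <- sum(y <= 0)

# Fitting with the zero present fails; capture the message instead of stopping
msg <- tryCatch(
  {
    glm(y ~ 1, family = Gamma)
    "no error"
  },
  error = function(e) conditionMessage(e)
)

# Refitting on the strictly positive values goes through
y_pos <- y[y > 0]
fit_pos <- glm(y_pos ~ 1, family = Gamma)
```

This only locates the offending values; whether dropping them is defensible is a separate statistical question.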
jay.sf
  • Thank you, but the zeros that I have in the data are meaningful, therefore I cannot simply remove them. Should I discard Gamma and just do binomial? – Sisi Oct 18 '22 at 10:19
  • @Sisi Sure the zeros are meaningful; most likely Gamma isn't the right choice. However, that's a statistical question and should be asked on [Cross Validated](https://stats.stackexchange.com/help/on-topic). – jay.sf Oct 18 '22 at 10:21
1

This is close to a CrossValidated question, but:

  • zero values are "non-positive" (non-positive includes 0 and negative values; if the software were OK with zeros, it would have said "negative values are not allowed")
  • if you have meaningful 0 and 1 (or in your case 100%) values:
    • if the denominator is a known integer (e.g. if the underlying data are of the form "X out of Y", e.g. number of sampled cookies that are chocolate chip) rather than denominatorless (e.g. "proportion of time spent browsing Stack Overflow"), then you should use a binomial or related (e.g. beta-binomial) response distribution
    • For proportion data with non-negligible 0 and 1 values, you can:
      • arcsin-sqrt transform and fit with a linear model
      • use a zero-one-inflated beta (ZOIB) model — currently available only in Bayesian flavours, e.g. in the zoib and brms packages
      • use an ordered-beta regression, available in the ordbetareg package (also Bayesian/built on brms) and in the development version of glmmTMB [hopefully coming to CRAN in the next few days]
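
As a rough illustration of the first two options, here is a sketch using the cbpp data from lme4 rather than the asker's data. The copy `d` and the columns `prop` and `asin_sqrt` are names introduced here for illustration; the model formulas are assumptions and would need adapting to the real predictors and grouping factor.

```r
library(lme4)

# Work on a copy so the packaged dataset is left untouched
d <- cbpp

# Known integer denominator: model "incidence out of size" as a binomial
# response; zeros and all-positive groups are handled naturally
m_bin <- glmer(cbind(incidence, size - incidence) ~ period + (1 | herd),
               data = d, family = binomial)

# Denominator-free alternative: arcsine square-root transform of the
# proportion, then an ordinary linear mixed model
d$prop <- d$incidence / d$size
d$asin_sqrt <- asin(sqrt(d$prop))
m_lmm <- lmer(asin_sqrt ~ period + (1 | herd), data = d)
```

For percentages stored on a 0 to 100 scale, dividing by 100 first gives the proportion that `asin(sqrt(...))` expects.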
Ben Bolker
1

Yes, I wrote the ordered beta regression to handle proportions, which it does using a single statistical distribution and without needing non-linear transformations of the outcome. You can read the package vignette here: https://cran.r-project.org/web/packages/ordbetareg/vignettes/package_introduction.html.