1

I have a linear mixed effects model and I am trying to do variable selection. The model is testing the level of forest degradation in 1000 sampled points. Most points have no degradation, and so the dependent variable is highly skewed with many zeros. Therefore, I am using the Tweedie distribution to fit the model. My main question is: can the Tweedie distribution actually be used in the glmmLasso function? My second question is: do I even need to use this distribution in glmmLasso()? Any help is much appreciated!

When I run the function with family = tweedie(var.power=1.2,link.power=0) I get the following error:

Error in logLik.glmmLasso(y = y, yhelp = yhelp, mu = mu, family = family,  : 
  object 'loglik' not found

If I change the link.power from 0 to 1 (which I think is not correct for my model, but just for the sake of figuring out the problem), I get a different error:

Error in grad.lasso[b.is.0] <- score.beta[b.is.0] - lambda.b * sign(score.beta[b.is.0]) : 
  NAs are not allowed in subscripted assignments

Here tweedie comes from the statmod package. A simple example:

library(tweedie)
library(tidyverse)
library(glmmLasso)
library(statmod)

power <- 2
mu <- 1
phi <- seq(2, 8, by=0.1)
set.seed(10000)
y <- rtweedie( 100, mu=mu, power=power, phi=3)

x <- rnorm(100)
z <- c(rep(1, 50), rep(2,50))
df = as.data.frame(cbind(y,x,z))
df$z = as.factor(df$z)
f = y ~ x

varSelect = glmmLasso(fix = f, rnd = list(z=~1), data = df, 
                      lambda = 5, family = tweedie(var.power=1.2,link.power=0))

Shawn Hemelstrand
  • 2,676
  • 4
  • 17
  • 30
  • It looks like the log-likelihood is hard-coded (see `glmmLasso:::logLik.glmmLasso`), and the only allowable choices are Poisson/binomial/"acat"/"cumulative". I might take a shot at hacking this. I am mildly nervous about how well the exponential-family theory behind all of this extends to the Tweedie, but if it works ... – Ben Bolker Jan 28 '23 at 00:27
  • Thank you @BenBolker If the Tweedie is not compatible, would you have any other suggestions for a zero-inflated distribution to use with glmmLasso? Or perhaps a different way to do variable selection for a mixed model with zero inlated distribution? – Matt Marcus Jan 29 '23 at 21:38

1 Answers1

0

I created a hacked version of glmmLasso that incorporates the Tweedie distribution as an option and put it on Github. I had to change two aspects of the code:

  • add a clause to compute the log-likelihood if family$family == "Tweedie"
  • in a number of places where the code was essentially if (family$family in list_of_families) ..., add "Tweedie" as an option.
remotes::install_github("bbolker/glmmLasso-bmb")
packageVersion("glmmLasso")
## [1] ‘1.6.2.9000’

Your example runs for me now, but I haven't checked at all to see if the results are sensible.

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Thanks, the function runs now. However, the results it gives me show NA's for all std.errors, z-values and p-values. Does this mean the function is rejecting all variables in the model? – Matt Marcus Jan 31 '23 at 03:40