How to handle zero-inflated (semi-)continuous data in R?

Question

I would like to model Value as a function of the explanatory variables Type and Material (Value ~ Material + Type). In the sample data below, every Value for Material X is zero except one, which makes the distribution of Value zero-inflated across all observations. The model diagnostics show that the assumptions of a linear model do not hold here.

Value is a numeric variable, and all observations are independent from each other.

I would like to know how to find a suitable distribution for this data, or how to transform it so that I can handle these zeros.

I read about the gamlss and pscl packages, but I struggled to apply them to my data.

ID <- seq(from = 1, to = 36)
Type <- rep(c("A", "B"),each=18)
Material <- rep (c("X","Y","Z","X","Y","Z"), each = 6)
Value <- c(0,0,0,2,0,0,27,50,30,103,104,223,147,
           127,115,78,148,297,0,0,0,0,0,0,84,
           59,56,53,64,86,90,75,95,111,215,191)
test.data <- data.frame(ID,Type,Material,Value)
test.data$ID <- factor(test.data$ID)
test.data$Type <- factor(test.data$Type)
test.data$Material <- factor(test.data$Material)

These are not necessarily continuous data (i.e., the responses are all integers), *and* they're not necessarily zero-inflated. It could just be that the mean is very low for Material X. This might be more appropriate for [CrossValidated](https://stats.stackexchange.com) ... — Ben Bolker, Jul 25 '22 at 19:55

score 0 · Answer 1 · answered Jul 25 '22 at 15:33

You could try:

library(gamlss)  # provides gamlss() and the ZIP / ZINBI families
m1 <- gamlss(Value ~ Material + Type, sigma.fo = ~ Material + Type,
             family = ZIP, data = test.data)

ZIP(mu, sigma) is a zero-inflated Poisson distribution: a mixture of a point mass at zero with probability sigma and a Poisson distribution PO(mu) with probability (1 - sigma).
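To make the mixture concrete, here is a minimal base R sketch of the ZIP probability mass function as described above. The helper name `dzip_sketch` and the values of `mu` and `sigma` are illustrative assumptions, not part of gamlss (the package supplies its own density function, dZIP, in gamlss.dist).

```r
# Hypothetical helper (not a gamlss function): ZIP probability mass function,
# written directly from the mixture definition.
dzip_sketch <- function(y, mu, sigma) {
  # weight sigma on the point mass at zero, weight (1 - sigma) on Poisson(mu)
  sigma * (y == 0) + (1 - sigma) * dpois(y, lambda = mu)
}

# Illustrative parameter values (assumptions, not estimates from test.data)
mu <- 2
sigma <- 0.3

# P(Y = 0) from the helper and from the closed form sigma + (1 - sigma) * exp(-mu)
p0 <- dzip_sketch(0, mu, sigma)
p0_closed <- sigma + (1 - sigma) * exp(-mu)

# The probabilities should sum to 1 over a wide enough range of counts
total <- sum(dzip_sketch(0:200, mu, sigma))

# Mean of the mixture: only the Poisson component contributes
zip_mean <- sum(0:200 * dzip_sketch(0:200, mu, sigma))
```

In this sketch the zero probability is always at least sigma, and the mean works out to (1 - sigma) * mu, which is why a large share of zeros pulls the fitted mean down.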

You could then check the residuals using plot(m1) or a worm plot, wp(m1).

The ZIP model may not be adequate, in which case a zero-inflated negative binomial distribution may be needed. ZINBI(mu, sigma, nu) is a mixture of a point mass at zero with probability nu and a negative binomial distribution NBI(mu, sigma) with probability (1 - nu):

m2 <- gamlss(Value ~ Material + Type, sigma.fo = ~ Material + Type,
             nu.fo = ~ Material + Type, family = ZINBI, data = test.data)

Alternatively, an interaction term may be needed for mu (and/or for sigma or nu), e.g.:

m3 <- gamlss(Value ~ Material * Type, sigma.fo = ~ Material + Type,
             family = ZIP, data = test.data)
