I would like to model Value as a function of the explanatory variables Type and Material (Value ~ Material + Type). In the sample test data provided here, all Values for Material X are zero except for one, which makes the overall distribution of Value zero-inflated. The model diagnostics show that the assumptions of a linear model do not hold here.
Value is a numeric variable, and all observations are independent of each other.
I would like to know how I can find a suitable distribution for this data, or transform it in a way that lets me handle these zeros.
I read about the gamlss and pscl packages, but I struggled to apply them to my data.
ID <- seq(from = 1, to = 36)
Type <- rep(c("A", "B"), each = 18)
Material <- rep(c("X", "Y", "Z", "X", "Y", "Z"), each = 6)
Value <- c(0, 0, 0, 2, 0, 0, 27, 50, 30, 103, 104, 223, 147,
           127, 115, 78, 148, 297, 0, 0, 0, 0, 0, 0, 84,
           59, 56, 53, 64, 86, 90, 75, 95, 111, 215, 191)
test.data <- data.frame(ID,Type,Material,Value)
test.data$ID <- factor(test.data$ID)
test.data$Type <- factor(test.data$Type)
test.data$Material <- factor(test.data$Material)
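To make the zero-inflation and the failed linear assumptions concrete, here is a minimal base R sketch. It rebuilds the sample data so that it runs on its own, counts the zeros per Material and Type, compares the within-Material variances, and fits the linear model Value ~ Material + Type. The names zero_counts, group_var and fit_lm are introduced here for illustration only, and this is just one possible way to run the check.

```r
# Rebuild the sample data so this sketch is self-contained
Type <- rep(c("A", "B"), each = 18)
Material <- rep(c("X", "Y", "Z", "X", "Y", "Z"), each = 6)
Value <- c(0, 0, 0, 2, 0, 0, 27, 50, 30, 103, 104, 223, 147,
           127, 115, 78, 148, 297, 0, 0, 0, 0, 0, 0, 84,
           59, 56, 53, 64, 86, 90, 75, 95, 111, 215, 191)
test.data <- data.frame(ID = factor(1:36),
                        Type = factor(Type),
                        Material = factor(Material),
                        Value = Value)

# Number of zero Values in each Material x Type cell
# (zero_counts is an illustrative name, not from the original post)
zero_counts <- with(test.data,
                    tapply(Value == 0,
                           list(Material = Material, Type = Type),
                           sum))
print(zero_counts)

# Variance of Value within each Material; very unequal variances
# are one sign that a plain linear model is a poor fit
group_var <- with(test.data, tapply(Value, Material, var))
print(group_var)

# The linear model from the question, kept for residual checks
fit_lm <- lm(Value ~ Material + Type, data = test.data)
print(summary(resid(fit_lm)))

# In an interactive session the usual diagnostic plots are available via:
# plot(fit_lm, which = 1:2)
```

In this data all zeros fall in Material X, and Materials Y and Z contain no zeros at all, so any model with a separate zero component will have very little information to estimate that component for Y and Z.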