
I have a dependent variable (DV) that is a proportion on [0,1). I initially considered a beta regression to model the relationship between this proportion and two factors (Zone and Season). Because the data include 0s, however, I would have to transform the DV with the method suggested by Smithson and Verkuilen (2006): (y · (n − 1) + 0.5)/n, where n is the sample size.
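For concreteness, here is a minimal sketch of that transformation in base R. The vector `y` is made-up illustrative data, not my actual DV:

```r
# Hypothetical proportions in [0, 1), including exact zeros (illustration only)
y <- c(0, 0.12, 0.40, 0.73, 0)
n <- length(y)  # sample size

# Smithson & Verkuilen (2006) transformation: pulls values in [0, 1] into (0, 1)
y_sv <- (y * (n - 1) + 0.5) / n

# After the transformation no value sits exactly on a boundary
range(y_sv)
```

Note that the amount of shrinkage depends on n, so the transformed zeros move closer to 0 as the sample size grows.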

This is a valid option, but since the proportion I am modeling is really a weighted count/total, it may be better to model the response as binomial and use an offset term for the weights. The DV in my example is p = (# observed/total)/# of days, so # of days would be the weighting factor.
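A rough sketch of what I mean by the binomial alternative, using simulated data. The column names `observed` and `total` are hypothetical, and the sketch deliberately leaves out # of days, since how it should enter the model is exactly what I am unsure about:

```r
set.seed(1)

# Simulated stand-in data: 3 zones x 2 seasons, 5 replicates each (assumption)
d <- expand.grid(Zone   = factor(c("A", "B", "C")),
                 Season = factor(c("dry", "wet")),
                 rep    = 1:5)
d$total    <- sample(20:50, nrow(d), replace = TRUE)          # trials per row
d$observed <- rbinom(nrow(d), size = d$total, prob = 0.2)     # successes per row

# Binomial GLM on counts: rows with larger totals carry more weight,
# and exact zeros need no transformation
fit <- glm(cbind(observed, total - observed) ~ Zone * Season,
           family = binomial, data = d)

summary(fit)
```

Here the response is given as (successes, failures), so the totals act as the binomial denominators rather than as a separate offset.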

Which method would be most appropriate in this case?

amela

1 Answer


I would recommend the zoib package in R ("Zero/One Inflated Beta regression"), which has specialised methods for dealing with true 0 and 1 observations in proportion data. While it is not as intuitive as betareg, the maintainer is good at answering questions.

https://journal.r-project.org/archive/2015/RJ-2015-019/RJ-2015-019.pdf

An example with factors Zone and Season, where both factors affect the mean, the precision (which governs the variance), the probability of zeroes and the probability of ones (which may or may not be plausible in your data), would be:

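library(zoib)
# Formula parts, in order: mean | precision | Pr(y = 0) | Pr(y = 1)
zmod <- zoib(Yprop ~ Zone*Season | Zone*Season | Zone*Season | Zone*Season, data=zone_season_data)

# Posterior predictive mean of Yprop per observation, pooling the first two MCMC chains
plot(as.numeric(zone_season_data$Zone),
    colMeans( rbind(zmod$ypred[[1]],zmod$ypred[[2]]) ),
    ylab = 'Predicted Yprop', xlab = 'Zone',
    ylim = c(0,1), pch = 19, col = 4 )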
Foztarz