I wonder if somebody here can help me.

I am trying to fit a beta GLM with the betareg package, since my dependent variable is a proportion (relative density of whales in a 500 m grid) varying from 0 to 1. I have three covariates:

  • Depth (measured in meters, ranging from 4 to 100 m),
  • Distance to coast (measured in meters, ranging from 0 to 21346 m) and
  • Distance to boats (measured in meters, ranging from 0 to 20621 m).

My dependent variable has a lot of 0s and many values that are very close to 0 (e.g. 7.8e-014). When I try to fit the model, the following error shows:

invalid dependent variable, all observations must be in (0, 1). 

From what I read in previous discussions, it seems this is caused by the 0s in my dataset (I should not have any 0s or 1s). When I change all my 0 to only positive definite (e.g. 0.0000000000000001) the error message I get is:

Error in chol.default(K) : 
  the leading minor of order 2 is not positive definite
In addition: Warning messages:
1: In digamma(mu * phi) : NaNs produced
2: In digamma(phi) : NaNs produced
Error in chol.default(K) : 
  the leading minor of order 2 is not positive definite
In addition: Warning messages:
1: In betareg.fit(X, Y, Z, weights, offset, link, link.phi, type, control) :
  failed to invert the information matrix: iteration stopped prematurely
2: In digamma(mu * phi) : NaNs produced

From what I saw on several forums, it seems this is because my matrix is not positive definite. It may be indefinite (i.e. have both positive and negative eigenvalues), or it may be near singular, i.e. its smallest eigenvalue is so close to 0 that computationally it is 0.
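To make that error message concrete, here is a minimal base-R sketch. The 2x2 matrix `m` is made up purely for illustration; it is not the information matrix `K` that betareg builds internally. It has one negative eigenvalue, so the Cholesky factorization that `chol()` attempts cannot succeed.

```r
# Made-up symmetric matrix with eigenvalues 3 and -1 (indefinite).
m <- matrix(c(1, 2,
              2, 1), nrow = 2, byrow = TRUE)

# Eigenvalues of mixed sign mean the matrix is not positive definite.
ev <- eigen(m, symmetric = TRUE)$values
print(ev)

# chol() requires a positive definite matrix; capture the error instead of stopping.
res <- tryCatch(chol(m), error = function(e) e)
if (inherits(res, "error")) {
  message("chol() failed: ", conditionMessage(res))
}
```

A near-singular matrix (smallest eigenvalue positive but tiny) can fail in the same way once rounding error is involved.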

My question is: since I only have this dataset, is there any way to solve these problems and run a beta regression? Or is there another model that I could use instead of betareg that would work?

Here is my code:

betareg(Density~DEPTH+DISTANCE_TO_COAST+DIST_BOAT,data=misti)
landroni
  • Hard to say without a reproducible example, but I would try (1) adding a large offset to your 0 values to make sure you're away from the boundary (e.g. 1e-6); (2) scaling and centering your predictor variables. Do univariate beta regressions work (e.g. `betareg(Density~DEPTH,data=misti)`)? You may eventually have to take zero-inflation into account. – Ben Bolker Oct 15 '14 at 16:10
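A minimal sketch of suggestion (2) and the univariate check from the comment above. Since `misti` is not available, `misti_demo` is a simulated stand-in (an assumption, with predictor ranges taken from the question), and the betareg call only runs if the package is installed.

```r
# Hypothetical stand-in for `misti`; the real data are not shown in the question.
set.seed(1)
n <- 200
misti_demo <- data.frame(
  DEPTH             = runif(n, 4, 100),
  DISTANCE_TO_COAST = runif(n, 0, 21346),
  DIST_BOAT         = runif(n, 0, 20621)
)
misti_demo$Density <- rbeta(n, 2, 5)  # simulated proportions inside (0, 1)

# Center each predictor at 0 and scale it to unit standard deviation,
# so all three are on comparable numeric scales.
predictors <- c("DEPTH", "DISTANCE_TO_COAST", "DIST_BOAT")
misti_scaled <- misti_demo
misti_scaled[predictors] <- as.data.frame(scale(misti_demo[predictors]))

# Try a single-predictor model first (skipped if betareg is not installed).
if (requireNamespace("betareg", quietly = TRUE)) {
  fit_depth <- betareg::betareg(Density ~ DEPTH, data = misti_scaled)
  print(coef(fit_depth))
}
```

If the univariate fits work but the full model does not, that points toward the predictors (scale or collinearity) rather than the response.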

2 Answers

When I change all my 0 to only positive definite (e.g. 0.0000000000000001)

Doing this seems like a bad idea, resulting in the error messages you see.

It seems that betareg currently only works strictly for data inside the (0,1) interval, and here's what the package vignette has to say:

The class of beta regression models, as introduced by Ferrari and Cribari-Neto (2004), is useful for modeling continuous variables y that assume values in the open standard unit interval (0, 1). [...] Furthermore, if y also assumes the extremes 0 and 1, a useful transformation in practice is (y · (n − 1) + 0.5)/n where n is the sample size (Smithson and Verkuilen 2006).

So one way to approach this would be:

y.transf.betareg <- function(y){
    n.obs <- sum(!is.na(y))
    (y * (n.obs - 1) + 0.5) / n.obs
}


betareg( y.transf.betareg(Density) ~ DEPTH+DISTANCE_TO_COAST+DIST_BOAT, data=misti)
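To see what the transformation does at the boundaries, here is a small self-contained check. The vector `y_demo` is made up for illustration; with n non-missing observations, 0 is moved to 0.5/n and 1 to (n - 0.5)/n.

```r
# Same transformation as above, repeated so this snippet runs on its own.
y.transf.betareg <- function(y) {
  n.obs <- sum(!is.na(y))
  (y * (n.obs - 1) + 0.5) / n.obs
}

# Made-up proportions including both boundary values and a missing value.
y_demo <- c(0, 0.25, 1, NA)
y_squeezed <- y.transf.betareg(y_demo)
print(y_squeezed)

# All non-missing values should now be strictly inside (0, 1).
print(range(y_squeezed, na.rm = TRUE))
```

Note that the amount of shrinkage depends on the sample size, so the transformed values are not comparable across datasets of different sizes.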

For an alternative approach to betareg, using a binomial GLM with a logit link, see this question on Cross Validated and the linked UCLA FAQ:

Some will suggest using a quasibinomial GLM instead to model proportions/percentages...
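A rough sketch of the quasibinomial alternative using only base R's `glm()`. The data frame `demo` is simulated (an assumption, not the asker's data) and uses a single predictor to keep it short. Unlike betareg, this accepts exact 0s in the response.

```r
# Simulated stand-in data: proportions that depend on depth, with some exact zeros.
set.seed(42)
n <- 300
demo <- data.frame(DEPTH = runif(n, 4, 100))
mu <- plogis(-1 - 0.02 * demo$DEPTH)            # true mean proportion
demo$Density <- rbeta(n, mu * 5, (1 - mu) * 5)  # proportions in (0, 1)
demo$Density[sample(n, 60)] <- 0                # force exact zeros

# Quasibinomial GLM with a logit link; zeros in the response are allowed.
fit_qb <- glm(Density ~ DEPTH,
              family = quasibinomial(link = "logit"),
              data = demo)
print(summary(fit_qb)$coefficients)
```

With the real data, the formula would presumably be `Density ~ DEPTH + DISTANCE_TO_COAST + DIST_BOAT` with `data = misti`.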

landroni
  • I have two similar dependent variables with the same min > 0 and max < 1 after the transformation, but I still get the error "Error in chol.default(K) : the leading minor of order 2 is not positive definite" for one of them and not the other. Any idea what the cause could be? – jlp Jun 04 '21 at 12:24

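Instead of a beta regression, you can fit a linear model to the logit (log-odds) transformation of your dependent variable. The small constant added to the odds keeps the logarithm finite when Density is 0. Try the following:

   # Log-odds with 0.01 added to the odds: p = 0 maps to log(0.01) instead of -Inf.
   # Values of exactly 1 would still give Inf.
   logistic <- function(p) log(p / (1 - p) + 0.01)
   lm(logistic(Density) ~ DEPTH + DISTANCE_TO_COAST + DIST_BOAT, data = misti)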
Cyrus Mohammadian