I keep receiving the following error:

 Error: no valid set of coefficients has been found: please supply starting values 

My code is below:

glm(y ~ a + b + c + d, data = data,
  family = binomial(link="log"))

My question is, how do I supply starting values?

user438383
Qregty
    Does this answer your question? [Default starting values fitting logistic regression with glm](https://stackoverflow.com/questions/60526586/default-starting-values-fitting-logistic-regression-with-glm) – nCessity Feb 22 '21 at 21:45
    there's something wrong with your dataset. can you provide a reproducible example or some of your dataset? – StupidWolf Feb 24 '21 at 08:08

3 Answers

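This can happen when the coefficients produce a positive linear predictor for at least one observation. With a log link the fitted probability is exp(linear predictor), so a positive value means P(success) > 1, which is not a valid probability. Supplying different starting values can help avoid this situation.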
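As a small illustration of that constraint (the linear predictor values below are made up for the example, not taken from the question), exponentiating a positive value gives a "probability" above 1:

```r
# Hypothetical linear predictor values for three observations
eta <- c(-2, -0.5, 0.3)

# With link = "log", the fitted probability is exp(eta)
p <- exp(eta)

# glm() requires every fitted probability to lie strictly between 0 and 1
valid <- p > 0 & p < 1
print(data.frame(eta = eta, p = p, valid = valid))
```

Only the observation with a positive linear predictor fails the check, and a single such observation is enough for glm() to reject a set of coefficients.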

The short answer is to add start=coefini to the glm call, where coefini is a numeric vector with one entry per model coefficient: the intercept plus one for each predictor term (five values in your example, assuming a, b, c and d are all numeric). So your call becomes

glm(y ~ a + b + c + d, data = data,
  family = binomial(link="log"), start=coefini)

But choosing which values to supply can be challenging.
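One simple choice, offered here as an illustration rather than something taken from the question or the linked answer, is to start the intercept at log(mean(y)) and every slope at 0. Every fitted probability then equals the observed proportion of successes, which lies strictly between 0 and 1 as long as y contains both outcomes. The data below are simulated purely so the sketch runs on its own; only the column names follow the question.

```r
# Simulated stand-in for the asker's data (an assumption for illustration)
set.seed(1)
n <- 500
data <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
# True log-risk kept well below 0 so every true probability is below 1
data$y <- rbinom(n, 1, exp(-1.5 + 0.2 * data$a - 0.1 * data$b))

# Intercept = log of the overall success proportion, all slopes = 0
coefini <- c(log(mean(data$y)), 0, 0, 0, 0)

fit <- glm(y ~ a + b + c + d, data = data,
           family = binomial(link = "log"), start = coefini)
print(coef(fit))
```

These starting values are always admissible, but they may be far from the final estimates, so the fit can still need many iterations or stall on difficult data.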

The question/answer in this other community was useful to me in computing starting values for this kind of analysis.

That question is about the starting values for LOGISTIC regression, and its answer is that "you can get them using [linear regression] by regressing the logit of the response, y, on the predictors with weight ny(1-y)"

In that question, the binary (1/0) dependent variable is named remiss, and the answer provides these steps to obtain starting values for a logistic regression:

y=.1*(remiss=0)+.9*(remiss=1)
logit=log(y/(1-y))
wt=y*(1-y)

Then the starting values come from the weighted linear regression of logit with the predictors of interest.

For link="log" rather than "logit", I changed two things. Instead of

logit = log(y/(1-y))
wt = y*(1-y)

I used

depvar = log(y)
wt = y/(1-y)

Then the starting values for the log-binomial model are the results of the weighted linear regression of depvar with the same predictors. If these results are stored in a vector called coefini, then just add start=coefini to your glm command.

I am more into Python than R these days, but I believe with your example, this means doing the following:

# Pull the 0/1 response away from the boundary so its log is finite
data$y0 <- ifelse(data$y == 1, 0.9, 0.1)
# Working response and weights for the log link
data$depvar <- log(data$y0)
data$wt <- data$y0 / (1 - data$y0)
# Weighted linear regression gives the starting coefficients
coefini <- coef(lm(depvar ~ a + b + c + d, data = data, weights = wt))
glm(y ~ a + b + c + d, data = data,
  family = binomial(link="log"), start=coefini)
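The weighted regression is not guaranteed to give coefficients whose linear predictor is negative for every row, so it may be worth checking before calling glm. The sketch below runs on simulated data (an assumption, since the original data were not posted) and falls back to an intercept-only start, which is my addition and not part of the linked answer, if the check fails.

```r
# Simulated stand-in for the asker's data (an assumption for illustration)
set.seed(42)
n <- 500
data <- data.frame(a = rnorm(n), b = rnorm(n), c = rnorm(n), d = rnorm(n))
data$y <- rbinom(n, 1, exp(-1.5 + 0.2 * data$a - 0.1 * data$b))

# Starting values from the weighted linear regression described above
data$y0 <- ifelse(data$y == 1, 0.9, 0.1)
data$depvar <- log(data$y0)
data$wt <- data$y0 / (1 - data$y0)
coefini <- coef(lm(depvar ~ a + b + c + d, data = data, weights = wt))

# A start is admissible only if exp(X %*% coefini) < 1 for every row,
# i.e. the linear predictor is strictly negative everywhere
X <- model.matrix(~ a + b + c + d, data = data)
eta_start <- drop(X %*% coefini)
if (any(eta_start >= 0)) {
  # Fallback (my assumption): intercept at log(mean(y)), slopes at 0
  coefini <- c(log(mean(data$y)), 0, 0, 0, 0)
  eta_start <- drop(X %*% coefini)
}

fit <- glm(y ~ a + b + c + d, data = data,
           family = binomial(link = "log"), start = coefini)
print(coef(fit))
```

Checking the starting linear predictor first tells you whether the problem is the starting values or the model itself.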
John

Disclaimer: I did not try this but I am giving my educated guess on the matter.

It could be that the c in your formula is being misread as R's c() function, which is used to build vectors.

My suggestion would be to rewrite your code with the c term wrapped in backticks.

Try this instead:

glm(y ~ a + b + `c` + d, data = data,
  family = binomial(link="log"))
Bensstats

Try the logbin() function from the logbin package instead of glm().
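A minimal sketch of what that might look like, assuming the logbin package is installed and that logbin() accepts a formula and a data argument in the same way glm() does. The data are simulated for illustration, and only two predictors are used to keep the example quick.

```r
# Simulated data (an assumption; the original data were not posted)
set.seed(7)
n <- 300
data <- data.frame(a = rnorm(n), b = rnorm(n))
data$y <- rbinom(n, 1, exp(-1.5 + 0.2 * data$a))

if (requireNamespace("logbin", quietly = TRUE)) {
  # Fit the log-binomial model with the package's default fitting method
  fit <- logbin::logbin(y ~ a + b, data = data)
  print(coef(fit))
} else {
  message("Package 'logbin' is not installed; run install.packages('logbin') first.")
}
```

Check the package documentation for the available fitting methods before relying on the defaults.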
