0

Is it possible to calculate a NLS regression when you have x = 0 ?

mydf <- structure(list(x = c(0, 25.8, 51.6, 77.4, 103.2, 129, 154.8, 
                             180.6, 206.4, 232.2, 258), y = c(1, 0.04347826, 0.021739129, 
                                                              0.013888888, 0.010869564, 0.008620688, 0.007299269, 0.006289307, 
                                                              0.005405404, 0.004830917, 0.004329003)), class = "data.frame", row.names = c(NA, 
                                                                                                                                           -11L))

lmod <- lm(log(y) ~ log(x), data = mydf)

a_start <- exp(lmod$coefficients[[1]])
b_start <- lmod$coefficients[[2]]

nlmod <- nls(y ~ a*x^b, data = mydf, 
             start = list(a = a_start,
                          b = b_start))

library(ggplot2)
ggplot(mydf, aes(x = x, y = y)) +
  geom_point() +
  stat_smooth(method = 'nls', formula = y~a*x^b, 
              method.args = list(start = c(a = a_start,b = b_start)),
              se=FALSE)

This produces the following errors because of log(0).

Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  NA/NaN/Inf in 'x'

Error in numericDeriv(form[[3L]], names(ind), env) : 
  Missing value or an infinity produced when evaluating the model

Obviously it works when I filter by x>0 but y is limited to1 as it's a proportion and when using x>0 only we get values where y>1.

Josh J
  • 395
  • 1
  • 3
  • 13

1 Answers1

2

Use pmax(x, 1) in place of x and simplify slightly by using nls for the starting values as well:

fm0 <- nls(log(y) ~ log(a * pmax(x, 1)^b), mydf, start = list(a = 1, b = 1))
fm1 <- nls(y ~ a * pmax(x, 1)^b, mydf, start = coef(fm0)) 

Update

Have made a number of improvements.

G. Grothendieck
  • 254,981
  • 17
  • 203
  • 341
  • Unfortunately ```max(predict(fm1)) = 1.311985``` and ```y``` is a proportion so should only have a max of 1. – Josh J Mar 05 '20 at 15:29
  • 1
    Not clear what you have done but using fm1 from the answer (note that I made some improvements) gives `max(fitted(fm1))` equal to 1 (modulo floating point approx). Also, if you don't mind changing the model try `nls(y ~ 1 / (1 + b*x^c), mydf, start = c(b = 1, c = 1))` – G. Grothendieck Mar 05 '20 at 16:06
  • I'd gone back to the full dataset. That new model did the trick, thanks. I'm a bit new to nls model fitting - any tips or links to help with deciding on the best model formula? – Josh J Mar 05 '20 at 17:27
  • Hard to say. Have a look through the `nls` answers on SO. – G. Grothendieck Mar 05 '20 at 17:54