1

The data set (x.test, y.test) is an exponential fit. I'm trying to fit a custom non-linear function and attached is the code. The regular points plot just fine but I'm unable to get the fit line to work. Any suggestions?

x.test <- runif(50,2,8)
y.test <- 0.5^(x.test)
df <- data.frame(x.test, y.test)

library(ggpmisc)
my.formula <- y ~ lambda/ (1 + aii*x)
ggplot(data  = df, aes(x=x.test,y=y.test)) + 
  geom_point(shape=21, fill="white", color="red", size=3) + 
  stat_smooth(method="nls",formula =  y.test ~ lambda/ (1 + aii*x.test), method.args=list(start=c(lambda=1000,aii=-816.39)),se=F,color="red") +
   geom_smooth(method="lm", formula = my.formula , col = "red") + stat_poly_eq(formula = my.formula, aes(label = stringr::str_wrap(paste(..eq.label.., ..rr.label.., sep = "~~~"))),  parse = TRUE, size = 2.5, col = "red") + stat_function(fun=function (x.test){
  y.test ~ lambda/ (1 + aii*x.test)}, color = "blue")

enter image description here

Rspacer
  • 2,369
  • 1
  • 14
  • 40

1 Answers1

3

A few things:

  • you need to use y and x as the variable names in the formula argument to geom_smooth, regardless of what the names are in your data set
  • you need better starting values (see below)
  • there's a GLM trick you can use to fit this model; doesn't always work (can be numerically unstable), but it doesn't need starting values and will work more often than nls()
  • I don't think lm() and stat_poly_eq() are going to work as expected (or maybe at all) with a nonlinear formula ...

simulate data

(same as your code but using set.seed() - probably not important here but good practice)

set.seed(101)
x.test <- runif(50,2,8)
y.test <- 0.5^(x.test)
df <- data.frame(x.test, y.test)

attempt nls fit with your starting values

It's usually a good idea to troubleshoot by fitting any smoothing terms outside of ggplot2, so you have fewer layers to dig through to find the problems:

nls(y.test ~ lambda/(1+ aii*x.test),
    start = list(lambda=1000,aii=-816.39),
    data = df)

Error in nls(y.test ~ lambda/(1 + aii * x.test), start = list(lambda = 1000, : singular gradient

OK, still doesn't work. Let's use glm() to get better starting values: we use an inverse-link GLM:

1/y = b0 + b1*x
  y = 1/(b0 + b1*x) 
    = (1/b0)/(1 + (b1/b0)*x)

So:

g1 <- glm(y.test ~ x.test, family = gaussian(link = "inverse"))
s0 <- with(as.list(coef(g1)), list(lambda = 1/`(Intercept)`, aii = x.test/`(Intercept)`))

This gives lambda = -0.09, aii = -0.638 (with a little bit more work we could probably also figure out how to eyeball these by looking at the starting point and scale of the curve).

ggplot(data  = df, aes(x=x.test,y=y.test)) +
  geom_point(shape=21, fill="white", color="red", size=3) +
  stat_smooth(method="nls",
              formula =  y ~ lambda/ (1 + aii*x),
              method.args=list(start=s0),
              se=FALSE,color="red") +
  stat_smooth(method = "glm",
              formula = y ~ x,
              method.args = list(gaussian(link = "inverse")),
              color = "blue", linetype = 2)

enter image description here

Ben Bolker
  • 211,554
  • 25
  • 370
  • 453
  • Very helpful! I have two quick follow-ups (a) Is there an easy way to compute the inverse-link functions as you did that goes in s0. Some of the models that I'm using are not as strightforward. (b) can you recommend a good book for ecologists that walks through some ideas on inverse-link MLL use? – Rspacer Jan 03 '22 at 18:51
  • (a) I'm not quite sure what you mean. The inverse-link functions are available in the fitted `glm` object as `family(fit)$linkinv`, but that might not be what you want. (b) I don't know of a place that these tricks are collected: https://www.ericrscott.com/post/hacking-glms/ https://stackoverflow.com/questions/57153916/how-do-i-plot-a-mle2-fit-of-a-model-in-ggplot2-along-with-the-data . If you can formulate your questions more specifically/precisely you could ask follow-up questions ... – Ben Bolker Jan 03 '22 at 19:01