nls() returns different fittings on two similar sets of data (R)

Question

Dear stack overflow community,

I am trying to fit the Hill equation to some fluorescence induction curves (inducer vs fluorescence response). I have two types of data that both represent fluorescence response, which I have dubbed "max" and "rate". My problem is that nls() consistently fits the "rate" curves better than the "max" curves, even though the two look quite similar to me. Below are some example data from my experiments:

#example data (tibble() comes from the tibble package)
library(tibble)
df_max  <- tibble(inducer_conc = c(0.00,0.04,0.10,0.20,0.30,0.40,0.50,0.60,0.80,1.00), resp = c(3942,   3791,   3696,   3976,   4131,   4368,   4984,   5178,   5257,   5159))
df_rate <- tibble(inducer_conc = c(0.00,0.04,0.10,0.20,0.30,0.40,0.50,0.60,0.80,1.00), resp = c(0.0026,0.0099,0.0060,0.0136,0.0226,0.0327,0.0470,0.0481,0.0482,0.0473))

#set beta values as the highest observed output
beta_max  <- max(df_max$resp )
beta_rate <- max(df_rate$resp )

#set y-intercept as the output at 0 inducer concentration
C_max  <- df_max$resp[1]
C_rate <- df_rate$resp[1]

#nls fitting with "max steady states" as response
model <- function(inducer_conc, n, K) ((beta_max * (inducer_conc^n)/(K^n + inducer_conc^n)) + C_max)
fit_max <- nls(resp ~model(inducer_conc, n, K), data = df_max, start=c(n=2,K=0.4))

plot(df_max$inducer_conc, df_max$resp , main = "data", pch = 16)
lines(df_max$inducer_conc, fitted(fit_max), col = 'red', lwd = 2)

#nls fitting with "rates" as response
model <- function(inducer_conc, n, K) ((beta_rate * (inducer_conc^n)/(K^n + inducer_conc^n)) + C_rate)
fit_rate <- nls(resp ~model(inducer_conc, n, K), data = df_rate, start=c(n=2,K=0.4))

plot(df_rate$inducer_conc, df_rate$resp , main = "data", pch = 16)
lines(df_rate$inducer_conc, fitted(fit_rate), col = 'red', lwd = 2)

I have spent quite some time playing around with starting parameters for the "max" dataset, and the current starting parameters are the ones that don't give me an infinity value or the "singular gradient" error.

I realize this might be a fitting/statistic problem and not a coding problem, but being a novice in coding in general, I wanted to try my luck as I might not be implementing or using nls() correctly.

`nls` does exactly what the source data set "tells" it to do. Your inputs differ so of course the outputs will differ. In cases like this, I strongly recommend plotting both data sets and visually evaluating the differences. — Carl Witthoft, Nov 29 '22 at 18:09
BTW, you should post the results achieved and point out what you don't think is correct or accurate. As it is, we can't even tell whether the differences are so small that they could be attributed to float precision, or whether the starting-point parameters you've chosen are on different sides of a local max or min. — Carl Witthoft, Nov 29 '22 at 18:12

G. Grothendieck · Answer 1 · 2022-11-30T12:50:17.493

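Try optimizing over all 4 parameters (n, K, beta_max and C_max) instead of fixing beta_max and C_max in advance. Because beta_max and C_max enter the model linearly, the "plinear" algorithm can estimate them without starting values. In the plot, the red line is the fit from the question and the black line is the fit optimized over all 4 parameters.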

fit_max_plin <- nls(resp ~ 
  cbind(beta_max = (inducer_conc^n)/(K^n + inducer_conc^n), C_max = 1),
  data = df_max, start = c(n = 2, K = 0.4), algorithm = "plinear")

fit_max_plin
## Nonlinear regression model
##   model: resp ~ cbind(beta_max = (inducer_conc^n)/(K^n + 
## inducer_conc^n),     C_max = 1)
##    data: df_max
##             n             K .lin.beta_max    .lin.C_max 
##        6.4064        0.4148     1385.5323     3862.5874 
##  residual sum-of-squares: 87433
##
## Number of iterations to convergence: 18 
## Achieved convergence tolerance: 6.615e-06
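
As a minimal sketch, the estimates printed above can be plugged back into the Hill function to evaluate the fitted curve on a finer grid than the ten measured concentrations. The helper name `hill`, the list `est`, and the 200-point grid are assumptions made for illustration; the coefficient values are copied, as rounded, from the printout.

```r
# Hill function with all four parameters written out explicitly
# (hypothetical helper, not part of the original answer)
hill <- function(x, n, K, beta, C) beta * x^n / (K^n + x^n) + C

# Estimates copied (rounded) from the printed plinear fit
est <- list(n = 6.4064, K = 0.4148, beta = 1385.5323, C = 3862.5874)

# Fine grid over the observed inducer concentration range (0 to 1)
grid <- seq(0, 1, length.out = 200)
curve_max <- hill(grid, est$n, est$K, est$beta, est$C)

# Two properties of the Hill form that are handy as sanity checks:
# the response at zero inducer equals C, and the response at x = K
# is halfway between C and C + beta
at_zero <- hill(0, est$n, est$K, est$beta, est$C)
at_K    <- hill(est$K, est$n, est$K, est$beta, est$C)
```

Calling `lines(grid, curve_max)` after a `plot()` of the data would overlay this smoother version of the fitted curve.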

plot(df_max$inducer_conc, df_max$resp , main = "data", pch = 16)
lines(fitted(fit_max_plin) ~ inducer_conc, df_max)
lines(df_max$inducer_conc, fitted(fit_max), col = 'red', lwd = 2)
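
One possible reason the fixed values hurt the "max" fit more than the "rate" fit: with beta fixed at the largest response and C fixed at the first response, the model in the question pins its upper plateau at C + beta. The sketch below compares that implied plateau with the largest observed response for each data set. The helper `plateau_ratio` is hypothetical, and base vectors stand in for the tibbles so that no packages are needed.

```r
# Responses copied from the question
resp_max  <- c(3942, 3791, 3696, 3976, 4131, 4368, 4984, 5178, 5257, 5159)
resp_rate <- c(0.0026, 0.0099, 0.0060, 0.0136, 0.0226,
               0.0327, 0.0470, 0.0481, 0.0482, 0.0473)

# Hypothetical helper: ratio of the plateau implied by the question's fixed
# values (C = first response, beta = largest response) to the largest
# observed response
plateau_ratio <- function(resp) {
  C    <- resp[1]
  beta <- max(resp)
  (C + beta) / max(resp)
}

ratio_max  <- plateau_ratio(resp_max)   # (3942 + 5257) / 5257
ratio_rate <- plateau_ratio(resp_rate)  # (0.0026 + 0.0482) / 0.0482
print(c(max = ratio_max, rate = ratio_rate))
```

For the "max" data the implied plateau is about 1.75 times the largest observed response, while for the "rate" data it is only about 1.05 times, so fixing the two values distorts the "max" model far more.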
