18

I have read a post ( Sigmoidal Curve Fit in R ). It was labeled duplicated, but I can't see anything related with the posts. And the answer given for the posts was not enough.

I read a webpage

Similar to the others, he uses this format to fit the line:

fitmodel <- nls(y~a/(1 + exp(-b * (x-c))), start=list(a=1,b=.5,c=25))

The problem is that, a,b,c were given in most of the cases and I have no clues which set of a,b,c should I use for my set of data. Could someone give me some advice on how to obtain the parameters?

Here is my set of numbers :

x <- c(3.9637878,3.486667,3.0095444,2.5324231,2.0553019,1.5781806,1.1010594,0.6242821)
y <- c(6491.314,6190.092,2664.021,2686.414,724.707,791.243,1809.586,541.243)
FunnyFunkyBuggy
  • 495
  • 1
  • 7
  • 21
  • you have to guess the `a, b, c`. If you have an idea how the curve should look like it always help to draw a curve with the random coefficients (ex. `a = 20, b = 0.1, c = 0.2, curve(a/(1 + exp(-b * (x-c))), 0, 100)` and see how are your guesses check if the the `x-c` is correct. Shouldn't it be `x^(-c)` – Mateusz1981 Oct 09 '15 at 08:19
  • It is a possible way but are there any other possible ways for me to do it in a more "statistically convincing" manner? For example, is it possible to make a loop to find the optimal set of combinations of a,b,c. Or, if any, can I use some functions or commands leave it for the program to calculate it for me? – FunnyFunkyBuggy Oct 09 '15 at 08:30
  • once I was also looking for this "gral". I have not found yet. I am not statistician but I think that guessing starting values is a common way of estimating equation parameters in `nlm` – Mateusz1981 Oct 09 '15 at 08:32

3 Answers3

12

Luckily R offers a selfstarting model for the logistic model. It uses a slight reparametrization, but is really the same model as yours: Asym/(1+exp((xmid-input)/scal))

A selfstarting model can estimate good starting values for you, so you don't have to specify them.

plot(y ~ x)
fit <- nls(y ~ SSlogis(x, Asym, xmid, scal), data = data.frame(x, y))

summary(fit)
#Formula: y ~ SSlogis(x, Asym, xmid, scal)
#
#Parameters:
#      Estimate Std. Error t value Pr(>|t|)
#Asym 1.473e+04  2.309e+04   0.638    0.551
#xmid 4.094e+00  2.739e+00   1.495    0.195
#scal 9.487e-01  5.851e-01   1.622    0.166
#
#Residual standard error: 941.9 on 5 degrees of freedom
#
#Number of iterations to convergence: 0 
#Achieved convergence tolerance: 4.928e-06

lines(seq(0.5, 4, length.out = 100), 
      predict(fit, newdata = data.frame(x = seq(0.5, 4, length.out = 100))))

resulting plot

Of course your data doesn't really support the model. The estimated mid point is just at the right limit of your data range and thus the parameter estimates (in particular for the asymptote) are very uncertain.

Roland
  • 127,288
  • 10
  • 191
  • 288
  • Thanks, your answer is very useful for me. There is another question: why the shape of the curve is not in a 'S' shape? It's more like a exponential curve to me. I see your equation is the same with mine but where is the problem? – FunnyFunkyBuggy Oct 09 '15 at 09:39
  • The model is a sigmoidal curve. You just don't see the upper part of the S because it is outside of the plot limits. – Roland Oct 09 '15 at 10:06
3

The code I used to fit your data:

 df <- data.frame(x=c(3.9637878,3.486667,3.0095444,2.5324231,2.0553019,1.5781806,1.1010594,0.6242821),                     
                  y=c(6491.314,6190.092,2664.021,2686.414,724.707,791.243,1809.586,541.243))

library(drc)
fm <- drm(y ~ x, data = df, fct = G.3())

plot(fm)
summary(fm)

The way it looks after fitting: enter image description here

amonk
  • 1,769
  • 2
  • 18
  • 27
numb
  • 119
  • 1
  • 1
  • 11
1

I see two issues.

  1. the default algorithm of nls is very sensitive to the starting parameter. In your example data I found it useful to use algorithm='port'. Alternatively switching to a "robust" implementation might also help.

  2. It helps understanding the role of the parameter in your model.

The simple interpretation for your model is: The sigmoid goes in y from 0 to a. It reaches the "half way" point at x=c. b has the role of a slope, and if negative the model would go from a to 0 instead.

Specifically to the test data posted by you I would estimate the start values as following:

  • First thing I notice - your data is not exactly 'close' to zero so maybe it might be useful to add an offset d which is around 1000.
  • a is then 5000 or greater
  • c is somewhere greater 2 - maybe 3
  • b one needs to guess - from x 2 to 3.5 your signal jumps from 1000 to 6000 gives 5000 difference - divided by a - a slope of 1/1.5 = 0.66 or greater... lets round to one.

So ultimately using the formula

fitmodel <- nls(y ~a/(1 + exp(-b * (x-c)) ) + d, start=list(a=5000,b=1,c=3, d=1000))

gives a fit (also works without the d). Trying around I found setting algorithm='port' made the command even less sensitive to the start values.

bdecaf
  • 4,652
  • 23
  • 44