-2

I have two datasets for which I would like to find the best-fitting function types:

x <- c(10, 40, 70, 100, 130, 160, 190, 220)
y1 <- c(41.8, 45.2, 50.8, 62.5, 73.2, 86.9, 95.4, 107.9)
y2 <- c(1.9, 34.3, 269.2, 1119.4, 2627.1, 5801.2, 11794.8, 24139.9)

par(mfrow = c(1,2))
plot(x, y1); plot(x, y2)

[Figure: scatter plots of y1 against x (left) and y2 against x (right)]

I would like to say something about the trends of y1 and y2, for example that y1 seems to follow a linear trend and y2 looks exponential. I first tried symbolic regression, specifically the 'rgp' package, but its documentation is very poor and it has some problems as well (it doesn't work in archive mode, among others). Unfortunately, there isn't any other symbolic regression package.

What do you suggest I do? How can I state or prove conclusions like "y1 follows a linear trend as a function of x"?

Zheyuan Li
poetyi
  • Which programming problem do you have? Consider other sites like stats.stackexchange.com – llrs Oct 20 '16 at 08:44
  • @Zheyuan Li Yes, but you can compare models according to fitness and complexity, for example in a plot with a Pareto front. – poetyi Oct 20 '16 at 08:58
  • @Llopis rgp uses genetic programming, but maybe there are other useful methods. – poetyi Oct 20 '16 at 08:59
  • In other words, what do you expect from the community of Stack Overflow? Advice on which package to use? Advice on how to fit a linear trend? – llrs Oct 20 '16 at 09:01

1 Answer

1

Use a linear model:

summary(lm(y1 ~ x))
...
     Multiple R-squared:  0.9802,   Adjusted R-squared:  0.9768
     F-statistic: 296.4 on 1 and 6 DF,  p-value: 2.46e-06

y1_r = 0.33044 * x + 32.46230

An adjusted R-squared of 0.9768 and a small p-value indicate that y1 has a linear trend. For y2 you can use nonlinear regression in the same way. As an example, here are a linear and an exponential fit on y1:

r  <- lm(y1 ~ x)                                            # linear fit
nr <- nls(y1 ~ exp(a + b * x), start = list(a = 0, b = 0))  # exponential fit
plot(x, y1)
lines(x, predict(nr))              # exponential fit (black)
lines(x, predict(r), col = 'red')  # linear fit (red)

You can decide which model is better by comparing the residual standard errors (first the linear fit, then the exponential fit; lower is better):

Residual standard error: 3.732 
Residual standard error: 2.515
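
The same comparison can be sketched for y2. This is only an illustration under a few assumptions: the exponential form exp(a + b * x) is one candidate among several, the names fit_lin, fit_exp and start_vals are made up for this example, and the starting values are taken from a linear fit on the log scale rather than a = 0, b = 0, because y2 spans several orders of magnitude.

```r
x  <- c(10, 40, 70, 100, 130, 160, 190, 220)
y2 <- c(1.9, 34.3, 269.2, 1119.4, 2627.1, 5801.2, 11794.8, 24139.9)

# Baseline: straight-line fit of y2 on x.
fit_lin <- lm(y2 ~ x)

# Starting values for the exponential model (an assumption of this sketch):
# the intercept and slope of a linear fit of log(y2) on x.
start_vals <- as.list(setNames(coef(lm(log(y2) ~ x)), c("a", "b")))

# Exponential model fitted on the original scale of y2.
fit_exp <- nls(y2 ~ exp(a + b * x), start = start_vals)

# Residual standard errors on the same response scale; lower is better.
summary(fit_lin)$sigma
summary(fit_exp)$sigma
```

Both residual standard errors are on the original scale of y2, so they can be compared directly. Other candidate forms, such as a power law, can be fitted with nls and compared in the same way.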
ooolllooo