Extrapolation in R

Question

I have a reference data set, p,

p=structure(list(v = 0:26, t = c(Inf, 1.016, 0.568, 0.418666666666667, 
0.344, 0.2992, 0.269333333333333, 0.248, 0.232, 0.219555555555556, 
0.2096, 0.201454545454545, 0.194666666666667, 0.188923076923077, 
0.184, 0.179733333333333, 0.176, 0.172705882352941, 0.169777777777778, 
0.167157894736842, 0.1648, 0.162666666666667, 0.160727272727273, 
0.15895652173913, 0.157333333333333, 0.15584, 0.154461538461538
)), .Names = c("v", "t"), row.names = c(NA, -27L), class = "data.frame")

Which p$v and p$t have the following relation :

t=(0.16*(0.75*v + 5.6))/v

my second data set is measured data ,w, contains the same variable like :

w=structure(list(v = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26), t = c(0.235291176470588, 
0.354020375722543, 0.310974343434343, 0.25272725498699, 0.20351968240702, 
0.163155804025208, 0.132330740162655, 0.108593108108108, 0.0859813015873016, 
0.0655131899302683, 0.0492580103144236, 0.0368029846567365, 0.030538003253355, 
0.0300744415648525, 0.0347586421891237, 0.0451097744360902, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("v", "t"
), row.names = c(NA, -27L), class = "data.frame")

My reference data p, follows a power law, I would like to do a prediction based on power law fit on the data to replace the NA on my measured data. How could I do this in R ?

Maurits Evers · Accepted Answer · 2018-03-14T12:17:07.483

1

How about something like this?

# Non-linear power law fit
fit.nls <- nls(
    t ~ a * v^b,
    data = w[-1, ],
    start = list(a = 0.4, b = -0.7),
    na.action = na.exclude);
# Linear fit with log-log transformation
fit.lm <- lm(
    log(t) ~ log(v),
    data = w[-1, ])

# Plot
w %>%
    rename(t.src = t) %>%
    mutate(
        t.pred.nls = predict(fit.nls, data.frame(v = v)),
        t.NA.pred.nls = ifelse(
            is.na(t.src),
            predict(fit.nls, data.frame(v = v)),
            t.src),
        t.pred.lm = exp(predict(fit.lm, data.frame(v = v)))) %>%
gather(source, t, 3:5) %>%
ggplot(aes(v, t, colour = source)) +
    geom_line() +
    geom_point(aes(v, t.src), colour = "black") +
    scale_colour_manual(values = c(
        "t.NA.pred.nls" = "black",
        "t.pred.nls" = "blue",
        "t.pred.lm" = "red"));

The black dots show the actual measurements. The blue curve is the power law nonlinear model fit, the red curve the linear fit after log-log transformation; the black curve corresponds to your original data, where v = NA values have been replaced with the nonlinear model fit estimates (so for v > 15 the black and blue curves overlap).

edited Mar 14 '18 at 12:17

answered Mar 14 '18 at 09:46

Maurits Evers

49,617
4
47
68

Thank you for your reply, Is there any way that I could do the prediction to replace `NA` in `w` with a condition on top that the slope or trend should be similar to what I have in my reference data, p ? – Haribo Mar 14 '18 at 10:05
*"with a condition on top that the slope or trend should be similar to what I have in my reference data, p"* Not sure what you mean. How do you define "similar"? I'm fitting a LOESS curve, which tries to fit a smooth curve between two variables. The whole point is that you don't make any assumption regarding a global trend (hence the "local" in local polynomial regression). I'm still unsure as to what you're trying to do. Can you not use the explicit functional form `t = f(t)` to replace `NA` values in `w$t`? Perhaps it is a nonlinear regression model that you are after? – Maurits Evers Mar 14 '18 at 10:13
If you look at `p` data set, it looks like a power law. I would like to replace `NA` in `w` in a way that I get a smooth power law at the end. By using LOESS it returns a parabola and the other one `f <- function(v) (0.16 * (0.75 * v + 5.6)) / v; w$t[is.na(w$t)] <- f(w$v[is.na(w$t)]);` has a discontinuity – Haribo Mar 14 '18 at 10:22
Yes, that's why I asked you to clarify. So you want to fit a power law-based nonlinear function to your data, and use the model to replace `NA` values. Please edit your post to include those details. – Maurits Evers Mar 14 '18 at 10:24
@user9112767 I've revised my solution, please take a look. – Maurits Evers Mar 14 '18 at 10:56
I'm getting error : `Error in a * v^b : non-numeric argument to binary operator In addition: Warning message: In nls(t ~ a * v^b, data = w[-1, ], na.action = na.exclude) : No starting values specified for some parameters. Initializing ‘b’ to '1.'. Consider specifying 'start' or using a selfStart model` – Haribo Mar 14 '18 at 11:52
@user9112767 Yes, unfortunately `nls` can be somewhat difficult. The problem lies in the missing starting values. I've edited my solution to include a set of starting parameters for `a` and `b`, for the sample data you provide. You will have to tune these manually for your data; usually this means by trial-and-error. Welcome to the world of nonlinear model fitting. – Maurits Evers Mar 14 '18 at 12:00
@user9112767 Perhaps an alternative would be to log-transform your data; then you have `log y = log a + b log x` and you could use a simple linear model of the form `lm(log(t) ~ log(v), data = w[-1, ])`. – Maurits Evers Mar 14 '18 at 12:03
@user9112767 I've added an example using a linear model after a log-log transformation of your data. Please take a look. – Maurits Evers Mar 14 '18 at 12:17
Thank you for very helpful solutions – Haribo Mar 14 '18 at 12:27
@user9112767 The model in `mod` doesn't have any parameters (you don't have `a` or `b`), so you're not actually fitting a (nonlinear) model; hence the error. If you want to plot a function, have a look at `?stat_function`. – Maurits Evers Mar 15 '18 at 08:12

Extrapolation in R

1 Answers1

Linked

Related