I have a dataset of real data, for example looking like this:

# Dataset 1 with known data
known <- data.frame(
    x = c(0:6),
    y = c(0, 10, 20, 23, 41, 39, 61)
)

plot (known$x, known$y, type="o")

Now I want to get an answer to the question: "What would the Y value for 0.3 be if all intermediate data points of the original dataset lie on a straight line between the surrounding measured values?"

 # X values of points to interpolate from known data
 aim <- c(0.3, 0.7, 2.3, 3.3, 4.3, 5.6, 5.9)

If you look at the graph, I want to get the Y values where the vertical lines intersect the linear interpolation of the known data:

abline(v = aim, col = "#ff0000")

So, in the ideal case I would create a "linearInterpolationModel" with my known data, e.g.

model <- linearInterpol(known)

... which I can then ask for the Y values, e.g.

model$getEstimation(0.3)

(which should in this case give "3")

abline(h = 3, col = "#00ff00")

How can I do this? Manually, I would do something like this for each value:

  1. Find the closest smaller X-value Xsmall and the closest larger X-value Xlarge around the current X-value X.
  2. Calculate the relative position to the smaller X-value: relPos = (X - Xsmall) / (Xlarge - Xsmall)
  3. Calculate the expected Y-value: Yexp = Ysmall + (relPos * (Ylarge - Ysmall))
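The three manual steps above can be sketched directly in base R. This is only an illustration: `interpolate_manually` is a hypothetical helper name, and the sketch assumes the known x values are sorted in ascending order.

```r
known <- data.frame(
    x = 0:6,
    y = c(0, 10, 20, 23, 41, 39, 61)
)

# Hypothetical helper (not part of base R); assumes x is sorted ascending.
interpolate_manually <- function(x_new, x, y) {
    # Step 1: index of the closest known x at or below each x_new.
    # all.inside = TRUE keeps the index within 1..(length(x) - 1), so points
    # outside the known range are extrapolated along the first/last segment.
    i <- findInterval(x_new, x, all.inside = TRUE)
    # Step 2: relative position inside the bracketing interval
    rel_pos <- (x_new - x[i]) / (x[i + 1] - x[i])
    # Step 3: expected y on the straight line between the two neighbours
    y[i] + rel_pos * (y[i + 1] - y[i])
}

interpolate_manually(0.3, known$x, known$y)  # 3
```

Because `findInterval` is vectorised, the same call also accepts the whole `aim` vector at once.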

I have heard that Matlab, at least, has a built-in function for such problems.

Thanks for your help,

Sven

Frank Schmitt
R_User

2 Answers

19

You could be looking at approx() and approxfun() ... or I suppose you could fit with lm for linear or lowess for non-parametric fits.
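As a minimal sketch using the data from the question, `approx()` returns the interpolated values for the requested points, while `approxfun()` returns a function that can be queried later, which is close to the "model" object the question asks for. The variable names `est` and `model` are just illustrative.

```r
known <- data.frame(
    x = 0:6,
    y = c(0, 10, 20, 23, 41, 39, 61)
)
aim <- c(0.3, 0.7, 2.3, 3.3, 4.3, 5.6, 5.9)

# approx() returns a list with components x and y;
# xout gives the x positions at which to interpolate.
est <- approx(known$x, known$y, xout = aim)
est$y

# approxfun() returns a function that performs the same interpolation.
model <- approxfun(known$x, known$y)
model(0.3)  # 3

# With the default rule = 1, points outside the known x range give NA.
model(7)
```

Calling `points(aim, est$y, col = "red")` after the original `plot()` would mark the interpolated values on the graph.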

IRTFM
  • 2
    Thanks. `approx()` does exactly what I was looking for. Now I use `plot(approx(known$x, known$y, xout=aim))`. Do you know of a 2D version of approx? I want to interpolate data points in a matrix... – R_User Oct 27 '11 at 08:56
  • 1
    The akima package has an `interp` function that I have used for fitting 3D data on an irregular grid. It produces a "linear" interpolation by default but also allows a spline fit to be specified. If your "matrix" of points is regular you can view with `wireframe` or `persp` but they do not accept irregular data. – IRTFM Oct 27 '11 at 13:07
  • But `wireframe` and `persp` seem to be only useful for plotting. I have logarithmically spaced datapoints. I want to give 3D data to the function I'm searching, and additionally the x and y coordinate of a datapoint. The function should return the interpolated z value for the x and y coordinate. This seems to be impossible with the three named functions. So, do I have to do this by hand? – R_User Oct 27 '11 at 13:42
  • Why can't `interp` do it? If you wanted a linear interpolation on a logarithmic scale, then you could also transform, interpolate, and back transform. If you have a specific dataset in mind, then you would (as always) get a better answer if you posted data and complete description of the problem. This seems to be different than the original question and you might consider posting a new question with data and description. – IRTFM Oct 27 '11 at 14:20
  • 'loess' can also do 2 and 3-d fitting and interpolations. – IRTFM Apr 20 '17 at 15:47
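The "transform, interpolate, and back transform" idea from the comments above can be sketched in one dimension with `approx()`. The data here are made up for illustration, `interp_loglog` is a hypothetical helper name, and transforming both axes with `log` is an assumption; whether y should be transformed as well depends on the data.

```r
# Made-up, logarithmically spaced sample data (illustration only)
x <- c(1, 10, 100, 1000)
y <- c(2, 4, 8, 16)

# Hypothetical helper: interpolate linearly on the log-log scale,
# then back-transform the result with exp().
interp_loglog <- function(x_new) {
    exp(approx(log(x), log(y), xout = log(x_new))$y)
}

interp_loglog(c(10, sqrt(1000)))
```

At a known point such as x = 10 the helper reproduces the known value; between points it follows a straight line on the log-log scale rather than on the original scale.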
11

To follow up on DWin's answer, here's how you'd get the predicted values using a linear model.

model.lm <- lm(y ~ x, data = known)

# Use predict to estimate the values for aim.
# Note that predict expects a data.frame and the col 
# names need to match
newY <- predict(model.lm, newdata = data.frame(x = aim))

#Add the predicted points to the original plot
points(aim, newY, col = "red")

And of course you can retrieve those predicted values directly:

> cbind(aim, newY)
  aim       newY
1 0.3  2.4500000
2 0.7  6.1928571
3 2.3 21.1642857
....
R_User
Chase
  • +1 for predict(), highly useful and not immediately apparent when looking at ?lm – Brandon Bertelsen Jul 18 '11 at 12:27
  • +1 for giving the example for predict(). This solution does something slightly different from what I was looking for. It fits a linear regression to ALL data points first and calculates the Y-values for the given x-values on the regression line. I was searching for a linear interpolation between only the two closest points. That can be done by approx(). – R_User Oct 27 '11 at 09:16