
I have two sets of data points, set1 and set2, each containing two columns of x and y values, like this (shown for one of them; the other has a similar structure and values):

x            y

0.015        0.01
0.025        0.015
..           ..
0.115        0.07

so the x axis advances in steps of 0.01, while y is random. Then I have a third set, set3, which looks like this:

x           y

0.025       0.2
0.075       0.1
...         ...
3.475       0.005

so the increment in x is again constant, in this case 0.05, while y is again random. The x range of set3 is much wider than that of set1 and set2.

My goal is to have three sets that span the same range in x.

To do so, I thought about interpolating the two shorter sets, set1 and set2, whose x ranges are contained in that of set3.

I did it as follows (shown for set1; set2 is analogous):

import scipy.interpolate as itp
spline_set1 = itp.splrep(xvalues_set1, yvalues_set1)
extended_set1 = itp.splev(xvalues_set3, spline_set1)

but a plot of extended_set1 suggests this is not the way to go: the values are many orders of magnitude larger than they should be.

Any ideas?

johnhenry
  • I just tried `splrep` on my own data and honestly I don't quite get how it works or what output it produces. For spline interpolation I always use `scipy.interpolate.CubicSpline`, which usually works pretty well, so you could give it a try. Secondly, I don't understand your desired output. Interpolating set1 and set2 will just fill in the gaps between your existing x-values. But if `set3` has a wider range, interpolating `set1` and `set2` is not going to help for x-values they don't cover themselves. What's your desired result? – offeltoffel Sep 18 '17 at 11:27
  • @offeltoffel thanks, that's indeed a good point. In fact I think what I am looking for is extrapolation rather than interpolation. What I want is values for set1 and set2 that are defined on the same range of x values as set3. I think that requires extrapolation, as the x ranges of set1 and set2 are smaller than, and contained in, the x range of set3 – johnhenry Sep 18 '17 at 12:09

1 Answer


Following your answer to my comment and assuming you are looking for extrapolation rather than interpolation:

Basically, you are creating information that is not there. Any extrapolation relies on what you know (or assume) about how y behaves as a function of x. The y values of set3 are irrelevant here (which is why you did not need them in your own solution).

The basic tools for spline interpolation are scipy.interpolate.UnivariateSpline and scipy.interpolate.CubicSpline. Both are able to extrapolate. In your case, it works like this:

import scipy.interpolate as itp
spline = itp.UnivariateSpline(xvalues_set1, yvalues_set1)
extended_set1 = spline(xvalues_set3)

However, the result may remain questionable. The behaviour of a spline extrapolation can look unreasonable while still being mathematically correct. If you want to understand what's going on, I suggest plotting your result using matplotlib.pyplot.
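To illustrate why the extrapolated values can explode, here is a minimal sketch on made-up data. The arrays x1, y1 and x3 are hypothetical stand-ins for set1 and the x grid of set3, and s=0 is my assumption to force the spline through every point. Both splev and UnivariateSpline default to ext=0, which continues the outermost cubic piece beyond the data; the ext argument lets you pick a tamer behaviour instead.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

# Hypothetical stand-ins for the data in the question (not the real sets)
rng = np.random.default_rng(0)
x1 = np.linspace(0.015, 0.115, 11)           # step 0.01, like set1
y1 = rng.uniform(0.01, 0.07, size=x1.size)   # "random" y values
x3 = np.linspace(0.025, 3.475, 70)           # step 0.05, like set3

# ext=0 (the default): continue the last cubic piece outside the data range
free = UnivariateSpline(x1, y1, k=3, s=0, ext=0)(x3)

# ext=3: hold the boundary value constant outside the data range
clamped = UnivariateSpline(x1, y1, k=3, s=0, ext=3)(x3)

# ext=1: return zero outside the data range
zeroed = UnivariateSpline(x1, y1, k=3, s=0, ext=1)(x3)

outside = x3 > x1[-1]
print("largest |y| in the data:         ", np.abs(y1).max())
print("largest |y| with ext=0 (default):", np.abs(free).max())
print("largest |y| with ext=3 (const):  ", np.abs(clamped).max())
```

Which option is right depends entirely on what you believe about y outside the measured range; none of them recovers information the data does not contain. Inside the range of x1 all three variants give the same values.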

offeltoffel