
I want to create a new df that, given a starting value x0 and an end value x1, interpolates/extrapolates the data at a given number n of points.

For example, given the df below, I want to create a new df between x0=57000 and x1=62000 in steps of 250, or n=21 points:

    import pandas as pd

    x = [57136,57688,58046,58480,58730,59210,59775,60275,60900,61365,62030]
    y = [3.87, 3.55, 3.75, 2.04, 2.66, 3.1, 3.38, 4.13, 3.7, 4, 5.78]

    df = pd.DataFrame(data=[x,y]).transpose()
    df.columns=['x','y']

Given df1 (the df above), I want to create a new df2 such that the output will be:

    >>> print(df2)
            x         y
    0       57000     2.78745
    1       57250     2.74425
    2       57500     2.70106
    3       57750     2.72185
    4       58000     2.93666
    5       58250     2.34479
    6       58500     1.67233
    7       58750     2.13959
    8       59000     2.31422
    9       59250     2.47805
    10      59500     2.58523
    11      59750     2.69242
    12      60000     2.97746
    13      60250     3.28227
    14      60500     3.18627
    15      60750     3.04574
    16      61000     3.04658
    17      61250     3.25947
    18      61500     3.62019
    19      61750     4.10685
    20      62000     4.59351
CBH
  • Welcome to StackOverflow. Please read and follow the posting guidelines in the help documentation. [on topic](http://stackoverflow.com/help/on-topic) and [how to ask](http://stackoverflow.com/help/how-to-ask) apply here. StackOverflow is not a design, coding, research, or tutorial service. – Prune Jul 05 '17 at 17:34
  • Where is your own coding attempt? What algorithm are you using to get those desired values? The values in your list are all in the range [2, 6], but you have desired values greater than 10. – Prune Jul 05 '17 at 17:35
  • Prune - I posted the wrong df. I've fixed it so that df2 corresponds to df1. df2 was generated with a built-in algorithm from OriginLab. I want to achieve the same kind of interpolation/extrapolation for given steps in Python. – CBH Jul 05 '17 at 17:42

1 Answer


For interpolation in Python you could use scipy.interpolate.InterpolatedUnivariateSpline:

    import numpy as np
    from scipy.interpolate import InterpolatedUnivariateSpline

    x = [57136,57688,58046,58480,58730,59210,59775,60275,60900,61365,62030]
    y = [3.87, 3.55, 3.75, 2.04, 2.66, 3.1, 3.38, 4.13, 3.7, 4, 5.78]

    interpolation_function = InterpolatedUnivariateSpline(x,y)
    new_x = np.arange(57000,62001,250)
    new_y = interpolation_function(new_x)

The output will be numpy arrays, which can then be put into a pandas dataframe.
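As a minimal sketch of that last step, the arrays can be passed to the `pd.DataFrame` constructor as columns. The names `spline` and `df2` are only illustrative, and the values produced by the default cubic spline are not guaranteed to match the OriginLab output in the question.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.array([57136, 57688, 58046, 58480, 58730, 59210,
              59775, 60275, 60900, 61365, 62030])
y = np.array([3.87, 3.55, 3.75, 2.04, 2.66, 3.1,
              3.38, 4.13, 3.7, 4, 5.78])

# Cubic (k=3) interpolating spline through all data points
spline = InterpolatedUnivariateSpline(x, y)

# 57000, 57250, ..., 62000: 21 points in steps of 250
new_x = np.arange(57000, 62001, 250)

# Collect the evaluation grid and the spline values in a new DataFrame
df2 = pd.DataFrame({"x": new_x, "y": spline(new_x)})
print(df2)
```

If the number of points n is the given quantity rather than the step, `np.linspace(57000, 62000, 21)` produces the same grid.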

This will almost certainly not reproduce the values you show in your question: the original y-values all lie in the range [2, 6], so one would expect the interpolated values to fall in that range as well, as pointed out by @Prune.

By default, InterpolatedUnivariateSpline allows extrapolation (see the ext parameter). If you want linear interpolation instead of cubic interpolation (k=3, the default), you can pass k=1 as an argument.
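A short sketch of both parameters, using the same data. The variable names are illustrative; `ext="const"` is one of the documented alternatives to the default `ext=0` (extrapolate) and returns the boundary value outside the data range.

```python
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.array([57136, 57688, 58046, 58480, 58730, 59210,
              59775, 60275, 60900, 61365, 62030])
y = np.array([3.87, 3.55, 3.75, 2.04, 2.66, 3.1,
              3.38, 4.13, 3.7, 4, 5.78])
new_x = np.arange(57000, 62001, 250)

# k=1: piecewise-linear spline; the default ext=0 extrapolates
# beyond the first and last data points
linear = InterpolatedUnivariateSpline(x, y, k=1)

# ext="const": outside the data range, return the boundary value
clamped = InterpolatedUnivariateSpline(x, y, k=1, ext="const")

y_linear = linear(new_x)
y_clamped = clamped(new_x)

# Only x=57000 lies outside the data range [57136, 62030],
# so the two results differ only in their first entry
print(y_linear[0], y_clamped[0])
```

Inside the data range, the k=1 result agrees with `np.interp(new_x, x, y)`, which can serve as a quick sanity check.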

Pandas also has its own interpolation method, interpolate, which you could use if your starting point is a DataFrame.
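One possible way to do this with pandas alone is sketched below, assuming linear interpolation is acceptable: index the y-values by x, add the target grid to the index, and interpolate with `method="index"`. The names `s`, `filled`, and `df2` are illustrative. Note that with default settings this does not extrapolate, so the grid point x=57000, which lies before the first data point, stays NaN.

```python
import numpy as np
import pandas as pd

x = [57136, 57688, 58046, 58480, 58730, 59210,
     59775, 60275, 60900, 61365, 62030]
y = [3.87, 3.55, 3.75, 2.04, 2.66, 3.1,
     3.38, 4.13, 3.7, 4, 5.78]
new_x = np.arange(57000, 62001, 250)

# y-values indexed by x, so that interpolation can use x as the coordinate
s = pd.Series(y, index=x, name="y")

# Add the target grid to the index (new positions are NaN), then fill them
# linearly with respect to the index values
filled = s.reindex(s.index.union(new_x)).interpolate(method="index")

# Keep only the target grid and turn it back into a two-column DataFrame
df2 = filled.reindex(new_x).rename_axis("x").reset_index()
print(df2)
```

For values outside the data range, or for a cubic fit, the SciPy spline is probably the simpler route.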

M.T
  • Your answer really helps me out. Thank you for introducing me to InterpolatedUnivariateSpline. – CBH Jul 05 '17 at 17:57