Let's say I have the following data:

import pandas as pd
df_data = pd.DataFrame(data={'time': [0, 0.1, 0.3, 0.4, 0.7], 'vals': [1, 3, 5, 7, 9]})
df_data
   time  vals
0   0.0     1
1   0.1     3
2   0.3     5
3   0.4     7
4   0.7     9

If we take this to represent samples of some variable over time, the sampling period is clearly not constant, i.e. the sampling is non-uniform. So we cannot simply use .diff() to calculate the "differential" of vals, because it ignores the spacing between samples:

df_data.diff()
   time  vals
0   NaN   NaN
1   0.1   2.0
2   0.2   2.0
3   0.1   2.0
4   0.3   2.0

So instead, I'd apply the definition of the finite difference manually, dividing the change in vals by the change in time:

df_data.diff()["vals"]/df_data.diff()["time"]
0          NaN
1    20.000000
2    10.000000
3    20.000000
4     6.666667
dtype: float64

And indeed, vals grows faster per unit time between 0.0 and 0.1 than between 0.1 and 0.3, and so on.

So I was wondering: is there a built-in "switch" in pandas that lets me tell it "calculate the differential/derivative of this column with respect to that column", or should I just do it manually?

sdbbs

1 Answer

AFAIK, pandas has no such "switch". See also this related SO post. I'd stick with your own solution, written out step by step for readability:

dx = df_data["vals"].diff()
dt = df_data["time"].diff()
x_dot = dx/dt

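It is better to select the column of interest before calling diff(): df_data.diff()["vals"] differences every column of the frame and then discards all but one, which is unnecessary, repeated work.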
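If the same computation is needed in several places, the approach above can be wrapped in a small helper. This is only a sketch: the function name derivative and its arguments are made up for illustration and are not part of pandas. For comparison, it also calls numpy.gradient, which accepts the sample coordinates as a second argument. Note that numpy.gradient is a different estimator (central differences at interior points, one-sided differences at the edges), so its interior values will generally not match the backward differences above.

```python
import numpy as np
import pandas as pd


def derivative(df, y, x):
    """Backward finite difference of column y with respect to column x.

    Hypothetical helper (not a pandas API): it only wraps dx/dt from above.
    The first element is NaN because it has no predecessor.
    """
    return df[y].diff() / df[x].diff()


df_data = pd.DataFrame(
    data={'time': [0, 0.1, 0.3, 0.4, 0.7], 'vals': [1, 3, 5, 7, 9]}
)

# Same result as dx/dt in the answer above.
x_dot = derivative(df_data, "vals", "time")

# Alternative estimator: numpy.gradient with explicit, non-uniform coordinates.
# Returns an array of the same length, with no NaN at the start.
grad = np.gradient(df_data["vals"].to_numpy(), df_data["time"].to_numpy())
```

Which of the two is preferable depends on whether a plain backward difference (aligned with the original rows, NaN in the first row) or a smoother central estimate is wanted.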

normanius