Pandas: determine where 2 time series are intersecting and in which direction

Question

This is a follow-up to this question: determine the coordinates where two pandas time series cross, and how many times the time series cross

I have 2 series in my Pandas dataframe, and would like to know where they intersect.

With this code, we can create a third column that contains True every time the two series intersect:

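df['difference'] = df.A - df.B

# True wherever the sign of the difference changes from the previous row
df['cross'] = np.sign(df.difference.shift(1)) != np.sign(df.difference)
# The first row is compared against NaN and always counts as True, hence the -1
np.sum(df.cross) - 1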

Now, instead of a simple True or False, I would like to know in which direction the intersection took place. For example: from 1 to 2, it intersected upwards, from 2 to 3 downwards, from 3 to 4 no intersections, from 4 to 5 upwards.

   A    B  Cross_direction
0  1  0.5  None
1  2  3.0  Upwards
2  3  1.0  Downwards
3  4  1.0  None
4  5  6.0  Upwards

In pseudo-code, it should be like this:

cross_directions = [none, none, ... * series size]
for item in df['difference']:
    if item > 0 and next_item < 0:
        cross_directions.append("up")
    elif item < 0 and next_item > 0:
        cross_directions.append("down")

The problem is that next_item is unavailable with this syntax (we obtain that in the original syntax using .shift(1)) and that it takes a lot of code.

Should I look into implementing the code above using something that can group the loop by 2 items at a time? Or is there a simpler and more elegant solution like the one from the previous question?

`.shift()` is pretty much always the way to do exactly what you want in Pandas, namely looping over pairs of values. If you take one series and then compare it with itself shifted once, then you've got pairs of values (item and next_item), so you can easily compare. — alkasm, Oct 01 '18 at 03:17
I'm totally missing how I could combine the loop and shift :( — Saturnix, Oct 01 '18 at 03:31
You don't need to loop. You can just compare the two columns directly. — alkasm, Oct 01 '18 at 03:53
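A minimal sketch of what the comments above describe, using the sample data from the question: shifting the difference by one row places each value next to the one before it, so each pair can be compared without a loop. The variable names `prev_diff`, `crossed_up` and `crossed_down` are illustrative, and "up"/"down" follow the labels in the question's expected output.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [0.5, 3, 1, 1, 6]})
diff = df["A"] - df["B"]
prev_diff = diff.shift(1)  # value from the previous row; NaN in the first row

# Comparisons against NaN evaluate to False, so the first row is never flagged.
crossed_up = (prev_diff > 0) & (diff < 0)
crossed_down = (prev_diff < 0) & (diff > 0)

print(crossed_up.tolist())
print(crossed_down.tolist())
```

Each result is a boolean Series aligned with `df`, so it can be assigned to a column or used as a mask directly.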

score 7 · Accepted Answer · edited Mar 06 '20 at 08:51


You can use numpy.select.

The code below should work for you:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0.5, 3, 1, 1, 6]})
df['Diff'] = df.A - df.B
df['Cross'] = np.select([((df.Diff < 0) & (df.Diff.shift() > 0)), ((df.Diff > 0) & (df.Diff.shift() < 0))], ['Up', 'Down'], 'None')

# Output dataframe
   A    B  Diff Cross
0  1  0.5   0.5  None
1  2  3.0  -1.0    Up
2  3  1.0   2.0  Down
3  4  1.0   3.0  None
4  5  6.0  -1.0    Up

answered Oct 01 '18 at 04:10 by nandneo · edited Mar 06 '20 at 08:51 by halfer

Exactly what I was looking for, thank you very much! – Saturnix Oct 01 '18 at 13:37

score 0 · Answer 2 · answered Oct 01 '18 at 03:53

My very lousy and redundant solution.

dataframe['difference'] = dataframe['A'] - dataframe['B']
dataframe['temporary_a'] = np.array(dataframe.difference) > 0
dataframe['temporary_b'] = np.array(dataframe.difference.shift(1)) < 0
cross_directions = []
for index,row in dataframe.iterrows():
    if not row['temporary_a'] and not row['temporary_b']:
        cross_directions.append("up")
    elif row['temporary_a'] and row['temporary_b']:
        cross_directions.append("down")
    else:
        cross_directions.append("not")
dataframe['cross_direction'] = cross_directions

No need to cast as arrays, you can just use `dataframe.difference > 0`. Also no need for `iterrows()`, you can compare the columns directly, e.g. `df.loc[df['temporary_a'] & ~df['temporary_b'], 'cross_directions'] = 'up'` — alkasm, Oct 01 '18 at 04:02
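A rough sketch of the loop-free approach suggested in the comment above, using the sample data from the question. It is one possible reading of that suggestion, not the answerer's code: it assumes strict sign checks on both the current and the previous difference, so the first row (which has no previous value) and exact ties are labeled "not", which differs slightly from the loop's conditions.

```python
import pandas as pd

dataframe = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [0.5, 3, 1, 1, 6]})
difference = dataframe["A"] - dataframe["B"]
previous = difference.shift(1)  # NaN in the first row

# Start with the default label, then overwrite rows where the sign flips.
dataframe["cross_direction"] = "not"
dataframe.loc[(previous > 0) & (difference < 0), "cross_direction"] = "up"
dataframe.loc[(previous < 0) & (difference > 0), "cross_direction"] = "down"

print(dataframe)
```

Boolean masks with `.loc` replace both the temporary columns and `iterrows()`, and the labels keep the same "up", "down" and "not" values as the loop.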
