This is a follow-up to this question: determine the coordinates where two pandas time series cross, and how many times the time series cross

I have 2 series in my Pandas dataframe, and would like to know where they intersect.

   A    B
0  1  0.5
1  2  3.0
2  3  1.0
3  4  1.0
4  5  6.0

With this code, we can create a third column that contains True every time the two series intersect:

df['difference'] = df.A - df.B

df['cross'] = np.sign(df.difference.shift(1))!=np.sign(df.difference)
np.sum(df.cross)-1
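
For reference, here is a self-contained version of the snippet above, as a minimal sketch using the sample data from the question. It also shows why 1 is subtracted: the first entry of `shift(1)` is NaN, and NaN compares unequal to everything, so the first entry of the cross column is always True. The names `previous_sign`, `current_sign` and `n_crossings` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0.5, 3, 1, 1, 6]})
df['difference'] = df['A'] - df['B']

# Sign of the previous row's difference vs. sign of the current one
previous_sign = np.sign(df['difference'].shift(1))  # first entry is NaN
current_sign = np.sign(df['difference'])
df['cross'] = previous_sign != current_sign

# The first row is always True (NaN != anything), hence the "- 1"
n_crossings = int(df['cross'].sum()) - 1
print(df)
print(n_crossings)
```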

Now, instead of a simple True or False, I would like to know in which direction the intersection took place. For example: from 1 to 2, it intersected upwards; from 2 to 3, downwards; from 3 to 4, no intersection; from 4 to 5, upwards.

   A    B  Cross_direction
0  1  0.5  None
1  2  3.0  Upwards
2  3  1.0  Downwards
3  4  1.0  None
4  5  6.0  Upwards

In pseudo-code, it should be like this:

cross_directions = [none, none, ... * series size]
for item in df['difference']:
    if item > 0 and next_item < 0:
        cross_directions.append("up")
    elif item < 0 and next_item > 0:
        cross_directions.append("down")

The problem is that next_item is unavailable with this syntax (we obtain that in the original syntax using .shift(1)) and that it takes a lot of code.

Should I look into implementing the code above using something that can group the loop by 2 items at a time? Or is there a simpler and more elegant solution like the one from the previous question?

Saturnix
  • `.shift()` is pretty much always the way to do exactly what you want in Pandas, namely looping over pairs of values. If you take one series and then compare it with itself shifted once, then you've got pairs of values (item and next_item), so you can easily compare. – alkasm Oct 01 '18 at 03:17
  • I'm totally missing how I could combine the loop and shift :( – Saturnix Oct 01 '18 at 03:31
  • You don't need to loop. You can just compare the two columns directly. – alkasm Oct 01 '18 at 03:53
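
To illustrate the comments above: placing a column next to a shifted copy of itself gives each row access to its neighbor, so no loop is needed. This is a minimal sketch; the column names `previous`, `current` and `next` are hypothetical and chosen for illustration. Note that `shift(1)` yields the previous value and `shift(-1)` the next one.

```python
import pandas as pd

# A - B for the sample data in the question
difference = pd.Series([0.5, -1.0, 2.0, 3.0, -1.0], name='difference')

pairs = pd.DataFrame({
    'previous': difference.shift(1),   # value from the row above (NaN in row 0)
    'current': difference,
    'next': difference.shift(-1),      # value from the row below (NaN in last row)
})

# Element-wise comparison of whole columns replaces the explicit loop
positive_to_negative = (pairs['previous'] > 0) & (pairs['current'] < 0)
negative_to_positive = (pairs['previous'] < 0) & (pairs['current'] > 0)
print(pairs)
```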

2 Answers

You can use numpy.select.

The code below should work for you:

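import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0.5, 3, 1, 1, 6]})
df['Diff'] = df.A - df.B
# Compare each difference with the previous one (shift) to detect a sign change
df['Cross'] = np.select([((df.Diff < 0) & (df.Diff.shift() > 0)), ((df.Diff > 0) & (df.Diff.shift() < 0))], ['Up', 'Down'], 'None')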

#Output dataframe
   A    B  Diff Cross
0  1  0.5   0.5  None
1  2  3.0  -1.0    Up
2  3  1.0   2.0  Down
3  4  1.0   3.0  None
4  5  6.0  -1.0    Up
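
As a sketch of how `np.select` is used here: it takes a list of boolean conditions and a matching list of choices, picks the choice for the first condition that is true in each row, and falls back to the default otherwise. The helper name `label_crossings` below is hypothetical, and the labels follow the question's convention (a positive-to-negative change in `A - B` is called 'Up').

```python
import numpy as np
import pandas as pd

def label_crossings(a, b):
    """Label each row by the sign change of a - b relative to the previous row."""
    diff = a - b
    previous = diff.shift()
    conditions = [
        (diff < 0) & (previous > 0),   # difference went from positive to negative
        (diff > 0) & (previous < 0),   # difference went from negative to positive
    ]
    choices = ['Up', 'Down']
    # Rows matching neither condition (including row 0, where previous is NaN)
    # get the default label
    labels = np.select(conditions, choices, default='None')
    return pd.Series(labels, index=a.index)

df = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0.5, 3, 1, 1, 6]})
df['Cross'] = label_crossings(df['A'], df['B'])
print(df)
```

Rows where the difference is exactly zero match neither condition in this sketch, so a series that passes through zero over two steps would not be flagged. Whether that matters depends on the data.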
halfer
nandneo

My very lousy and redundant solution.

dataframe['difference'] = dataframe['A'] - dataframe['B']
dataframe['temporary_a'] = np.array(dataframe.difference) > 0
dataframe['temporary_b'] = np.array(dataframe.difference.shift(1)) < 0
cross_directions = []
for index,row in dataframe.iterrows():
    if not row['temporary_a'] and not row['temporary_b']:
        cross_directions.append("up")
    elif row['temporary_a'] and row['temporary_b']:
        cross_directions.append("down")
    else:
        cross_directions.append("not")
dataframe['cross_direction'] = cross_directions
Saturnix
  • No need to cast as arrays, you can just use `dataframe.difference > 0`. Also no need for `iterrows()`, you can compare the columns directly, e.g. `df.loc[~df['temporary_a'] & ~df['temporary_b'], 'cross_direction'] = 'up'` – alkasm Oct 01 '18 at 04:02
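
Following the comment above, one possible loop-free version of this answer assigns labels with boolean masks and `.loc`. This is a sketch on the sample data from the question rather than a drop-in replacement. It keeps the answer's labels ('up', 'down', 'not') but requires the previous difference to be strictly positive for 'up', so it can differ from the loop in the first row and where a difference is zero.

```python
import pandas as pd

dataframe = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [0.5, 3, 1, 1, 6]})
difference = dataframe['A'] - dataframe['B']
previous = difference.shift(1)  # NaN in the first row, so no mask matches there

# Start with the "no crossing" label, then overwrite rows matching a mask
dataframe['cross_direction'] = 'not'
dataframe.loc[(difference < 0) & (previous > 0), 'cross_direction'] = 'up'
dataframe.loc[(difference > 0) & (previous < 0), 'cross_direction'] = 'down'
print(dataframe)
```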