Pandas - add column with row value applying conditions relative to current row

Question

I’m trying to add a new column to my dataframe that contains the time value of the first instance where the tick is equal to the current tick plus 1.

df2 is something like this:

             Time     Tick    Desired col
Count                     
0      1594994400  3212.25    1594994405
1      1594994401  3212.00    1594994404
2      1594994402  3212.25    1594994405
3      1594994402  3212.50       NaN
4      1594994403  3212.75       NaN
5      1594994404  3212.75       NaN
6      1594994404  3213.00       NaN
7      1594994405  3213.25       NaN
8      1594994405  3213.25       NaN
9      1594994405  3213.25       NaN

I'm hoping to do something like:

df2['Desired col'] = df2['Tick'].loc[(df2['Tick'(other rows)]==df2['Tick'current row] +1)&(df2['Time'(other rows)]>=df2['Time'](current row)].idxmax()

Hope that makes sense. I'm new to pandas and Python, and this is my first posted question. Many thanks to the Stack Overflow community for all the excellent reference material available!

Nicolò Gasparini · Answer 1 · 2020-07-20T07:11:47.147

If you want a one-liner, this should do it:

df['Desired'] = df.apply(lambda x: df[df['Tick'] == x['Tick']+1].reset_index().iloc[0]['Time'], axis=1)

The problem is that it will raise an IndexError whenever the 'Tick' column has no row equal to tick + 1, because .iloc[0] is then called on an empty frame. What I suggest is this:

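def desired_generator(row, df):
    # Rows whose Tick equals the current row's Tick + 1
    matches = df[df['Tick'] == row['Tick'] + 1]
    try:
        # Time of the first matching row
        return matches.iloc[0]['Time']
    except IndexError:
        # No row has Tick + 1
        return None

df['Desired'] = df.apply(lambda x: desired_generator(x, df), axis=1)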
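
If speed matters, the row-wise apply can probably be replaced by a lookup table. The following is only a sketch under a few assumptions: the column names 'Time' and 'Tick' and the sample values are taken from the question, the frame is already ordered by time, and tick values compare exactly as floats (which holds for quarter-point ticks like these). Like the apply version above, it finds the first occurrence of Tick + 1 anywhere in the frame and does not enforce the question's Time >= current Time condition.

```python
import pandas as pd

# Sample data copied from the question (the Count index is just 0..9).
df = pd.DataFrame({
    "Time": [1594994400, 1594994401, 1594994402, 1594994402, 1594994403,
             1594994404, 1594994404, 1594994405, 1594994405, 1594994405],
    "Tick": [3212.25, 3212.00, 3212.25, 3212.50, 3212.75,
             3212.75, 3213.00, 3213.25, 3213.25, 3213.25],
})

# Lookup table: for each distinct Tick value, the Time of its first row.
first_time = df.drop_duplicates("Tick").set_index("Tick")["Time"]

# Look up Tick + 1 in that table; ticks without a match become NaN.
df["Desired"] = (df["Tick"] + 1).map(first_time)

print(df)
```

This builds the table once instead of filtering the whole frame for every row, so it should scale much better on large frames. If the Time condition does matter for your data, pandas.merge_asof with direction='forward' may be worth a look instead.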

edited Jul 20 '20 at 07:11

answered Jul 19 '20 at 21:05


Thanks for this! A lot of insight was gained by your solutions. – user13959578 Jul 20 '20 at 04:34
Any idea how to make this faster? – user13959578 Jul 20 '20 at 05:14
