I have a dataframe like this

import pandas as pd

d = {}
d['z'] = ['Q8', 'Q8', 'Q7', 'Q9', 'Q9']
d['t'] = ['10:30', '10:31', '10:38', '10:40', '10:41']
d['qty'] = [20, 20, 9, 12, 12]
df = pd.DataFrame(d)

I want to compare the first row with the second row, and so on down the frame:

  1. is qty same as next row AND
  2. is t greater in the next row AND
  3. is z value same as next row

The desired output is

   qty                   t   z  valid
0   20 2015-06-05 10:30:00  Q8  False
1   20 2015-06-05 10:31:00  Q8   True
2    9 2015-06-05 10:38:00  Q7  False
3   12 2015-06-05 10:40:00  Q9  False
4   12 2015-06-05 10:41:00  Q9   True
firelynx
NinjaGaiden
  • You've not stated what to do when your conditions are `True`, also post your desired df to avoid ambiguity – EdChum Jun 05 '15 at 18:24
  • Also in your sample df, there are no rows where column 'z' is the same as the next row – EdChum Jun 05 '15 at 18:34
  • updated the original post – NinjaGaiden Jun 05 '15 at 20:31
  • Your rules and your desired output conflict. Row 0 should clearly be True. You have set Row 1 as True, but row 2 has a different z and a different qty, so row 1 should be False. It seems you are not comparing to the next row, but to the previous. – firelynx Jun 06 '15 at 19:03

1 Answer

Looks like you want to use the Series.shift method.

Using this method, you can generate new columns that are offset from the original columns, like this:

df['qty_s'] = df['qty'].shift(-1)
df['t_s'] = df['t'].shift(-1)
df['z_s'] = df['z'].shift(-1)

Now you can compare these:

df['is_something'] = (df['qty'] == df['qty_s']) & (df['t'] < df['t_s']) & (df['z'] == df['z_s'])
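
As a rough sketch, the same comparison can be run end to end on the sample data from the question without keeping the helper columns. Parsing `t` with `pd.to_datetime` (so that `<` compares times rather than strings) and the names `nxt` and `matches_next` are assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Sample data from the question. Parsing 't' as datetimes is an assumption;
# the question only shows the raw 'HH:MM' strings.
df = pd.DataFrame({
    "z": ["Q8", "Q8", "Q7", "Q9", "Q9"],
    "t": pd.to_datetime(["10:30", "10:31", "10:38", "10:40", "10:41"], format="%H:%M"),
    "qty": [20, 20, 9, 12, 12],
})

# Shift the whole frame up by one: row i of `nxt` holds the values of row i + 1.
# The last row has no successor, so it is filled with NaN/NaT.
nxt = df.shift(-1)

# All three conditions must hold; comparisons against NaN/NaT evaluate to False.
matches_next = (
    (df["qty"] == nxt["qty"])
    & (df["t"] < nxt["t"])
    & (df["z"] == nxt["z"])
)
print(matches_next.tolist())
```

With the rules exactly as stated in the question (compare to the next row), rows 0 and 3 come out True and the last row is always False because it has nothing to compare against.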

Here is a simplified example of how Series.shift works, comparing each row to the previous one:

import numpy as np
import pandas as pd

df = pd.DataFrame({"temp_celcius": np.random.choice(10, 10) + 20}, index=pd.date_range("2015-05-15", "2015-05-24"))
df
            temp_celcius
2015-05-15            21
2015-05-16            28
2015-05-17            27
2015-05-18            21
2015-05-19            25
2015-05-20            28
2015-05-21            25
2015-05-22            22
2015-05-23            29
2015-05-24            25

df["temp_c_yesterday"] = df["temp_celcius"].shift(1)
df
            temp_celcius  temp_c_yesterday
2015-05-15            21               NaN
2015-05-16            28                21
2015-05-17            27                28
2015-05-18            21                27
2015-05-19            25                21
2015-05-20            28                25
2015-05-21            25                28
2015-05-22            22                25
2015-05-23            29                22
2015-05-24            25                29

df["warmer_than_yesterday"] = df["temp_celcius"] > df["temp_c_yesterday"]
df
            temp_celcius  temp_c_yesterday warmer_than_yesterday
2015-05-15            21               NaN                 False
2015-05-16            28                21                  True
2015-05-17            27                28                 False
2015-05-18            21                27                 False
2015-05-19            25                21                  True
2015-05-20            28                25                  True
2015-05-21            25                28                 False
2015-05-22            22                25                 False
2015-05-23            29                22                  True
2015-05-24            25                29                 False
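
The desired output in the question marks the second row of each matching pair, which lines up with comparing each row to the previous one rather than the next. A minimal sketch under that assumption, using `shift(1)` on the question's data (the `prev` name and the datetime parsing are illustrative choices, not taken from the original post):

```python
import pandas as pd

# Sample data from the question; 't' is parsed as datetimes (an assumption)
# so that '>' compares times rather than strings.
df = pd.DataFrame({
    "z": ["Q8", "Q8", "Q7", "Q9", "Q9"],
    "t": pd.to_datetime(["10:30", "10:31", "10:38", "10:40", "10:41"], format="%H:%M"),
    "qty": [20, 20, 9, 12, 12],
})

# Shift the whole frame down by one: row i of `prev` holds the values of row i - 1.
# The first row has no predecessor, so it is filled with NaN/NaT.
prev = df.shift(1)

# Same qty as the previous row, later time than the previous row, same z.
df["valid"] = (
    (df["qty"] == prev["qty"])
    & (df["t"] > prev["t"])
    & (df["z"] == prev["z"])
)
print(df)
```

Here the `valid` column is False, True, False, False, True, which agrees with the table in the question.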

If I misunderstood your query, please post a comment and I'll update my answer.

firelynx
  • `shift(1)` compares the previous row, next row would be `shift(-1)` – EdChum Jun 05 '15 at 18:35
  • @EdChum Thnx m8. I guess I was answering a bit fast. – firelynx Jun 05 '15 at 18:37
  • Yep, this seems like it. Column z is a string; how do I compare it to the next row? I wasn't able to get it to work with shift() – NinjaGaiden Jun 05 '15 at 20:31
  • @user3589054 Shift is the function to do this, but you are not comparing along the y axis of the dataframe, you are copying the data into another column and offsetting it one step, so you can compare it per row. The example I added to my answer should explain this better. – firelynx Jun 06 '15 at 18:45