Create a new column by comparing rows in pandas

Question

My dataframe looks like this:

df = pd.DataFrame({'a': ["10001", "10001", "10002", "10002", "10002"], 'b': ['hello', 'hello', 'hola', 'hello', 'hola']})

I want to create a new boolean column 'c' with the following condition:

If the values of 'a' are the same (i.e. the 1st and 2nd rows, and the 3rd, 4th and 5th rows), check whether the values of 'b' in those rows are the same (the 2nd row returns True; the 4th row returns False).
If the values of 'a' are not the same, skip.

My current code is the following:

def check_consistency(col1,col2):
    df['match'] = df[col1].eq(df[col1].shift())
    t = []
    for i in df['match']:
        if i == True:
            t.append(df[col2].eq(df[col2].shift()))
check_consistency('a','b')

But it returns an error.

Please show your desired output for the example you have provided. — jpp, Jul 17 '18 at 01:03
Please provide a [**Minimal, Complete, and Verifiable example**](https://stackoverflow.com/help/mcve) — U13-Forward, Jul 17 '18 at 01:20

score 0 · Answer 1 · answered Jul 17 '18 at 01:24

I think this is a job for groupby:

df.groupby('a', group_keys=False).b.apply(lambda x: x == x.shift())
Out[431]: 
0    False
1     True
2    False
3    False
4    False
Name: b, dtype: bool
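To store the result as column 'c', as the question asks, one possible variant is to shift 'b' within each group and compare it with the original column, which avoids apply altogether. This is only a sketch on the question's sample data; the name prev_b is a hypothetical helper introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'a': ["10001", "10001", "10002", "10002", "10002"],
    'b': ['hello', 'hello', 'hola', 'hello', 'hola'],
})

# Previous value of 'b' within the same 'a' group
# (missing for the first row of each group). prev_b is a hypothetical name.
prev_b = df.groupby('a')['b'].shift()

# A missing value never compares equal, so the first row of each group is False
df['c'] = df['b'].eq(prev_b)
print(df)
```

Because the grouped shift keeps the original index, the assignment lines up row by row with df.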

BENY

Mankind_008 · Answer 2 · 2018-07-17T06:19:44.740

A bitwise & should do it, since it checks that both conditions are satisfied:

df['c'] = (df.a == df.a.shift()) & (df.b == df.b.shift()) 

df.c
#0    False
#1     True
#2    False
#3    False
#4    False
#Name: c, dtype: bool

Alternatively, if you want to make your current code work, you can do something like this (essentially the same check as above):

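def check_consistency(col1, col2):
    # True where col1 equals the value in the previous row
    df['match'] = df[col1].eq(df[col1].shift())

    # Assumes the default RangeIndex, so label i - 1 is the previous row
    for i in range(len(df)):
        if df.loc[i, 'match']:
            df.loc[i, 'match'] = (df.loc[i, col2] == df.loc[i - 1, col2])

check_consistency('a', 'b')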
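Note that the checks in this answer compare each row only with the row directly above it, so they rely on equal values of 'a' sitting next to each other, as they do in the question's data. The sketch below uses a made-up frame, df2, in which the 'x' rows are not adjacent, to show how the row-by-row check and a per-group comparison can differ. The names df2, adjacent and grouped are hypothetical and appear here only for illustration.

```python
import pandas as pd

# Hypothetical data: the two 'x' rows are separated by a 'y' row
df2 = pd.DataFrame({'a': ['x', 'y', 'x'], 'b': ['hi', 'hi', 'hi']})

# Row-by-row check: compares each row with the one directly above it
adjacent = (df2['a'] == df2['a'].shift()) & (df2['b'] == df2['b'].shift())

# Per-group check: compares each row with the previous row that has the same 'a'
grouped = df2['b'].eq(df2.groupby('a')['b'].shift())

print(adjacent.tolist())  # [False, False, False]
print(grouped.tolist())   # [False, False, True]
```

If the data may be unsorted, sorting by 'a' first or using the per-group form are both options; which one is right depends on what "the same 'a'" is meant to cover.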
