0

For the following table:

enter image description here

Using pandas, I would like to achieve the desired_output column, which is TRUE when the value below the current cell is different and FALSE otherwise.

I have tried the following code, but an error occurs.

df['desired_output']=df.two.apply(lambda x: True if df.iloc[int(x),1]==df.iloc[int(x+1),1] else False)

2 Answers

3

Use Series.ne to compare the column with its values shifted by Series.shift; the first missing value created by the shift is replaced with the column's original first value:

import pandas as pd

df = pd.DataFrame({'city': list('mmmssb')})

df['out'] = df['city'].ne(df['city'].shift(fill_value=df['city'].iat[0]))
print(df)
  city    out
0    m  False
1    m  False
2    m  False
3    s   True
4    s  False
5    b   True
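
To see why the first value needs to be filled in, here is a small sketch on the same sample column. The names naive and filled are only illustrative. Without fill_value, the shifted column starts with NaN, so the first row compares as "different":

```python
import pandas as pd

df = pd.DataFrame({'city': list('mmmssb')})

# Plain shift leaves NaN in the first row, and 'm' != NaN evaluates to True.
naive = df['city'].ne(df['city'].shift())

# Filling the gap with the first value makes the first row compare equal to itself.
filled = df['city'].ne(df['city'].shift(fill_value=df['city'].iat[0]))

print(naive.tolist())   # [True, False, False, True, False, True]
print(filled.tolist())  # [False, False, False, True, False, True]
```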

For older pandas versions, where shift has no fill_value parameter, and if the column city itself contains no missing values, replace the first missing value with Series.fillna:

df['out'] = df['city'].ne(df['city'].shift().fillna(df['city']))
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
3
df['desired_output'] = df['city'].shift().bfill() != df['city']
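
A quick check of this one-liner, assuming the same sample city column used in the other answer (the sample data is an assumption, not the asker's actual table):

```python
import pandas as pd

df = pd.DataFrame({'city': list('mmmssb')})

# shift() moves every value down one row; bfill() fills the NaN left in the
# first row with the next value, so the first row is compared with itself.
df['desired_output'] = df['city'].shift().bfill() != df['city']

print(df['desired_output'].tolist())  # [False, False, False, True, False, True]
```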
Andreas
  • 8,694
  • 3
  • 14
  • 38