Using lambda if condition on different columns in Pandas dataframe

Question

I have a simple dataframe:

import numpy as np
import pandas as pd
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('abc'))

For example:

          a         b         c
0 -0.813530 -1.291862  1.330320
1 -1.066475  0.624504  1.690770
2  1.330330 -0.675750 -1.123389
3  0.400109 -1.224936 -1.704173

I then want to create a column “d” that contains the value from “c” if c is positive, and the value from “b” otherwise.

I am trying:

frame['d']=frame.apply(lambda x: frame['c'] if frame['c']>0 else frame['b'],axis=0)

But I get: ValueError: ('The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().', 'occurred at index a')

I tried googling how to solve this but did not succeed. Any tips, please?

`lambda x: ...` takes an argument `x` that is never used in the logic. – Tadhg McDonald-Jensen, May 25 '16 at 16:40
`frame['c']>0` produces a boolean Series marking which values in column c are greater than 0, and the `if` then tries to use the truth value of that whole Series. `x['c']>0` instead compares the value in the specific cell to 0 and returns a single boolean. – Tadhg McDonald-Jensen, May 25 '16 at 16:42
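
To make the point in these comments concrete, here is a minimal sketch of the difference between testing a whole column and testing one cell. The two-row frame and the names `mask`, `raised` and `scalar_ok` are made up for illustration; they are not from the question.

```python
import pandas as pd

# Small illustrative frame (values are invented, not the question's data)
frame = pd.DataFrame({"b": [-1.0, 0.5], "c": [1.5, -2.0]})

# Comparing a whole column gives a boolean Series, one value per row
mask = frame["c"] > 0

# `if frame['c'] > 0` implicitly asks for the truth value of that Series
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# A single row, as apply(..., axis=1) would pass it to the lambda
row = frame.iloc[0]
# Comparing one cell gives one boolean, which `if` can use
scalar_ok = bool(row["c"] > 0)
```

Here `raised` ends up `True` because the Series has more than one element, while the per-cell comparison works without complaint.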

score 52 · Accepted Answer · answered May 25 '16 at 16:40

Is this what you want?

In [300]: frame[['b','c']].apply(lambda x: x['c'] if x['c']>0 else x['b'], axis=1)
Out[300]:
0   -1.099891
1    0.582815
2    0.901591
3    0.900856
dtype: float64
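
A self-contained sketch of the same row-wise approach, this time assigning the result to column `d` as the question asks. The fixed values below are assumptions chosen so the result is easy to check by eye; the question uses random data.

```python
import pandas as pd

# Fixed illustrative values instead of np.random.randn
frame = pd.DataFrame({
    "a": [0.1, 0.2, 0.3, 0.4],
    "b": [-1.0, 0.5, -0.7, -1.2],
    "c": [1.3, 1.7, -1.1, -1.7],
})

# axis=1 passes each row to the lambda, so row["c"] is a single number
frame["d"] = frame.apply(
    lambda row: row["c"] if row["c"] > 0 else row["b"],
    axis=1,
)
```

With these values, `d` takes `c` in the first two rows and `b` in the last two.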

MaxU - stand with Ukraine

The `axis=1` at the end is important; without it you get a KeyError. – jkr Jan 11 '23 at 04:08

score 8 · Answer 2 · piRSquared · 2016-05-25T17:14:12.107

Solution

Use a vectorized approach:

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)

Explanation

This is derived from the sum of

(frame.c > 0) * frame.c  # frame.c if positive

plus

(frame.c <= 0) * frame.b  # frame.b if c is not positive

However, as a 0/1 multiplier,

(frame.c <= 0)

is equivalent to

(1 - (frame.c > 0))

and when combined you get

frame['d'] = frame.b + (frame.c > 0) * (frame.c - frame.b)
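
As a quick sanity check of the derivation, the sketch below compares the arithmetic form with `np.where`, which selects between the two columns directly. `np.where` is an alternative I am bringing in for comparison, not part of the answer above, and the sample values are invented.

```python
import numpy as np
import pandas as pd

# Illustrative values, including c == 0 to exercise the "not positive" branch
frame = pd.DataFrame({"b": [-1.0, 0.5, -0.75], "c": [1.5, -2.0, 0.0]})

mask = frame["c"] > 0  # boolean Series; True acts as 1 and False as 0

# Arithmetic form from the answer: b + mask * (c - b)
arith = frame["b"] + mask * (frame["c"] - frame["b"])

# Direct selection: c where the mask is True, otherwise b
picked = np.where(mask, frame["c"], frame["b"])
```

The two agree on this data. One difference to be aware of: if `c` contains NaN, the arithmetic form yields NaN for that row (0 * NaN is NaN), whereas `np.where` falls back to `b`.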

edited May 25 '16 at 17:14

answered May 25 '16 at 17:08


score 3 · Answer 3 · edited Apr 02 '22 at 15:22

I ran into a similar problem, and this is how I derive a new column based on conditions on other columns:

df["col3"] = df[["col1", "col2"]].apply(
    lambda x: "return this if first statement is true"
    if (x.col1 == "value1" and x.col2 == "value2")
    else "return this if the statement right below this line is true"
    if (x.col1 == "value1" and x.col2 != "value2")
    else "return this if the below is true"
    if (x.col1 != "value1" and x.col2 == "value2")
    else "return this because none of the above statements were true",
    axis=1
)
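
For a chain of conditions like this, one possible vectorized alternative is `np.select`, which takes a list of boolean conditions and a matching list of results. This is a different technique from the answer above, shown only as a sketch; the frame contents and the short result labels are placeholders I made up.

```python
import numpy as np
import pandas as pd

# Placeholder data covering all four combinations
df = pd.DataFrame({
    "col1": ["value1", "value1", "other", "other"],
    "col2": ["value2", "other", "value2", "other"],
})

is1 = df["col1"] == "value1"
is2 = df["col2"] == "value2"

# Conditions are checked in order; the first one that is True wins
conditions = [is1 & is2, is1 & ~is2, ~is1 & is2]
choices = ["both match", "only col1 matches", "only col2 matches"]

# default is used where no condition is True
df["col3"] = np.select(conditions, choices, default="neither matches")
```

Note the use of `&` and `~` instead of `and` and `not`: the conditions here are whole columns, not single cells.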
