Given the DataFrame below, I want to add a column 'C' that holds the larger of the fixed value 15 and the value in column 'A' for rows where B == 't'. The other rows should keep the value of 'A'.

import pandas as pd

testdf = pd.DataFrame({"A": [20, 16, 7, 3, 8], "B": ['t', 't', 't', 't', 'f']})
testdf

    A   B
0   20  t
1   16  t
2   7   t
3   3   t
4   8   f

I tried this:

testdf.loc[testdf['B']=='t', 'C'] = max(15,(testdf.loc[testdf['B']=='t','A']))

The desired output is:

    A   B   C
0   20  t   20
1   16  t   16
2   7   t   15
3   3   t   15 
4   8   f   8

Could you help me get this output? Thank you!

corezal

3 Answers

Use np.where with clip:

import numpy as np

testdf['C'] = np.where(testdf['B'].eq('t'),
                       testdf['A'].clip(lower=15), testdf['A'])

Or similarly with series.where:

testdf['C'] = (testdf['A'].clip(15)
                   .where(testdf['B'].eq('t'), testdf['A'])
              )

Output:

    A  B   C
0  20  t  20
1  16  t  16
2   7  t  15
3   3  t  15
4   8  f   8
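
For reference, the attempt in the question fails because Python's built-in max compares 15 against the whole Series at once, which raises a ValueError (the truth value of a Series is ambiguous). A minimal sketch of an element-wise variant that keeps the .loc style of the question, assuming pandas is imported as pd (the name mask is only illustrative):

```python
import pandas as pd

testdf = pd.DataFrame({"A": [20, 16, 7, 3, 8], "B": ['t', 't', 't', 't', 'f']})

# Boolean mask for the rows where the condition holds (illustrative name).
mask = testdf['B'] == 't'

# Start from a copy of A, then raise only the matching rows to at least 15.
testdf['C'] = testdf['A']
testdf.loc[mask, 'C'] = testdf.loc[mask, 'A'].clip(lower=15)

print(testdf)
```

Rows where B is not 't' are left with the value copied from A.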
Quang Hoang

You could also use the update method:

testdf['C'] = testdf['A']

    A  B   C
0  20  t  20
1  16  t  16
2   7  t   7
3   3  t   3
4   8  f   8


values = testdf.A[testdf.B.eq('t')].clip(15)

values
Out[16]: 
0    20
1    16
2    15
3    15
Name: A, dtype: int64

testdf.update(values.rename('C'))

    A  B     C
0  20  t  20.0
1  16  t  16.0
2   7  t  15.0
3   3  t  15.0
4   8  f   8.0
sammywemmy

To apply a function to each individual value of a DataFrame column, you can use apply:

df['column'] = df['column'].apply(lambda x: anyFunc(x))

Here x receives the values of the column one at a time and passes each to the function, which can transform it and return the result.
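
For the DataFrame in the question, the result also depends on column B, so applying a function to column A alone is not enough. A rough sketch using DataFrame.apply with axis=1 instead, where the helper name floor_if_t is hypothetical; this is likely to be slower than the vectorized answers on large frames:

```python
import pandas as pd

testdf = pd.DataFrame({"A": [20, 16, 7, 3, 8], "B": ['t', 't', 't', 't', 'f']})

def floor_if_t(row):
    # Hypothetical helper: return at least 15 when B == 't', else A unchanged.
    if row['B'] == 't':
        return max(15, row['A'])
    return row['A']

# axis=1 passes each row to the function as a Series.
testdf['C'] = testdf.apply(floor_if_t, axis=1)

print(testdf)
```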