85
import pandas as pd

data = {'name': ['bill', 'joe', 'steve'],
        'test1': [85, 75, 85],
        'test2': [35, 45, 83],
        'test3': [51, 61, 45]}
frame = pd.DataFrame(data)

I would like to add a new column that shows the max value for each row.

desired output:

 name   test1 test2 test3 HighScore
 bill   85    35    51    85
 joe    75    45    61    75
 steve  85    83    45    85

Sometimes

frame['HighScore'] = max(data['test1'], data['test2'], data['test3'])

works but most of the time gives this error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Why does it only work sometimes? Is there another way of doing it?
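
A likely explanation, sketched under the assumption that the failing calls were made on pandas Series or NumPy arrays rather than on plain lists: Python's built-in `max` compares its arguments as whole objects. Given lists, it returns the lexicographically largest list, which matches the row-wise maximum here only because `test1` happens to hold the highest score in every row. Given Series or arrays, the comparison produces an element-wise boolean result with no single truth value, hence the `ValueError`.

```python
import numpy as np
import pandas as pd

data = {'name': ['bill', 'joe', 'steve'],
        'test1': [85, 75, 85],
        'test2': [35, 45, 83],
        'test3': [51, 61, 45]}
frame = pd.DataFrame(data)

# Built-in max on plain lists compares them lexicographically and returns
# one whole list; it never looks across rows.
largest_list = max(data['test1'], data['test2'], data['test3'])
print(largest_list)  # [85, 75, 85], i.e. simply the 'test1' list

# On NumPy arrays the comparison is element-wise, so max cannot reduce it
# to a single True/False and raises ValueError.
try:
    max(np.array(data['test1']), np.array(data['test2']))
    array_error = None
except ValueError as exc:
    array_error = str(exc)
print(array_error)

# pandas Series fail in the same way.
try:
    max(frame['test1'], frame['test2'], frame['test3'])
    series_failed = False
except ValueError:
    series_failed = True
print(series_failed)
```

So the call appears to "work" only when the inputs are lists (or single-element arrays), and even then the result is not a per-row maximum in general.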

foundart
user2333196
  • 2
    Faster solutions along with performance comparisons for this particular operation can be found in [this answer](https://stackoverflow.com/a/54299629/4909087). – cs95 Jan 24 '19 at 10:50

3 Answers

135
>>> frame['HighScore'] = frame[['test1','test2','test3']].max(axis=1)
>>> frame
    name  test1  test2  test3  HighScore
0   bill     85     35     51         85
1    joe     75     45     61         75
2  steve     85     83     45         85
Roman Pekar
  • I couldn't figure out what `axis=1` does. – rrlamichhane Dec 30 '16 at 22:31
  • 3
    @RanjanR.Lamichhane In short, `max(axis=1)` gets the row-wise max, while `max(axis=0)` gets the column-wise max. Take a look at http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.max.html – NWaters Feb 23 '17 at 11:09
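
To make the `axis` distinction from the comments concrete, here is a small sketch using the question's test scores (the variable names `row_max` and `col_max` are just illustrative):

```python
import pandas as pd

frame = pd.DataFrame({'test1': [85, 75, 85],
                      'test2': [35, 45, 83],
                      'test3': [51, 61, 45]})

# axis=1 reduces across the columns: one value per row.
row_max = frame.max(axis=1)
# axis=0 reduces down the rows: one value per column.
col_max = frame.max(axis=0)

print(row_max.tolist())  # [85, 75, 85]
print(col_max.tolist())  # [85, 83, 61]
```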
15
>>> frame['HighScore'] = frame[['test1','test2','test3']].apply(max, axis=1)
>>> frame
    name  test1  test2  test3  HighScore
0   bill     85     35     51        85
1    joe     75     45     61        75
2  steve     85     83     45        85
alko
  • 1
    This method works better by ignoring NA's when calculating max by default – DACW May 24 '16 at 13:41
  • 1
    Is there a way to get the name of the column? E.g., HighScore = 85 in the first row, and the column name for that high score is test1. – Jorge Apr 05 '18 at 18:44
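
One possible way to get the column name asked about in the last comment is `DataFrame.idxmax(axis=1)`, which returns the label of the column holding each row's maximum. This is a sketch, not part of the original answer; the column name `HighScoreTest` is made up for illustration.

```python
import pandas as pd

frame = pd.DataFrame({'name': ['bill', 'joe', 'steve'],
                      'test1': [85, 75, 85],
                      'test2': [35, 45, 83],
                      'test3': [51, 61, 45]})
tests = ['test1', 'test2', 'test3']

# Row-wise maximum, as in the answers above.
frame['HighScore'] = frame[tests].max(axis=1)
# Label of the column where that maximum occurs
# ('HighScoreTest' is a hypothetical column name).
frame['HighScoreTest'] = frame[tests].idxmax(axis=1)

print(frame)
```

If several columns tie for the maximum in a row, `idxmax` reports the first of them.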
3

To find the max or min value across multiple columns of a DataFrame, use (with `numpy` imported as `np`):

df['Z']=df[['A','B','C']].apply(np.max,axis=1)
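
As a sketch of the max and min variants side by side: the column names `A`, `B`, `C` and `Z` come from the line above, while the values and the extra column names `Z_max` and `Z_min` are invented for illustration. The built-in `max(axis=1)` and `min(axis=1)` reductions should give the same result as `apply` without calling a function once per row.

```python
import numpy as np
import pandas as pd

# Made-up values; only the column names come from the snippet above.
df = pd.DataFrame({'A': [1, 9, 4], 'B': [7, 2, 4], 'C': [3, 5, 8]})
cols = ['A', 'B', 'C']

# Row-wise max via apply, as in the answer.
df['Z'] = df[cols].apply(np.max, axis=1)

# The same reduction, and its min counterpart, without apply.
df['Z_max'] = df[cols].max(axis=1)
df['Z_min'] = df[cols].min(axis=1)

print(df)
```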
wscourge
Vikas goel