-1

I am having a really hard time understanding how to do even basic data manipulation without iteration, so if I stop making sense, try to go easy on me. Let's suppose I have a dataframe df that looks like this:

        f1    f2    f3    f4
1       1     2     3     'Sari'
2       2     1     4     'Sally'
3       3     0     1     'Jose'

I want to know how to get the max integer in each row. I'm fine with storing it in a new column, f5. So, perhaps code that looks like this:

df['f5'] = ??? #I'm stuck...

philosofool

2 Answers

1

Use df.max(axis=1), restricted to the numeric columns:

In [2682]: df 
Out[2682]: 
   f1  f2  f3       f4
1   1   2   3   'Sari'
2   2   1   4  'Sally'
3   3   0   1   'Jose'

In [2684]: df['f5'] = df.select_dtypes('number').max(axis=1)
In [2685]: df                
Out[2685]: 
   f1  f2  f3       f4  f5
1   1   2   3   'Sari'   3
2   2   1   4  'Sally'   4
3   3   0   1   'Jose'   3

df.select_dtypes('number') selects only the columns with a numeric dtype (such as int or float). This ensures that the max is computed only over the numeric columns, not over the string column f4.

axis=1 applies the function across the columns, giving one result per row.

axis=0 (the default) applies it down the rows, giving one result per column.
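
To see the difference between the two axes side by side, here is a minimal, self-contained sketch. It rebuilds the example frame from the question by hand (the construction code and the variable names numeric, row_max and col_max are my own, not from the question), so treat it as an illustration rather than the only way to do it:

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..3 as shown there).
df = pd.DataFrame(
    {
        "f1": [1, 2, 3],
        "f2": [2, 1, 0],
        "f3": [3, 4, 1],
        "f4": ["Sari", "Sally", "Jose"],
    },
    index=[1, 2, 3],
)

# Keep only the numeric columns (f1, f2, f3); the string column f4 is dropped.
numeric = df.select_dtypes("number")

row_max = numeric.max(axis=1)  # one value per row, indexed 1..3
col_max = numeric.max(axis=0)  # one value per column, indexed f1..f3

# Store the per-row maximum in a new column.
df["f5"] = row_max
```

Here row_max lines up with the frame's index, which is why it can be assigned straight to df['f5'], while col_max is indexed by column name.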

Mayank Porwal
-1

Holy crap I think I figured it out...

df['f5'] = df[['f1','f2','f3']].max(axis=1)

Let me know if there's a better way.
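
For comparison, here is a rough sketch of how the explicit column list behaves next to select_dtypes when another numeric column shows up. The extra column f6 and the names explicit and by_dtype are hypothetical, added only for illustration:

```python
import pandas as pd

# Same example frame as in the question.
df = pd.DataFrame(
    {
        "f1": [1, 2, 3],
        "f2": [2, 1, 0],
        "f3": [3, 4, 1],
        "f4": ["Sari", "Sally", "Jose"],
    },
    index=[1, 2, 3],
)

# Hypothetical extra numeric column, not part of the original data.
df["f6"] = [9, 0, 0]

# The explicit list only looks at the columns that are named in it.
explicit = df[["f1", "f2", "f3"]].max(axis=1)

# select_dtypes picks up every numeric column, including f6.
by_dtype = df.select_dtypes("number").max(axis=1)
```

The explicit list is the safer choice when only certain columns should count. select_dtypes saves typing, but it also picks up any numeric column added later, including f5 itself if the line is run a second time.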

philosofool
  • Thanks. Saw the comment about df.select_dtypes('number') which is obviously helpful if you have a lot more than 3 columns. – philosofool Jun 11 '20 at 21:23
  • If the number of columns keeps increasing, you will have to manually add each one to the list `['f1','f2','f3']`. To avoid that, please check my answer. – Mayank Porwal Jun 11 '20 at 21:23
  • Besides that, this won't work, did you try to run your code? – Erfan Jun 11 '20 at 21:25
  • @philosofool I guess your code is missing `[]`. It should be `df['f5'] = df[['f1','f2','f3']].max(axis=1)`. – Mayank Porwal Jun 11 '20 at 21:28
  • @Erfan It was hypothetical code, and so I didn't try to run it. The version I ran did work; in the version above, there are missing `[ ]` around the index to df, which I forgot, i.e. `df[['f1','f2','f3']].max(axis=1)` – philosofool Jun 12 '20 at 22:56