
I have a CSV file with over 1000 rows and 50 columns. Each cell holds a value (e.g. 0.554562), and every column holds the same type of value.

An example of my CSV file:

        Albany  Ukraine  Germany  Swiss   England
kevin   0.5655  0.5777   0.3232   0.1212  0.9595
brayan  0.4655  0.2777   0.1232   0.9212  0.5595
alex    0.1655  0.2777   0.3232   0.1212  0.9795

Now I want to find the highest value in each row and add it in a new column, like this:

        Albany  Ukraine  Germany  Swiss   England  highest
kevin   0.5655  0.5777   0.3232   0.1212  0.9595   0.9595
brayan  0.4655  0.2777   0.1232   0.9212  0.5595   0.9212
alex    0.1655  0.2777   0.3232   0.1212  0.9795   0.9795

I already checked a few posts here such as 1 2, but none of them helped me.

It would be great if you could help with code that I can run on my side and learn from. Thanks.

Addition: is there also a way to report something like "Kevin, with the highest prob of [0.9595], belongs to England"?

Bilgin

1 Answer


Work along axis=1 (across the columns of each row) and assign the result of max to a new column:

df["highest"] = df.max(axis=1)

idxmax tells you which column holds the max:

top_prob = df.idxmax(axis=1)
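
Put together, the two calls above might look like the sketch below. It assumes the names sit in an unnamed first column of the CSV; the inline `csv_text` and the extra `country` column are illustrative stand-ins, not part of the original question.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file (names in an unnamed first column).
csv_text = """,Albany,Ukraine,Germany,Swiss,England
kevin,0.5655,0.5777,0.3232,0.1212,0.9595
brayan,0.4655,0.2777,0.1232,0.9212,0.5595
alex,0.1655,0.2777,0.3232,0.1212,0.9795
"""

# index_col=0 moves the name column into the index, leaving only numeric columns.
df = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Compute both results from the original numeric columns, then attach them.
highest = df.max(axis=1)         # largest value in each row
top_country = df.idxmax(axis=1)  # label of the column holding that value

df["highest"] = highest
df["country"] = top_country      # assumed column name, for illustration only

print(df)
```

With a real file, `pd.read_csv("your_file.csv", index_col=0)` would take the place of the `io.StringIO` stand-in.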
Will
  • Thank you for your helpful comment. I can now get the highest value into a new column. For the second part, `top_prob = df.idxmax(axis=1)`, my first column holds the name of the country (and its column header is unnamed), which breaks the call with the error `TypeError: reduction operation 'argmax' not allowed for this dtype`. Do you have any suggestion for fixing this? Thank you. – Bilgin May 29 '19 at 22:01
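
A likely cause of that `TypeError`, assuming the first column holds text labels, is that `idxmax` is being asked to compare strings alongside numbers. One possible workaround is to restrict the row-wise operations to the numeric columns. In the sketch below, the `"Unnamed: 0"` header is what `read_csv` assigns to an unnamed first column; the `country` column and the `messages` list are hypothetical names added for illustration.

```python
import pandas as pd

# Hypothetical frame mimicking a CSV read without index_col:
# the text labels end up in a regular column named "Unnamed: 0".
df = pd.DataFrame({
    "Unnamed: 0": ["kevin", "brayan", "alex"],
    "Albany": [0.5655, 0.4655, 0.1655],
    "Ukraine": [0.5777, 0.2777, 0.2777],
    "Germany": [0.3232, 0.1232, 0.3232],
    "Swiss": [0.1212, 0.9212, 0.1212],
    "England": [0.9595, 0.5595, 0.9795],
})

# Keep only the numeric columns so the text column is left out of max/idxmax.
numeric = df.select_dtypes(include="number")

df["highest"] = numeric.max(axis=1)
df["country"] = numeric.idxmax(axis=1)

# Build one sentence per row, in the style asked for in the question.
messages = [
    f"{name} with the highest prob of [{prob:.4f}] belongs to {country}"
    for name, prob, country in zip(df["Unnamed: 0"], df["highest"], df["country"])
]

for line in messages:
    print(line)
```

Reading the file with `index_col=0` instead would move the labels into the index and avoid the column selection altogether.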