1

I have a dataframe with two numeric columns, A and B. I want to find the five largest values in column A and return the values of column B from the same rows.

Many thanks.

smci

2 Answers

2

I think you need DataFrame.nlargest on column A to get the top 5 rows, and then select column B:

import pandas as pd

df = pd.DataFrame({'A':[4,5,26,43,54,36,18,7,8,9],
                   'B':range(10)})

print (df)
    A  B
0   4  0
1   5  1
2  26  2
3  43  3
4  54  4
5  36  5
6  18  6
7   7  7
8   8  8
9   9  9

print (df.nlargest(5, 'A'))
    A  B
4  54  4
3  43  3
5  36  5
2  26  2
6  18  6

a = df.nlargest(5, 'A')['B']
print (a)
4    4
3    3
5    5
2    2
6    6
Name: B, dtype: int64

Alternative solution with sorting:

a = df.sort_values('A', ascending=False)['B'].head(5)
print (a)
4    4
3    3
5    5
2    2
6    6
Name: B, dtype: int64
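
A note on ties, as a rough sketch rather than a definitive recipe: nlargest accepts a keep argument ('first', 'last' or 'all') that controls what happens when several rows share the cut-off value. The data below is the example above with the value at index 7 changed to 18 so that a tie exists; that change is my own assumption for illustration.

```python
import pandas as pd

# Same shape as the example data, but index 7 now ties with index 6 (both 18).
df = pd.DataFrame({'A': [4, 5, 26, 43, 54, 36, 18, 18, 8, 9],
                   'B': range(10)})

# Default keep='first': exactly 5 rows, the earlier of the tied rows wins.
top_first = df.nlargest(5, 'A')['B']

# keep='all': every row tied with the 5th-largest value is kept,
# so the result can contain more than 5 rows.
top_all = df.nlargest(5, 'A', keep='all')['B']

print(top_first.tolist())
print(len(top_all))
```

If exactly five values are required, the default is probably what you want; keep='all' is useful when dropping a tied row silently would be misleading.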
jezrael
2

The nlargest method on the DataFrame will do this: df.nlargest(number_of_rows, 'column_to_sort')

import pandas as pd
df = pd.DataFrame({'A':[1,1,1,2,2,2,2,3,4],'B':[1,2,3,1,2,3,4,1,1]})
df.nlargest(5,'B')
Out[13]: 
   A  B
6  2  4
2  1  3
5  2  3
1  1  2
4  2  2
# If you want only a certain column in the output, use:

df.nlargest(5,'B')['A']
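
The example above ranks by B and returns A. For the question as asked (rank by A, return B) the columns simply swap. A minimal sketch, reusing the same sample frame; the variable names top_b and values are mine, chosen for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 1, 1, 2, 2, 2, 2, 3, 4],
                   'B': [1, 2, 3, 1, 2, 3, 4, 1, 1]})

# Rows holding the 5 largest values of A, then only column B from those rows.
top_b = df.nlargest(5, 'A')['B']

# Plain Python list of the B values, if the index labels are not needed.
values = top_b.tolist()
print(values)
```

Because A contains repeated values here, which of the tied rows are returned depends on the keep argument of nlargest (the default is 'first').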
harshil9968