21

I would like to obtain the n-th minimum or the n-th maximum value from numerical columns in the DataFrame in pandas.

Example:

df = pd.DataFrame({'a': [3.0, 2.0, 4.0, 1.0],'b': [1.0, 4.0 , 2.0, 3.0]})

     a    b
0  3.0  1.0
1  2.0  4.0
2  4.0  2.0
3  1.0  3.0

The third largest value in column a is 2 and the second smallest value in column b is also 2.

Krzysztof Słowiński
  • 6,239
  • 8
  • 44
  • 62

3 Answers3

29

You can use nlargest/nsmallest -

df    
     a    b
0  3.0  1.0
1  2.0  4.0
2  4.0  2.0
3  1.0  3.0
df.a.nlargest(3).iloc[-1]
2.0

Or,

df.a.nlargest(3).iloc[[-1]]

1    2.0
Name: a, dtype: float64

And, as for b -

df.b.nsmallest(2).iloc[-1]
2.0

Or,

df.b.nsmallest(2).iloc[[-1]]

2    2.0
Name: b, dtype: float64

Quick observation here - this sort of operation cannot be vectorised. You are essentially performing two completely different operations here.

cs95
  • 379,657
  • 97
  • 704
  • 746
5
df =  
     a    b
0  3.0  1.0
1  2.0  4.0
2  4.0  2.0
3  1.0  3.0

df.nlargest(3,'a')
   =2.0

df.nsmallest(2,'b')=2.0
cs95
  • 379,657
  • 97
  • 704
  • 746
nbm
  • 69
  • 7
1

5th max: df.a.nlargest(5).min()
5th min: df.a.nsmallest(5).max().
Tips: add param keep may acts better:
df.a.nlargest(5,keep='first').min()

Anthony
  • 111
  • 1
  • 3
  • This question is relatively old so thank you for this solution because it seems to be the most pythonic and clean (compared to others). How about performance? What is your experience? – Soren V. Raben Jul 23 '23 at 13:11