2

I'd like to search a pandas DataFrame for minimum values. I need the min in the entire dataframe (across all values) analogous to df.min().min(). However I also need the know the index of the location(s) where this value occurs.

I've tried a number of different approaches:

  • df.where(df == (df.min().min())),
  • df.where(df == df.min().min()).notnull()(source) and
  • val_mask = df == df.min().min(); df[val_mask] (source).

These return a dataframe of NaNs on non-min/boolean values but I can't figure out a way to get the (row, col) of these locations.

Is there a more elegant way of searching a dataframe for a min/max and returning a list containing all of the locations of the occurrence(s)?

import pandas as pd

keys = ['x', 'y', 'z']
vals = [[1,2,-1], [3,5,1], [4,2,3]]
data = dict(zip(keys,vals))
df = pd.DataFrame(data)

list_of_lowest = []

for column_name, column in df.iteritems():
    if len(df[column == df.min().min()]) > 0:
        print(column_name, column.where(column ==df.min().min()).dropna())
        list_of_lowest.append([column_name, column.where(column ==df.min().min()).dropna()])

list_of_lowest
output: [['x', 2   -1.0
Name: x, dtype: float64]]
Community
  • 1
  • 1
Jason
  • 4,346
  • 10
  • 49
  • 75
  • 1
    Sorry are you just after turning the output of this: `df[df==df.max().max()].dropna(axis=1, thresh=1).dropna()` to a list? – EdChum Mar 31 '16 at 12:31
  • @EdChum Yes! Thanks, this is exactly what I was after. Could you explain what the thresh=1 argument does? I don't understand the docs "require that many non-NA values" means.. – Jason Mar 31 '16 at 12:45
  • Related [Get (row,col) indices of max value in dataframe](https://stackoverflow.com/questions/29504938/get-row-col-indices-of-max-value-in-dataframe) – smci Feb 16 '18 at 06:58

1 Answers1

2

Based on your revised update:

In [209]:
keys = ['x', 'y', 'z'] 
vals = [[1,2,-1], [3,5,-1], [4,2,3]] 
data = dict(zip(keys,vals)) 
df = pd.DataFrame(data)
df

Out[209]:
   x  y  z
0  1  3  4
1  2  5  2
2 -1 -1  3

Then the following would work:

In [211]:
df[df==df.min().min()].dropna(axis=1, thresh=1).dropna()

Out[211]:
     x    y
2 -1.0 -1.0

So this uses the boolean mask on the df:

In [212]:
df[df==df.min().min()]

Out[212]:
     x    y   z
0  NaN  NaN NaN
1  NaN  NaN NaN
2 -1.0 -1.0 NaN

and we call dropna with param thresh=1 this drops columns that don't have at least 1 non-NaN value:

In [213]:
df[df==df.min().min()].dropna(axis=1, thresh=1)

Out[213]:
     x    y
0  NaN  NaN
1  NaN  NaN
2 -1.0 -1.0

Probably safer to call again with thresh=1:

In [214]:
df[df==df.min().min()].dropna(axis=1, thresh=1).dropna(thresh=1)

Out[214]:
     x    y
2 -1.0 -1.0
EdChum
  • 376,765
  • 198
  • 813
  • 562
  • Almost, ideally it would be good if it were possible for it to work in situations where there are ties for the extreme eg: `values. keys = ['x', 'y', 'z'] vals = [[1,2,-1], [3,5,-1], [4,2,3]] data = dict(zip(keys,vals)) df = pd.DataFrame(data)` So I prefer your answer the comments. Could you add it here? – Jason Mar 31 '16 at 12:43