96

How can I reference the row-wise minimum of two DataFrame columns as part of a pandas DataFrame equation? I tried Python's built-in min() function, which did not work. I am looking for something along the lines of this:

data['eff'] = pd.DataFrame([data['flow_h'], data['flow_c']]).min() *Cp* (data[' Thi'] - data[' Tci'])

I also tried the pandas min() function, which did not work either:

min_flow = pd.DataFrame([data['flow_h'], data['flow_c']]).min()
InvalidIndexError: Reindexing only valid with uniquely valued Index objects

I was confused by this error. The data columns are just numbers and a name, so I wasn't sure where the index comes into play.

import pandas as pd
import numpy as np

np.random.seed(365)
rows = 10
flow = {'flow_c': [np.random.randint(100) for _ in range(rows)],
        'flow_d': [np.random.randint(100) for _ in range(rows)],
        'flow_h': [np.random.randint(100) for _ in range(rows)]}
data = pd.DataFrame(flow)

# display(data)
   flow_c  flow_d  flow_h
0      82      36      43
1      52      48      12
2      33      28      77
3      91      99      11
4      44      95      27
5       5      94      64
6      98       3      88
7      73      39      92
8      26      39      62
9      56      74      50
cottontail
kilojoules

4 Answers

201

If you are trying to get the row-wise minimum of two or more columns, use pandas.DataFrame.min. By default axis=0, which returns the minimum of each column; pass axis=1 to get the minimum across the selected columns for each row.

data['min_c_h'] = data[['flow_h','flow_c']].min(axis=1)

# display(data)
   flow_c  flow_d  flow_h  min_c_h
0      82      36      43       43
1      52      48      12       12
2      33      28      77       33
3      91      99      11       11
4      44      95      27       27
5       5      94      64        5
6      98       3      88       88
7      73      39      92       73
8      26      39      62       26
9      56      74      50       50
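
To make the difference between the two axes concrete, here is a minimal sketch using the flow_c and flow_h values from the table above. The constant Cp is a hypothetical stand-in for the one in the question, and its value is an assumption made only for illustration.

```python
import pandas as pd

# Values copied from the displayed sample data
data = pd.DataFrame({
    'flow_c': [82, 52, 33, 91, 44, 5, 98, 73, 26, 56],
    'flow_h': [43, 12, 77, 11, 27, 64, 88, 92, 62, 50],
})

# axis=0 (the default): one minimum per column
col_min = data[['flow_h', 'flow_c']].min()

# axis=1: one minimum per row, aligned with the original index
row_min = data[['flow_h', 'flow_c']].min(axis=1)

# The row-wise result is a Series, so it can be used directly in arithmetic.
# Cp is a hypothetical constant, not a value taken from the question.
Cp = 2.0
scaled = row_min * Cp

print(col_min)
print(row_min.tolist())
```

Because row_min shares the index of data, it can be combined with other columns of the same frame without any extra alignment step.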
Asclepius
Happy001
  • Is it possible to add a shift(1) in here or would I need to add a temporary column (e.g. data["flow_c_1"] = data["flow_c_1"].shift(1)) ? Thx – user27074 Jul 17 '23 at 09:14
19

If you would like to get a single minimum value across multiple columns:

data[['flow_h','flow_c']].min().min()

The first min() calculates the minimum of each column and returns a pandas Series. The second min() returns the minimum of those per-column minimums.
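
As a quick illustration of the two steps, the following sketch splits the chained call into its intermediate results, using the flow_c and flow_h values from the sample data in the question.

```python
import pandas as pd

# Values copied from the sample data in the question
data = pd.DataFrame({
    'flow_c': [82, 52, 33, 91, 44, 5, 98, 73, 26, 56],
    'flow_h': [43, 12, 77, 11, 27, 64, 88, 92, 62, 50],
})

# Step 1: minimum of each column, returned as a Series indexed by column name
per_column = data[['flow_h', 'flow_c']].min()

# Step 2: minimum of that Series, a single scalar
overall = per_column.min()

print(per_column)
print(overall)
```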

b0lle
1

One may also transpose and call min().

data['min_flow'] = data[['flow_h','flow_c']].T.min()

or call min(axis=1) on the underlying numpy array.

data['min_flow'] = data[['flow_h','flow_c']].values.min(axis=1)

If no axis is passed, ndarray.min is evaluated on the entire array, so for a single minimum value of multiple columns (i.e. minimum of minimums), accessing the underlying numpy array could be useful too:

minimum = data[['flow_h','flow_c']].values.min()
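
The sketch below checks that these variants agree on the sample data, and shows one difference worth keeping in mind: pandas min skips NaN by default, while ndarray.min propagates it. It uses to_numpy() rather than .values, which should behave the same here. The small frame named gaps is a hypothetical example added only to show the NaN behaviour.

```python
import numpy as np
import pandas as pd

# Values copied from the sample data in the question
data = pd.DataFrame({
    'flow_c': [82, 52, 33, 91, 44, 5, 98, 73, 26, 56],
    'flow_h': [43, 12, 77, 11, 27, 64, 88, 92, 62, 50],
})
cols = data[['flow_h', 'flow_c']]

via_axis = cols.min(axis=1)               # pandas, row-wise
via_transpose = cols.T.min()              # pandas, column-wise on the transpose
via_numpy = cols.to_numpy().min(axis=1)   # numpy, row-wise
overall = cols.to_numpy().min()           # numpy, whole array

# Hypothetical frame with a missing value, for illustration only
gaps = pd.DataFrame({'a': [1.0, np.nan], 'b': [2.0, 3.0]})
pandas_min = gaps.min(axis=1)                   # NaN is skipped
numpy_min = gaps.to_numpy().min(axis=1)         # NaN propagates
nan_safe = np.nanmin(gaps.to_numpy(), axis=1)   # numpy variant that ignores NaN

print(via_axis.tolist())
print(pandas_min.tolist(), numpy_min.tolist(), nan_safe.tolist())
```

If the columns can contain missing values, the numpy route therefore needs np.nanmin to match what pandas returns.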
cottontail
0

You could also use numpy to get the row-wise minimum of two (or more) columns and store it in another column. It is important to specify axis=1.

import numpy as np
...
data['min_c_h'] = np.min(data[['flow_h', 'flow_c']], axis=1)
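
For exactly two columns, a related numpy option is the element-wise np.minimum. This is a different function from the np.min call above, offered as an alternative sketch rather than as part of the original answer. It reuses the sample values from the question.

```python
import numpy as np
import pandas as pd

# Values copied from the sample data in the question
data = pd.DataFrame({
    'flow_c': [82, 52, 33, 91, 44, 5, 98, 73, 26, 56],
    'flow_h': [43, 12, 77, 11, 27, 64, 88, 92, 62, 50],
})

# np.minimum compares the two columns element by element
data['min_c_h'] = np.minimum(data['flow_h'], data['flow_c'])

# np.min with axis=1 reduces across the selected columns instead
data['min_c_h_alt'] = np.min(data[['flow_h', 'flow_c']], axis=1)

print(data)
```

Both columns should hold the same values here. Note that np.minimum propagates NaN, so results can differ from the pandas min when data is missing.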
Marc Steffen