20

I am trying to create a column which contains only the minimum of the one row and a few columns, for example:

    A0      A1      A2      B0      B1      B2      C0      C1
0   0.84    0.47    0.55    0.46    0.76    0.42    0.24    0.75
1   0.43    0.47    0.93    0.39    0.58    0.83    0.35    0.39
2   0.12    0.17    0.35    0.00    0.19    0.22    0.93    0.73
3   0.95    0.56    0.84    0.74    0.52    0.51    0.28    0.03
4   0.73    0.19    0.88    0.51    0.73    0.69    0.74    0.61
5   0.18    0.46    0.62    0.84    0.68    0.17    0.02    0.53
6   0.38    0.55    0.80    0.87    0.01    0.88    0.56    0.72

Here I am trying to create a column which contains the minimum for each row of columns B0, B1, B2.

The output would look like this:

    A0      A1      A2      B0      B1      B2      C0      C1      Minimum
0   0.84    0.47    0.55    0.46    0.76    0.42    0.24    0.75    0.42
1   0.43    0.47    0.93    0.39    0.58    0.83    0.35    0.39    0.39
2   0.12    0.17    0.35    0.00    0.19    0.22    0.93    0.73    0.00
3   0.95    0.56    0.84    0.74    0.52    0.51    0.28    0.03    0.51
4   0.73    0.19    0.88    0.51    0.73    0.69    0.74    0.61    0.51
5   0.18    0.46    0.62    0.84    0.68    0.17    0.02    0.53    0.17
6   0.38    0.55    0.80    0.87    0.01    0.88    0.56    0.72    0.01

Here is part of the code, but it is not doing what I want it to do:

for i in range(0,2):
    df['Minimum'] = df.loc[0,'B'+str(i)].min()
yash.trojan.25
  • 211
  • 1
  • 3
  • 8

2 Answers2

32

This is a one-liner, you just need to use the axis argument for min to tell it to work across the columns rather than down:

df['Minimum'] = df.loc[:, ['B0', 'B1', 'B2']].min(axis=1)

If you need to use this solution for different numbers of columns, you can use a for loop or list comprehension to construct the list of columns:

n_columns = 2
cols_to_use = ['B' + str(i) for i in range(n_columns)]
df['Minimum'] = df.loc[:, cols_to_use].min(axis=1)
Marius
  • 58,213
  • 16
  • 107
  • 105
  • Thanks. But what if for different cases the number of columns change, which is why I have a for loop, where actually, I am passing a variable in the range, like so: for i in range(0,total): I tried what you suggested, but it gives me all zeros. – yash.trojan.25 Aug 25 '14 at 06:30
  • I tried it with the for loop, but it returns the last element from the row/column, and not the minimum. – yash.trojan.25 Aug 25 '14 at 07:03
  • @yash.trojan.25 I tried it with your example DataFrame... and it worked! – Andy Hayden Aug 25 '14 at 07:49
  • @yash.trojan.25: See my edit for how to use this with variable numbers of columns. I'm not sure how you were getting "all zeros" but I did test this solution with your example data and got your expected output. – Marius Aug 25 '14 at 07:58
  • Awesome! That worked perfectly! Exactly what I wanted. Thanks. – yash.trojan.25 Aug 25 '14 at 16:12
  • @Marius can you explain how the `:` slice works here. It allows for using keys as without the keys aren't found in the index. Any pointer to documentation appreciated. – noblerthanoedipus Jan 30 '16 at 02:20
3

For my tasks a universal and flexible approach is the following example:

df['Minimum'] = df[['B0', 'B1', 'B2']].apply(lambda x: min(x[0],x[1],x[2]), axis=1)

The target column 'Minimum' is assigned the result of the lambda function based on the selected DF columns['B0', 'B1', 'B2']. Access elements in a function through the function alias and his new Index(if count of elements is more then one). Be sure to specify axis=1, which indicates line-by-line calculations. This is very convenient when you need to make complex calculations. However, I assume that such a solution may be inferior in speed.

As for the selection of columns, in addition to the 'for' method, I can suggest using a filter like this:

calls_to_use = list(filter(lambda f:'B' in f, df.columns))

literally, a filter is applied to the list of DF columns through a lambda function that checks for the occurrence of the letter 'B'.

after that the first example can be written as follows:

calls_to_use = list(filter(lambda f:'B' in f, df.columns))    
df['Minimum'] = df[calls_to_use].apply(lambda x: min(x), axis=1)

although after pre-selecting the columns, it would be preferable:

df['Minimum'] = df[calls_to_use].min(axis=1)
DenPO
  • 51
  • 3
  • 3
    While this code may solve the question, [including an explanation](https://meta.stackexchange.com/questions/114762/explaining-entirely-code-based-answers) really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. – 4b0 Mar 21 '19 at 10:43