0

I'm trying to find the "best in class", with each class being a whole dollar amount in prices and "best" being maximum value like so:


example = pd.DataFrame({    
    "values": [1, 2, 3, 3, 3, 4, 4, 5, 6, 6, 7, 7, 7, 8,8,8,8,8, 9 ],
    "prices": [1.1, 2.2, 3.31, 3.32, 3.33, 4.1, 4.2, 5.1, 6.1, 6.2, 7.1, 7.2, 7.3, 8.1, 8.2, 8.3, 8.4, 8.5, 9.1]
})

Is there a way to compare the value of a column against multiple? Perhaps something like:

example["is_best"] = np.where(example["prices"] < all(example[example[values]])

Expected output:

    values  prices  is_best
0   1       1.10    1
1   2       2.20    1
2   3       3.31    1
3   3       3.32    0
4   3       3.33    0
5   4       4.10    1
6   4       4.20    0
7   5       5.10    0
8   6       6.10    1
9   6       6.20    0

Caveat: There will be other columns with varying data in them.

Amorphous
  • 675
  • 1
  • 8
  • 26
  • 1
    Are you just wanting to group by "values" and find the maximum of "prices" for each group? – Riley Nov 06 '21 at 03:45
  • It sounds like `example['is_best'] = example.groupby('values').transform('max')` [Python Pandas max value in a group as a new column](https://stackoverflow.com/q/35640364/15497888) – Henry Ecker Nov 06 '21 at 03:54
  • I'm trying to find the minimum price for each value, aka the lease amount of money to spend for the most bang – Amorphous Nov 06 '21 at 04:01
  • 1
    Then just do `'min'`? `example['is_best'] = example.groupby('values')['prices'].transform('min')` – Henry Ecker Nov 06 '21 at 04:15

1 Answers1

0

I assume you're wanting to preserve your dataframe structure other than just ID'ing the max in each group. In this case, transform and np.where as you suggested do the trick.

import pandas as pd
import numpy as np
df = pd.DataFrame({    
    "values": [1, 2, 3, 3, 3, 4, 4, 5, 6, 6, 7, 7, 7, 8,8,8,8,8, 9 ],
    "prices": [1.1, 2.2, 3.31, 3.32, 3.33, 4.1, 4.2, 5.1, 6.1, 6.2, 7.1, 7.2, 7.3, 8.1, 8.2, 8.3, 8.4, 8.5, 9.1]
})

df['is_best'] = np.where(
    df['prices']==df.groupby('values')['prices'].transform('min'),
   1, 0
)

Output:

values  prices  is_best
0   1   1.10    1
1   2   2.20    1
2   3   3.31    1
3   3   3.32    0
4   3   3.33    0
5   4   4.10    1
6   4   4.20    0
7   5   5.10    1
8   6   6.10    1
9   6   6.20    0
10  7   7.10    1
11  7   7.20    0
12  7   7.30    0
13  8   8.10    1
14  8   8.20    0
15  8   8.30    0
16  8   8.40    0
17  8   8.50    0
18  9   9.10    1
born_naked
  • 748
  • 9
  • 19