Apologies if the title is hard to parse. Here is what I am trying to do:

Suppose I have the following dataframe:

    run   group   value
0   1     A       3
1   2     A       2
2   3     A       3
3   4     B       5
4   5     B       1
5   6     C       3
6   7     C       4

I want to set the value column for each run to the maximum value within that run's group, so it would look like this:

    run   group   value
0   1     A       3
1   2     A       3
2   3     A       3
3   4     B       5
4   5     B       5
5   6     C       4
6   7     C       4

Is there a way to do this without a for loop? The closest I've come is to use a groupby to get the max per group, turn that into a dictionary, then map it back onto the original dataframe like so:

import pandas as pd

df = pd.DataFrame([[1, "A", 2], [2, "A", 3], [3, "B", 5], [4, "B", 1], [5, "C", 3], [6, "C", 4]], columns=["Run", "Group", "Value"])
max_vals = df.groupby("Group")["Value"].max().to_dict()
df["Value"] = df["Group"].map(max_vals)

but it feels like there should be a neater way to do this.

Apollo42

1 Answer

You can use groupby.transform() and return the max value:

df['Value'] = df.groupby('Group')['Value'].transform('max')
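For completeness, here is a minimal self-contained sketch of the same idea, using the seven rows from the table in the question. It assumes the capitalised column names from the question's code (`Run`, `Group`, `Value`). It passes the string `"max"` rather than the built-in `max`, which as far as I know is the form recent pandas versions prefer.

```python
import pandas as pd

# Sample data copied from the table in the question; the capitalised
# column names are an assumption taken from the question's own code.
df = pd.DataFrame(
    {
        "Run": [1, 2, 3, 4, 5, 6, 7],
        "Group": ["A", "A", "A", "B", "B", "C", "C"],
        "Value": [3, 2, 3, 5, 1, 3, 4],
    }
)

# transform returns a Series with the same index as df, where every row
# holds its group's maximum, so it can be assigned straight back.
df["Value"] = df.groupby("Group")["Value"].transform("max")

print(df)
```

Unlike `.max()`, which collapses each group to a single row, `transform` broadcasts the per-group result back to the original shape, so no dictionary and `map` step is needed.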
sophocles