Similar to pandas groupby where you get the max of one column and the min of another column
Given that the Data Frame is about Car Race, we have in df the following [Race_Event_ID, Car_ID, Driver_ID, AvgSpeed]
Group By df.group_by(['Race_Event_ID'])['Speed'].agg(['max','mean',.....]
can give Group by Stats.
But I need to have the 'Speed' , Car_ID, Driver_ID
of the Topper, or Whose speed = 'Max Speed'
, and Similarly the Tailer's Speed, Car_ID, Driver_ID , ie Speed = Min Speed.
Given that there could be Tie with Top Speed, Lets get Both or at least one.
One Solution is ,
df_max =df.group_by(['Race_Event_ID'])['Speed'].agg(['max'])
df_max.merge(df,on='Race_Event_ID')
This solution may take time. And within the same iteration of the data , things may be achievable if we were to iterate ourselves. Do we have an efficient way in Pandas ? And I would be curious to see a Spark DataFrame solution as well.