I have a DataFrame and I need to drop duplicates per group ('col1'), keeping the row with the minimum value in another column, 'abs(col1 - col2)'. For the last group, however, the condition should change: with 'col1' sorted in ascending order, the last group in 'col1' should keep the row with the maximum value of 'abs(col1 - col2)' instead (so that the result behaves like a loop).
Update 1:
The last group needs to be determined dynamically.
For example, suppose I have the following DataFrame:
- Creating the DataFrame
import pandas as pd
df = pd.DataFrame({'col0':['A','A','A','A','A','A','A','A','A','A','A','A','B','B','B','B','B','B','B','B','B','B','B','B'],'col1':[1,1,1,2,2,2,3,3,3,4,4,4,2,2,2,3,3,3,4,4,4,5,5,5], 'col2':[2,3,4,1,3,4,1,2,4,1,2,3,3,4,5,2,4,5,2,3,5,2,3,4]})
- Computing the diff column (this column is used as the condition)
df['abs(col1 - col2)']=abs(df['col1']-df['col2'])
- The original df is as follows:
- The desired df should look like:
My attempt:
df.sort_values(by=['col0','col1','abs(col1 - col2)','col2'],ascending=[True,True,True,False]).drop_duplicates(['col0','col1'])
The result is as follows:
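For reference, the rule described at the top (minimum of 'abs(col1 - col2)' for every group, maximum for the last 'col1' group) could be sketched roughly as below. This is only one possible reading, not a confirmed solution. It assumes that "last group" means the largest 'col1' value within each 'col0' group, and that ties are broken by the larger 'col2', as in the attempt above. The names `is_last`, `sort_key` and the temporary column `_key` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'col0': ['A'] * 12 + ['B'] * 12,
    'col1': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4,
             2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5],
    'col2': [2, 3, 4, 1, 3, 4, 1, 2, 4, 1, 2, 3,
             3, 4, 5, 2, 4, 5, 2, 3, 5, 2, 3, 4],
})
df['abs(col1 - col2)'] = (df['col1'] - df['col2']).abs()

# Assumption: the "last group" is the largest col1 within each col0,
# found dynamically rather than hard-coded.
is_last = df['col1'].eq(df.groupby('col0')['col1'].transform('max'))

# Negate the diff for the last group, so that one ascending sort
# selects the minimum diff everywhere else and the maximum diff there.
sort_key = df['abs(col1 - col2)'].where(~is_last, -df['abs(col1 - col2)'])

result = (
    df.assign(_key=sort_key)  # hypothetical helper column
      .sort_values(['col0', 'col1', '_key', 'col2'],
                   ascending=[True, True, True, False])
      .drop_duplicates(['col0', 'col1'])
      .drop(columns='_key')
)
print(result)
```

With the sample data this keeps one row per ('col0', 'col1') pair, with 'col2' equal to 2, 3, 4, 1 for group A and 3, 4, 5, 2 for group B. Negating the key avoids splitting the frame into "last" and "not last" parts and concatenating them back together.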