I'm trying to apply a scaler to columns of a DataFrame, group by group, after a groupby.

import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [10, 20, 30, 40, 50], 'k': [False, True, False, True, False]})

for name, g in df.groupby('k'):
    scaler = MinMaxScaler()
    scaler.fit(g['a'].values[..., np.newaxis])

    for col in ['a', 'b']:
        new_v = scaler.transform(g[col].values[..., np.newaxis])[:, 0]
        print(new_v)
        g[col] = new_v  # this only changes the group g, not df

How to make the operation actually change the df itself?

eugene
  • Does this answer your question? [Group by MinMaxScaler in pandas dataframe](https://stackoverflow.com/questions/67656988/group-by-minmaxscaler-in-pandas-dataframe/67657243#67657243) – Shubham Sharma May 23 '21 at 13:48
  • Does this answer your question? [Group by MinMaxScaler in pandas dataframe](https://stackoverflow.com/questions/67656988/group-by-minmaxscaler-in-pandas-dataframe) – StupidWolf May 24 '21 at 09:56
  • I've seen the answer, I don't think I can apply the answer to my question. – eugene May 24 '21 at 10:14

1 Answer

> How to make the operation actually change the df itself?

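Inside the loop, g is a separate object, so assigning to g[col] never reaches df. If you cannot get groupby.transform to work, use each group's index to select the matching rows of df on the left-hand side of an assignment: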

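import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [10, 20, 30, 40, 50],
                   'k': [False, True, False, True, False]})

cols = ['a', 'b']
# The new values are floats, so make the target columns float before assigning.
df = df.astype({col: float for col in cols})

for name, g in df.groupby('k'):
    # Placeholder operations; substitute the real per-group computation.
    if name:
        new_values = g[cols] / 23
    else:
        new_values = g[cols] + .99999
    for col in cols:
        # g.index holds the row labels of this group in df.
        df.loc[g.index, col] = new_values[col]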
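
Applied to the scaler from the question, the same index-based assignment might look like the sketch below. It assumes the intent really is to fit each group's MinMaxScaler on column 'a' only and reuse it for 'b', as the question's code does; the `cols` name and the up-front cast to float are my additions, not part of the original code.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.DataFrame({'a': [1, 2, 3, 4, 5],
                   'b': [10, 20, 30, 40, 50],
                   'k': [False, True, False, True, False]})

cols = ['a', 'b']
# Scaled values are floats; cast first so the assignment does not have to
# change the dtype of integer columns.
df = df.astype({col: float for col in cols})

for name, g in df.groupby('k'):
    scaler = MinMaxScaler()
    # As in the question, the scaler is fitted on column 'a' of this group only.
    scaler.fit(g[['a']].to_numpy())
    for col in cols:
        # transform expects a 2-D array; g[[col]] keeps the column dimension.
        scaled = scaler.transform(g[[col]].to_numpy())[:, 0]
        # Write back through the group's row labels so df itself is modified.
        df.loc[g.index, col] = scaled

print(df)
```

Because 'b' is scaled with the minimum and range learned from 'a', its values are not confined to [0, 1]. If each column should be scaled by its own per-group range, fit a separate scaler per column instead.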

wwii