2

I have a dataframe like this:

ID date        var1 var2 var3
AB 22/03/2020  0     1   3 
AB 29/03/2020  0     3   3 
CD 22/03/2020  0     1   1

And I would like to have a new dataset that, if it is a maximal column (can happen ties too) leaves the same number of the original dataset on the rows; otherwise set -1 if it is not the maximal. So it would be:

ID date        var1   var2    var3
AB 22/03/2020  -1     -1        3 
AB 29/03/2020  -1      3        3 
CD 22/03/2020  -1      1        1

But I am not managing at all how to do this. What can I try next?

halfer
  • 19,824
  • 17
  • 99
  • 186
Catarina Nogueira
  • 1,024
  • 2
  • 12
  • 28

2 Answers2

1

IIUC use where and date back

s=df.loc[:,'var1':]
df.update(s.where(s.eq(s.max(1),axis=0),-1))
df
   ID        date  var1  var2  var3
0  AB  22/03/2020    -1    -1     3
1  AB  29/03/2020    -1     3     3
2  CD  22/03/2020    -1     1     1
BENY
  • 317,841
  • 20
  • 164
  • 234
1

Select only numeric columns by DataFrame.select_dtypes:

df1 = df.select_dtypes(np.number)

Or select all columns without first two by positions by DataFrame.iloc:

df1 = df.iloc[:, 2:]

Or select columns with var label by DataFrame.filter:

df1 = df1.filter(like='var')

And then set new values by DataFrame.where with max:

df[df1.columns] = df1.where(df1.eq(df1.max(1), axis=0), -1)
print (df)
   ID        date  var1  var2  var3
0  AB  22/03/2020    -1    -1     3
1  AB  29/03/2020    -1     3     3
2  CD  22/03/2020    -1     1     1
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252
  • Hi, thanks a lot for the answer! I got a warning and I was trying to fix but I couldn't. I read about the error and I understood, I am just not managing how to fix it. The error is: "A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead" – Catarina Nogueira Apr 10 '20 at 15:42
  • 1
    @Catarina Nogueira How working add copy, if some filtration? Like `df1 = df.select_dtypes(np.number).copy()` ? – jezrael Apr 10 '20 at 18:41