I tried to set part of the data to -1, but I get a SettingWithCopyWarning.

I searched Stack Overflow, but most of the answers say to use loc to solve this.

The data comes from Kaggle Titanic.

import pandas as pd
train = pd.read_csv('data/train.csv')
y = train[["Survived"]]
y.loc[y["Survived"] == 0, "Survived"] = -1
sappy

1 Answer

Your logic appears confused. Try this instead:

train.loc[train["Survived"] == 0, "Survived"] = -1

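There is no need to create y = train[['Survived']]. Assigning into that derived DataFrame through .loc is what triggers the warning, because pandas cannot tell whether you also meant to modify train.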
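If you do want a separate label frame, one common way to avoid the warning is to take an explicit copy before assigning. Here is a minimal sketch of both options. It assumes a tiny made-up DataFrame in place of data/train.csv, so the values are illustrative only:

```python
import pandas as pd

# Tiny stand-in for data/train.csv; the values are made up for illustration.
train = pd.DataFrame({"PassengerId": [1, 2, 3, 4], "Survived": [0, 1, 1, 0]})

# Option A: keep a separate label frame. The explicit .copy() makes y
# independent of train, so assigning through .loc does not warn.
y = train[["Survived"]].copy()
y.loc[y["Survived"] == 0, "Survived"] = -1

# train is untouched at this point; only y changed.
print(train["Survived"].tolist())  # [0, 1, 1, 0]
print(y["Survived"].tolist())      # [-1, 1, 1, -1]

# Option B: modify train itself with a single .loc assignment.
train.loc[train["Survived"] == 0, "Survived"] = -1
print(train["Survived"].tolist())  # [-1, 1, 1, -1]
```

Which option fits depends on whether you want the original column in train to change as well.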

You can read about how to use the .loc accessor in the pandas documentation.

jpp
  • But I still don't know why y = train[['Survived']] causes this bug. It is a DataFrame and isn't a copy. Why can't I use loc on y? – sappy Mar 26 '18 at 17:16
  • This is a *warning*. It may or may not work, but it's not good practice. – jpp Mar 26 '18 at 17:21
  • Thanks for your help. I think I know why the warning happened. – sappy Mar 27 '18 at 12:59