I got around a SettingWithCopyWarning, feels like the wrong way and computationally inefficient, is there a better way?

Question

I encountered the ever-common SettingWithCopyWarning when trying to change some values in a DataFrame. I found a way to get around this without having to disable the warning, but I feel like I've done it the wrong way, and that it is needlessly wasteful and computationally inefficient.

label_encoded_feature_data_to_be_standardised_X_train = X_train_label_encoded[['price', 'vintage']]
label_encoded_feature_data_to_be_standardised_X_test = X_test_label_encoded[['price', 'vintage']]
label_encoded_standard_scaler = StandardScaler()
label_encoded_standard_scaler.fit(label_encoded_feature_data_to_be_standardised_X_train)

X_train_label_encoded_standardised = label_encoded_standard_scaler.transform(label_encoded_feature_data_to_be_standardised_X_train)
X_test_label_encoded_standardised = label_encoded_standard_scaler.transform(label_encoded_feature_data_to_be_standardised_X_test)

That's the setup. I then get the warning if I do this:

X_train_label_encoded.loc[:,'price'] = X_train_label_encoded_standardised[:,0]

or if I do this:

X_train_label_encoded_standardised_df = pd.DataFrame(data=X_train_label_encoded_standardised, columns=['price', 'vintage'])

And I solved it by doing this:

X_train_label_encoded = X_train_label_encoded.drop('price', axis=1)
X_train_label_encoded['price'] = X_train_label_encoded_standardised_df.loc[:,'price']

This also works:

X_train_label_encoded.replace(to_replace=X_train_label_encoded['price'], value=X_train_label_encoded_standardised_df['price'])

But even that feels overly clunky with the extra DataFrame creation.

Why can't I just assign the column in some way? Or using some arrangement of the replace method? The documentation doesn't seem to have a solution, or am I just reading it wrong? Missing some obvious but not spelled out solution?

Is there a better way of doing this?

jpp · Accepted Answer · 2018-11-23T17:09:16.290

Many times, this warning is just a warning. If your code works and you aren't using chained assignment, you often have nothing to worry about.
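As a hedged illustration of where the warning tends to come from: in pandas versions that emit it, it usually appears when the frame being assigned to was itself produced by selecting from another DataFrame. The question does not show how X_train_label_encoded was created, so the frame and column values below are hypothetical. The sketch takes an explicit copy, so that the subset is unambiguously its own object before anything is assigned to it.

```python
import pandas as pd

# Hypothetical parent frame, standing in for the full dataset before any split.
data = pd.DataFrame(
    {"price": [10.0, 20.0, 30.0, 40.0], "vintage": [2001, 2005, 2009, 2013]}
)

# A selection like this can leave pandas unsure whether the result is a view
# or a copy of `data`. An explicit .copy() removes the ambiguity.
subset = data[data["price"] > 15].copy()

# Assignment now clearly targets `subset` only.
subset.loc[:, "price"] = subset["price"] / 10

# The parent frame is left untouched.
print(data["price"].tolist())
print(subset["price"].tolist())
```

The copy costs some memory, but for a training split of ordinary size that is rarely the bottleneck.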

If your transformation maintains the index, including order, and your data is numeric, you can use pd.DataFrame.values:

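X_train_label_encoded['price'] = X_train_label_encoded_standardised_df.values[:, 0]

This should sidestep the warning, since X_train_label_encoded_standardised_df.values evaluates to a lower-level NumPy array. Note that X_train_label_encoded_standardised, being the output of StandardScaler.transform, is already a NumPy array and has no .values attribute, so X_train_label_encoded_standardised[:, 0] can be assigned directly without the intermediate DataFrame.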

Thank you. In the end I just kept what I had, as it is the most explicit and I don't _strictly_ require that level of efficiency in this case. – Chor Hatara Hud'u Keturi Nov 26 '18 at 15:03
