I encountered the ever-common SettingWithCopyWarning when trying to change some values in a DataFrame. I found a way to get around this without having to disable the warning, but I feel like I've done it the wrong way, and that it is needlessly wasteful and computationally inefficient.
import pandas as pd
from sklearn.preprocessing import StandardScaler
label_encoded_feature_data_to_be_standardised_X_train = X_train_label_encoded[['price', 'vintage']]
label_encoded_feature_data_to_be_standardised_X_test = X_test_label_encoded[['price', 'vintage']]
label_encoded_standard_scaler = StandardScaler()
label_encoded_standard_scaler.fit(label_encoded_feature_data_to_be_standardised_X_train)
X_train_label_encoded_standardised = label_encoded_standard_scaler.transform(label_encoded_feature_data_to_be_standardised_X_train)
X_test_label_encoded_standardised = label_encoded_standard_scaler.transform(label_encoded_feature_data_to_be_standardised_X_test)
That's how it's set up. I then get the warning if I do this:
X_train_label_encoded.loc[:,'price'] = X_train_label_encoded_standardised[:,0]
Or if I do this:
X_train_label_encoded_standardised_df = pd.DataFrame(data=X_train_label_encoded_standardised, columns=['price', 'vintage'])
And I solved it by doing this:
X_train_label_encoded = X_train_label_encoded.drop('price', axis=1)
X_train_label_encoded['price'] = X_train_label_encoded_standardised_df.loc[:,'price']
This also works:
X_train_label_encoded.replace(to_replace=X_train_label_encoded['price'], value=X_train_label_encoded_standardised_df['price'])
But even that feels overly clunky with the extra DataFrame creation.
Why can't I just assign the column directly, or use some arrangement of the replace method? The documentation doesn't seem to have a solution, or am I just reading it wrong and missing something obvious that isn't spelled out?
Is there a better way of doing this?
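For reference, here is a minimal sketch of the kind of direct assignment I'm asking about. It rests on an assumption I haven't confirmed: that X_train_label_encoded was itself created by slicing another DataFrame, which is the usual trigger for this warning. The toy data and the names `full`, `X_train`, `cols` and `scaled` are hypothetical, and the NumPy arithmetic is only a stand-in for the array that StandardScaler's transform returns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the real data set.
full = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "vintage": [2001.0, 2005.0, 2010.0, 2015.0],
    "country": [0, 1, 0, 2],
})

# Assumption: the training frame is a filtered slice of a larger frame.
# Taking an explicit copy makes it independent of `full`, so later
# assignments are not treated as writes to a possible view.
X_train = full[full["country"] < 2].copy()

cols = ["price", "vintage"]
values = X_train[cols].to_numpy()

# Stand-in for the scaler output: zero mean and unit population
# variance per column (np.std defaults to ddof=0).
scaled = (values - values.mean(axis=0)) / values.std(axis=0)

# Assign both columns in one step. A NumPy array is assigned by
# position, so no index alignment takes place.
X_train[cols] = scaled
```

If that assumption holds, this would avoid the drop, the intermediate DataFrame and the replace call. Assigning the array rather than a column from a freshly built DataFrame also sidesteps index alignment, which can introduce NaN when the training frame's index is not 0..n-1.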