I've seen many questions about the infamous SettingWithCopy
warning. I've even ventured to answer a few of them. Recently, I was putting together an answer that involved this topic and I wanted to present the benefit of a dataframe view. I failed to produce a tangible demonstration of why it is a good idea to create a dataframe view or whatever the thing is that generates SettingWithCopy
Consider df
df = pd.DataFrame([[1, 2], [3, 4]], list('ab'), list('AB'))
df
A B
x 1 2
y 3 4
and dfv
which is a copy of df
dfv = df[['A']]
print(dfv.is_copy)
<weakref at 0000000010916E08; to 'DataFrame' at 000000000EBF95C0>
print(bool(dfv.is_copy))
True
I can generate the SettingWithCopy
dfv.iloc[0, 0] = 0
However, dfv
has changed
print(dfv)
A
a 0
b 3
df
has not
print(df)
A B
x 1 2
y 3 4
and dfv
is still a copy
print(bool(dfv.is_copy))
True
If I change df
df.iloc[0, 0] = 7
print(df)
A B
x 7 2
y 3 4
But dfv
has not changed. However, I can reference df
from dfv
print(dfv.is_copy())
A B
x 7 2
y 3 4
Question
If dfv
maintains it's own data (meaning, it doesn't actually save memory) and it assigns values via assignment operations despite warnings, then why do we bother saving references in the first place and generate SettingWithCopyWarning
at all?
What is the tangible benefit?