
I have a DataFrame whose attrs differ from the attrs of the Series it contains. After copying the DataFrame, every Series in the original instance carries the DataFrame's attrs, and the attrs I set on the Series are lost.

To be clear, I'm not talking about the copied instance but about the original instance.

Here is an example of what is going on:

import pandas as pd

df = pd.DataFrame([1, 2])
df.attrs['a'] = 'b'
df[0].attrs['c'] = 'd'
print(f"DataFrame attrs: {df.attrs} /  Series attrs: {df[0].attrs}")
df_copy = df.copy()
print(f"DataFrame after copy attrs: {df.attrs} /  Series attrs: {df[0].attrs}")
print(f"Copied DataFrame attrs: {df_copy.attrs} /  Series attrs: {df_copy[0].attrs}")
<<<
DataFrame attrs: {'a': 'b'} /  Series attrs: {'a': 'b', 'c': 'd'}
DataFrame after copy attrs: {'a': 'b'} /  Series attrs: {'a': 'b'}
Copied DataFrame attrs: {'a': 'b'} /  Series attrs: {'a': 'b'}

Also note the entry 'a': 'b' in the Series attrs, which is somehow copied from the DataFrame. I find that strange as well. Is it a bug or a feature?
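
To narrow this down, a small check like the one below could show whether df[0] returns the same Series object on every lookup. It is only a sketch: probe_column_identity is a helper name I made up, and my suspicion that the Series is cached and then rebuilt from the DataFrame after copy() is a guess I have not verified against the pandas source. I'm deliberately not listing an expected output, since it may differ between pandas versions.

```python
import pandas as pd


def probe_column_identity():
    """Report whether df[0] is the same Series object before and after a copy.

    Hypothetical helper for diagnosis only; it is not part of pandas.
    """
    df = pd.DataFrame([1, 2])
    df.attrs['a'] = 'b'

    first = df[0]    # first lookup of the column
    second = df[0]   # second lookup, before any copy
    df.copy()        # the copy itself is discarded on purpose
    third = df[0]    # lookup on the original after the copy

    return {
        'same_object_before_copy': first is second,
        'same_object_after_copy': first is third,
    }


print(pd.__version__, probe_column_identity())
```

If the first value is True and the second is False, that would be consistent with the attrs set on the Series living on an object that the original DataFrame no longer returns after the copy.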

I've also tried the _metadata approach instead, but it behaves in much the same way. It might use the same underlying logic:

df = pd.DataFrame([1, 2])
df._metadata.append('meta')
df.meta = 'a'
df[0]._metadata.append('meta')
df[0].meta = 'b'
print(f"DataFrame meta: {df.meta} /  Series meta: {df[0].meta}")
df_copy = df.copy()
print(f"DataFrame after copy meta: {df.meta} /  Series meta: {df[0].meta}")
print(f"Copied DataFrame meta: {df_copy.meta} /  Series meta: {df_copy[0].meta}")
<<<
DataFrame meta: a /  Series meta: b
DataFrame after copy meta: a /  Series meta: a
Copied DataFrame meta: a /  Series meta: a
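
As a stopgap, one option I'm considering is to keep the per-column metadata inside the DataFrame's own attrs instead of on the Series, since the DataFrame attrs do survive the copy in the output above. This is a minimal sketch under that assumption: the 'column_attrs' key and the two helper functions are names I invented, not a pandas convention or API.

```python
import pandas as pd

COLUMN_ATTRS_KEY = 'column_attrs'  # hypothetical key name, not defined by pandas


def set_column_attr(df, column, key, value):
    """Store one metadata entry for a column inside df.attrs."""
    per_column = df.attrs.setdefault(COLUMN_ATTRS_KEY, {})
    per_column.setdefault(column, {})[key] = value


def get_column_attrs(df, column):
    """Return the metadata dict stored for a column (empty if none was set)."""
    return df.attrs.get(COLUMN_ATTRS_KEY, {}).get(column, {})


df = pd.DataFrame([1, 2])
df.attrs['a'] = 'b'
set_column_attr(df, 0, 'c', 'd')

df_copy = df.copy()

# Both the original and the copy are read through df.attrs,
# so nothing depends on attrs attached to a Series.
print(get_column_attrs(df, 0))
print(get_column_attrs(df_copy, 0))
```

One caveat I'm unsure about: depending on the pandas version, the nested dict may be shared between the original and the copy rather than duplicated, so changing it on one side after copying could show up on the other.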

I'm using pandas 1.4.2 and Python 3.9.

lou