Here is the test code:

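import pandas as pd

# scalar values must be wrapped in lists, otherwise pandas raises a ValueError
testframe = pd.DataFrame({"A": ["1"], "B": ["1"], "C": ["1"], "D": ["1"]})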

dataframe1 = testframe 

#try to remove column C
dataframe1.drop(['C'],axis=1,inplace=True)

#then modify col D, like
>>A B D
  1 1 2

#rename col 'D' to 'E', now dataframe1 contains only 'A', 'B' and 'E'
#(does this mean )
dataframe1.rename(columns={"D": "E"}, inplace=True)
print(dataframe1)

# result is:
>>A B E
  1 1 2

#join the two dataframes (is this allowed? it seems that I joined a copy of 'testframe')

testframe = testframe.set_index(['A','B']).join(dataframe1.set_index(['A','B']))

print(testframe)

But the result is:

A B C D E
1 1 1 2 2

It looks like a join of the two dataframes, but in fact it is not.

Why did column D change?

Henry Ecker
Rui

2 Answers

You can use the pd.concat method to join two dataframes (see read_this).

  • Actually, I have tried 'pd.concat', but it generates a lot of redundant data. The root cause is that dataframe assignment is just a 'shallow copy' by default. Thanks all the same. – Rui Dec 15 '22 at 04:15
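
One possible source of the "redundant data" mentioned in the comment is that pd.concat stacks rows by default (axis=0). This is only a guess about what the commenter saw. The sketch below uses hypothetical frames named left and right (not from the question) to compare the default with axis=1, which aligns on the index instead:

```python
import pandas as pd

# Hypothetical frames that share the same (A, B) index
left = pd.DataFrame({"A": ["1"], "B": ["1"], "C": ["1"]}).set_index(["A", "B"])
right = pd.DataFrame({"A": ["1"], "B": ["1"], "E": ["2"]}).set_index(["A", "B"])

# Default axis=0: rows are appended, missing columns are filled with NaN
stacked = pd.concat([left, right])

# axis=1: columns are placed side by side, rows are aligned on the index
side_by_side = pd.concat([left, right], axis=1)

print(stacked.shape)       # (2, 2)
print(side_by_side.shape)  # (1, 2)
```

With axis=1 the result has one row per index key, which is closer to what join() produces here.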

I found the root cause.

    dataframe1 = testframe
    dataframe1.rename(columns={"D": "E"}, inplace=True)

The assignment above does not copy the data: dataframe1 is just another name for the original 'testframe' object (loosely, a 'shallow copy'), so the in-place rename also changes testframe. It works after changing the assignment to make a deep copy:

    dataframe1 = testframe.copy(deep=True)

Also, merge() gives a result similar to join() in this case.
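
To make the difference concrete, here is a minimal sketch that reuses the frame from the question. The variable names alias, same_object, original_columns, copy_columns and joined_columns are illustrative additions, not part of the original code:

```python
import pandas as pd

testframe = pd.DataFrame({"A": ["1"], "B": ["1"], "C": ["1"], "D": ["1"]})

# Plain assignment: both names refer to the same DataFrame object
alias = testframe
same_object = alias is testframe

# Explicit copy: an independent DataFrame that can be changed safely
dataframe1 = testframe.copy(deep=True)
dataframe1.drop(columns=["C"], inplace=True)
dataframe1.rename(columns={"D": "E"}, inplace=True)

original_columns = list(testframe.columns)  # the original is untouched
copy_columns = list(dataframe1.columns)

# Joining on (A, B) now combines two genuinely different frames
joined = testframe.set_index(["A", "B"]).join(dataframe1.set_index(["A", "B"]))
joined_columns = list(joined.columns)

print(same_object)       # True
print(original_columns)  # ['A', 'B', 'C', 'D']
print(copy_columns)      # ['A', 'B', 'E']
print(joined_columns)    # ['C', 'D', 'E']
```

Because dataframe1 is a real copy, dropping and renaming its columns leaves testframe alone, and the join no longer combines the frame with itself.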

Rui