I have the following pandas dataframe (pandas 0.20.2, python 3.6.2):
import re
import pandas as pd

# df=pd.DataFrame([['abc00010 Pathway'],['abc00020 Pathway']], columns=["ENTRY"])
# Each cell of ENTRY holds a one-element list, not a plain string
df=pd.DataFrame(columns=["ENTRY"])
df.loc[:,"ENTRY"]=[list(['abc00010 Pathway']),list(['abc00020 Pathway'])]
df["ENTRY2"]=df.loc[:,"ENTRY"]
df["ENTRY3"]=df.loc[:,"ENTRY"]
df["ENTRY4"]=df.loc[:,"ENTRY"]
df["ENTRY5"]=df.loc[:,"ENTRY"]
df["ENTRY6"]=df.loc[:,"ENTRY"]
dfcleaner=re.compile(r"\W+?Pathway")
df.loc[:,"ENTRY"]=df.loc[:,"ENTRY"].apply(str)
df.loc[:,"ENTRY"].replace(dfcleaner,"", inplace=True, regex=True)
df.loc[:,"ENTRY2"]=df.loc[:,"ENTRY2"].apply(str)
df.loc[:,"ENTRY2"].replace(dfcleaner,"")
df.loc[:,"ENTRY3"].replace(dfcleaner,"", inplace=True, regex=True)
df["ENTRY4"]=df.loc[:,"ENTRY4"].str.replace(dfcleaner,"")  # -> NaN, NaN
df.loc[:,"ENTRY5"]=df.loc[:,"ENTRY5"].replace(dfcleaner,"", inplace=True, regex=True)
df.loc[:,"ENTRY6"]=df.loc[:,"ENTRY6"].replace(dfcleaner,"", regex=True)
          ENTRY                ENTRY2                ENTRY3  ENTRY4  ENTRY5                ENTRY6
0  ['abc00010']  ['abc00010 Pathway']  ['abc00010 Pathway']     nan    None  ['abc00010 Pathway']
1  ['abc00020']  ['abc00020 Pathway']  ['abc00020 Pathway']     nan    None  ['abc00020 Pathway']
I expected ENTRY2 to be unchanged, as well as ENTRY3 and ENTRY6, since their values are not strings and were never converted to strings. I also expected ENTRY5 to be None, since replacing in place returns None.
What I did not expect was the ENTRY4 behavior with the string accessor. Could you explain it to me? I can't decide whether it is a bug; if it is one, it has not been reported yet.
EDIT: I changed the code at the top of the post, because the first version did not produce a DataFrame matching what I wanted and the results in my own code.
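To isolate the ENTRY4 case, here is a minimal sketch. It assumes a recent pandas in which `Series.str.replace` accepts a `regex` argument (as far as I know, that argument does not exist in 0.20.2), and the names `lists`, `from_lists` and `from_strings` are only illustrative. It suggests that the `.str` accessor yields NaN for cells that hold lists rather than strings, and that taking the string out of each list first gives the cleaned value:

```python
import re

import pandas as pd

cleaner = re.compile(r"\W+?Pathway")

# Each cell holds a one-element list, as in the ENTRY columns above.
lists = pd.Series([["abc00010 Pathway"], ["abc00020 Pathway"]])

# Applying .str.replace directly: the cells are lists, not strings,
# so every result comes back as a missing value.
from_lists = lists.str.replace(cleaner, "", regex=True)

# Taking element 0 of each list first yields real strings,
# which .str.replace can then clean.
from_strings = lists.str[0].str.replace(cleaner, "", regex=True)

print(from_lists)
print(from_strings)
```

If this holds, the NaN in ENTRY4 would come from the column containing lists rather than from the regex itself, since the same pattern works once the cells are plain strings.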