I have two data frames. Both have the same set of columns, but some columns are categorical (with categories derived from the values they actually contain). To combine them, I reset the categories of each categorical column to the union of the categories from both frames.
def appendDFsWithCat(df1, df2):
    columns = df1.select_dtypes(include=['category']).columns
    for c in columns:
        catValues1 = list(df1[c].cat.categories)
        catValues2 = list(df2[c].cat.categories)
        catValues = list(set(catValues1 + catValues2))
        df1[c] = df1[c].cat.set_categories(catValues)
        df2[c] = df2[c].cat.set_categories(catValues)
    return df1.append(df2, ignore_index=True).reset_index(drop=True)
Everything works as expected, but I would like to understand why a SettingWithCopyWarning is raised when this line executes:
df1[c] = df1[c].cat.set_categories(catValues)
Utility.py:149: SettingWithCopyWarning:
I have not found any other way to refresh the categories than the one used above.
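
For comparison, here is a minimal sketch of a variant that works on explicit copies of both frames. It rests on an assumption the post does not confirm: that `df1` (or `df2`) was created as a slice or filter of another DataFrame, which is one common trigger for SettingWithCopyWarning on column assignment. The name `append_dfs_with_cat` is hypothetical, `pd.concat` is swapped in for `DataFrame.append` (which newer pandas versions no longer provide), and the categoricals are assumed to be unordered, since `Index.union` returns the categories in sorted order.

```python
import pandas as pd


def append_dfs_with_cat(df1, df2):
    # Assumption: the warning comes from df1/df2 being views or slices of
    # another frame. Copying first gives pandas independent objects to write to.
    df1 = df1.copy()
    df2 = df2.copy()
    for c in df1.select_dtypes(include=["category"]).columns:
        # Union of both category indexes (sorted when the values are sortable).
        shared = df1[c].cat.categories.union(df2[c].cat.categories)
        df1[c] = df1[c].cat.set_categories(shared)
        df2[c] = df2[c].cat.set_categories(shared)
    # Both sides now share an identical categorical dtype, so concat keeps it.
    return pd.concat([df1, df2], ignore_index=True)
```

Because the function copies its inputs, the caller's frames keep their original categories, unlike the version above, which modifies `df1` and `df2` in place.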