Existing DataFrame:
Id  status  countries
01  pass    ['xyx','Indonesia','brazil']
02  fail    ['PQ','XT','sri lanka']
03  pass    ['spain', 'india','xtx']
Expected DataFrame:
Id  status  countries                     filtered_countries_name
01  pass    ['xyx','Indonesia','brazil']  'Indonesia','brazil'
02  fail    ['PQ','XT','sri lanka']       'sri lanka'
03  pass    ['spain', 'india','xtx']      'spain', 'india'
I have a master list of the specific countries I want to check for, and I am comparing the list in each row of the countries column against it.
My approach:
countries_list = ['china', 'india', 'united states', 'indonesia', 'brazil', 'pakistan', 'nigeria', 'bangladesh', 'russia', 'japan', 'mexico', 'philippines', 'vietnam', 'ethiopia', 'egypt', 'germany', 'iran', 'turkey', 'democratic republic of the congo', 'thailand', 'france', 'united kingdom', 'italy', 'burma', 'south africa', 'south korea', 'colombia', 'spain', 'ukraine', 'tanzania', 'kenya', 'argentina', 'algeria', 'poland', 'sudan', 'uganda','Indonesia','brazil','spain','sri lanka']
import re
countries_re = '|'.join(str(v) for v in countries_list)
df['filtered_countries_name'] = df['countries'].str.extractall(countries_re)
But this does not produce the filtered column; it fails with the following error:
TypeError: incompatible index of inserted column with frame index
Any leads?
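
For reference, a likely cause of the error: str.extractall returns a DataFrame with a MultiIndex (one row per match, with a "match" level), so its index does not line up with the frame's index when it is assigned as a single column. It also expects the pattern to contain at least one capture group. One possible alternative, sketched below, skips the regex and filters each row's list against a lower-cased set built from the master list. This is only a sketch under assumptions: the countries column is assumed to hold real Python lists (with a fallback if it holds the string form of a list), the master list is shortened for illustration, and filter_countries is a hypothetical helper name.

```python
import ast

import pandas as pd

# Sample frame from the question; assumes `countries` holds real Python lists.
df = pd.DataFrame({
    "Id": ["01", "02", "03"],
    "status": ["pass", "fail", "pass"],
    "countries": [
        ["xyx", "Indonesia", "brazil"],
        ["PQ", "XT", "sri lanka"],
        ["spain", "india", "xtx"],
    ],
})

# Shortened master list for illustration only.
countries_list = ["china", "india", "indonesia", "brazil", "spain", "sri lanka"]

# Lower-cased set: fast membership test and case-insensitive comparison.
allowed = {name.lower() for name in countries_list}


def filter_countries(names):
    """Keep the names found in the master list, preserving order and case."""
    # Fallback (assumption): the cell may be the string form of a list.
    if isinstance(names, str):
        names = ast.literal_eval(names)
    return [name for name in names if name.strip().lower() in allowed]


# apply returns one value per row, so the result aligns with df's index.
df["filtered_countries_name"] = df["countries"].apply(filter_countries)
print(df)
```

If a comma-separated string is wanted instead of a list, the last step could be followed by df["filtered_countries_name"].str.join(", "). Comparing in lower case means 'Indonesia' and 'indonesia' both match, so the duplicate capitalised entries in the master list would not be needed.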