I have a list of names and a DataFrame with a column of free-form text. I want to scan that column and, wherever the text contains a string from the list, put the matching string in an additional column of the DataFrame.
So far I have only found ways to get a binary or True/False flag in the additional column.
import pandas as pd
sys_list = ['AAAA', 'BBBB', 'AD-12', 'B31-A']
data = {'text': ['need help with AAAA system requesting help',
                 'AD-12 crashed, need support',
                 'fuel system down',
                 '/BBBB needs refresh']}
df = pd.DataFrame(data)
The end result should be:
                                         text  System
0  need help with AAAA system requesting help    AAAA
1                 AD-12 crashed, need support   AD-12
2                            fuel system down       0
3                         /BBBB needs refresh    BBBB
I have tried:
# which gives True or False values
pattern = '|'.join(sys_list)
df['System'] = df['text'].str.contains(pattern)
# which gives 0 or 1
df['System'] = [int(any(w in sys_list for w in x.split())) for x in df['text']]
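For comparison with the two attempts above, here is a minimal sketch of one possible approach that returns the matched name instead of a flag. It assumes that only the first match in each row is wanted and that a string placeholder '0' is acceptable for rows with no match; it uses `Series.str.extract` with the same joined pattern, with each name passed through `re.escape` so that characters like '-' are treated literally.

```python
import re
import pandas as pd

sys_list = ['AAAA', 'BBBB', 'AD-12', 'B31-A']
data = {'text': ['need help with AAAA system requesting help',
                 'AD-12 crashed, need support',
                 'fuel system down',
                 '/BBBB needs refresh']}
df = pd.DataFrame(data)

# Escape each name so regex metacharacters are matched literally,
# then wrap the alternation in a single capture group for str.extract.
pattern = '(' + '|'.join(re.escape(s) for s in sys_list) + ')'

# expand=False returns a Series holding the first match per row (NaN if none).
# Assumption: unmatched rows get the string '0' as a placeholder.
df['System'] = df['text'].str.extract(pattern, expand=False).fillna('0')

print(df)
```

Because the pattern is not anchored to word boundaries, '/BBBB' still yields 'BBBB'. If a row could mention several systems and all of them are needed, `Series.str.findall` with the same pattern (without the need for a capture group) might be a better fit.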