I have a dataframe with a column named "ORIGINATOR". For each row, I would like to check whether the string in that column contains a word that appears in a separate list. If it does, the ORGINATOR_PREDICTION column for that row should be updated to "Org".
Is there a better way to do this? I did it the following way, but it's slow.
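for idx, row in df['ORIGINATOR'].items():
    # Split the originator string into whitespace-separated words.
    splits = str(row).split()
    for word in splits:
        if word in COMMON_ORG_UNIGRAMS_LIST:
            # Update only this row, not the whole column.
            df.loc[idx, 'ORGINATOR_PREDICTION'] = 'Org'
            break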
import pandas as pd

df = pd.DataFrame({'ORIGINATOR': ['JOHN DOE', 'APPLE INC', 'MIKE LOWRY'],
                   'ORGINATOR_PREDICTION': ['Person', 'Person', 'Person']})
COMMON_ORG_UNIGRAMS_LIST = ['INC','LLC','LP']
Concretely, row 2 of the dataframe, "APPLE INC", should end up with ORGINATOR_PREDICTION = 'Org', not 'Person'.
The reason is that the word INC appears in COMMON_ORG_UNIGRAMS_LIST.
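
For reference, here is a minimal sketch of the result I am after, written without the nested loop. It assumes that a "word" means a whitespace-separated token and that matching is case-sensitive. The names org_words and has_org_word are just illustrative, and this is not necessarily the fastest option.

```python
import pandas as pd

df = pd.DataFrame({'ORIGINATOR': ['JOHN DOE', 'APPLE INC', 'MIKE LOWRY'],
                   'ORGINATOR_PREDICTION': ['Person', 'Person', 'Person']})
COMMON_ORG_UNIGRAMS_LIST = ['INC', 'LLC', 'LP']

# A set gives constant-time membership checks (illustrative helper name).
org_words = set(COMMON_ORG_UNIGRAMS_LIST)

# Boolean mask: True where the originator shares at least one
# whitespace-separated word with org_words.
has_org_word = df['ORIGINATOR'].astype(str).str.split().apply(
    lambda words: not org_words.isdisjoint(words)
)

# Update only the matching rows in a single assignment.
df.loc[has_org_word, 'ORGINATOR_PREDICTION'] = 'Org'
print(df)
```

The mask is built once for the whole column and the assignment touches only the matching rows, so the other rows keep their existing 'Person' value.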