21

I have a DataFrame and a list of keywords:

import pandas as pd

df = pd.DataFrame({'IDs': [1234, 5346, 1234, 8793, 8793],
                   'Names': ['APPLE ABCD ONE', 'APPLE ABCD', 'NO STRAWBERRY YES', 'ORANGE AVAILABLE', 'TEA AVAILABLE']})

kw = ['APPLE ABCD', 'ORANGE', 'LEMONS', 'STRAWBERRY', 'BLUEBERRY', 'TEA COFFEE']

I want to create a new column, flag, that is 1 if the Names column contains any keyword from kw as a substring, and 0 otherwise.

Expected Output:

    IDs     Names               flag
0   1234    APPLE ABCD ONE      1
1   5346    APPLE ABCD          1
2   1234    NO STRAWBERRY YES   1
3   8793    ORANGE AVAILABLE    1
4   8793    TEA AVAILABLE       0

I can get this output with the code below:

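ind = []
for idx, value in df.iterrows():
    found = False
    for u in kw:
        # Substring test: does this keyword occur anywhere in the name?
        if u in value['Names']:
            found = True
            break
    # Store 1/0 rather than True/False to match the expected output
    ind.append(int(found))

df['flag'] = ind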

Is there a way to avoid the for loop and make this more efficient?

Sociopath
  • Possible duplicate of [check if string in pandas dataframe column is in list](https://stackoverflow.com/questions/17972938/check-if-string-in-pandas-dataframe-column-is-in-list) – Space Impact Nov 17 '18 at 11:36

2 Answers

27

Use apply with a lambda that checks whether any keyword is a substring of the name:

df['Names'].apply(lambda x: any([k in x for k in kw]))

0     True
1     True
2     True
3     True
4    False
Name: Names, dtype: bool
Franco Piccolo
  • It works perfectly, thanks Franco. To count the True values in the result, you can use: names = df['Names'].apply(lambda x: any([k in x for k in kw])); names.value_counts() – Helen Kapatsa May 24 '22 at 16:59
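
As a possible loop-free alternative to apply, here is a minimal sketch using pandas' vectorized str.contains. It assumes the keywords should be matched literally (hence re.escape) and that a 1/0 integer column is wanted. The name pattern is introduced only for illustration.

```python
import re

import pandas as pd

df = pd.DataFrame({'IDs': [1234, 5346, 1234, 8793, 8793],
                   'Names': ['APPLE ABCD ONE', 'APPLE ABCD', 'NO STRAWBERRY YES',
                             'ORANGE AVAILABLE', 'TEA AVAILABLE']})
kw = ['APPLE ABCD', 'ORANGE', 'LEMONS', 'STRAWBERRY', 'BLUEBERRY', 'TEA COFFEE']

# Escape each keyword so it is treated as literal text, then join the
# keywords with "|" so the regex matches if any one of them occurs.
pattern = '|'.join(re.escape(k) for k in kw)

# str.contains returns a boolean Series; astype(int) converts it to 1/0.
df['flag'] = df['Names'].str.contains(pattern, regex=True).astype(int)
print(df)
```

Whether this is faster than apply depends on the data size and the number of keywords, so it is worth timing both on your own data.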
22

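You can use the isin function of pandas, but note that it tests whether the whole cell value equals one of the keywords, not whether a keyword appears inside it, so on the question's data it is True only for row 1 ('APPLE ABCD'):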

df['Names'].isin(kw)
Aditya Lahiri
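
To make the difference between the two approaches concrete, here is a small sketch on the question's data that compares isin (exact match on the whole value) with a substring test. The variable names exact and substring are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'IDs': [1234, 5346, 1234, 8793, 8793],
                   'Names': ['APPLE ABCD ONE', 'APPLE ABCD', 'NO STRAWBERRY YES',
                             'ORANGE AVAILABLE', 'TEA AVAILABLE']})
kw = ['APPLE ABCD', 'ORANGE', 'LEMONS', 'STRAWBERRY', 'BLUEBERRY', 'TEA COFFEE']

# isin: True only where the entire cell value is one of the keywords.
exact = df['Names'].isin(kw)

# Substring test: True where any keyword occurs anywhere inside the name.
substring = df['Names'].apply(lambda name: any(k in name for k in kw))

print(pd.DataFrame({'Names': df['Names'], 'isin': exact, 'substring': substring}))
```

Only the substring column reproduces the expected output in the question. isin is the better fit when Names is known to hold exactly one of the keywords.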