I have a list like:

keyword_list = ['motorcycle love hobby ', 'bike love me', 'cycle', 'dirtbike cycle motorbike ']

I want to find these words in a pandas DataFrame column, and if 3 words match, create a new column containing the matched words.

I need something like this:

[image: expected output]

macropod
rudra

1 Answer

You can probably use set operations:

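# Map each keyword phrase to the set of words it contains
kw = {s: set(s.split()) for s in keyword_list}

def subset(s):
    # Words present in the description
    S1 = set(s.split())
    for k, S2 in kw.items():
        # Match when every word of the phrase appears in the description
        if S2.issubset(S1):
            return k
    # No phrase matched
    return None

# Lowercase the descriptions so the comparison is case-insensitive
df['trigram'] = [subset(s) for s in df['description'].str.lower()]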

print(df)

Output:

                                   description                 trigram
0  I love motorcycle though I have other hobby   motorcycle love hobby 
1                                  I have bike                    None
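
If "3 words match" is meant literally (at least three shared words, rather than every word of the phrase), a variation along these lines might work. This is only a sketch: the names `keyword_sets`, `first_matching_phrase`, and `min_words` are made up for illustration, and the phrases are stripped so the result has no trailing space. Note that with `min_words=3` the one-word phrase 'cycle' can never match.

```python
keyword_list = ['motorcycle love hobby ', 'bike love me', 'cycle', 'dirtbike cycle motorbike ']

# Split each phrase into a set of words once; strip the key to drop trailing spaces.
keyword_sets = {phrase.strip(): set(phrase.split()) for phrase in keyword_list}

def first_matching_phrase(text, min_words=3):
    """Return the first phrase sharing at least `min_words` words with `text`, else None."""
    words = set(text.lower().split())
    for phrase, phrase_words in keyword_sets.items():
        # Count the words common to the phrase and the description
        if len(phrase_words & words) >= min_words:
            return phrase
    return None

descriptions = ['I love motorcycle though I have other hobby', 'I have bike']
results = [first_matching_phrase(d) for d in descriptions]
print(results)
```

With a DataFrame, the same function could presumably be applied with `df['trigram'] = df['description'].map(first_matching_phrase)`, assuming the column contains only strings.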
mozway