I am trying to extract substrings from a column of a pandas DataFrame. The candidate strings are in a list, and each row has to be matched against that list. I tried df.str.extract(list1), but I got an "unhashable type" error, so I guess the way I compare the list to the DataFrame is not correct.

From

Col 1   Col 2
1       The date
2       Three has come
3       Mail Sent
4       Done Deal

To

Col 1   Col 2           Col 3
1       The date        NaN
2       Three has come  Three has
3       Mail Sent       Mail
4       Done Deal       Done

My list looks like this:

List1 = ['Three has' , 'Mail' , 'Done' , 'Game' , 'Time has come']
yasin mohammed

1 Answer

You can use str.extract with all values of the list joined by |, which means OR in regex:

List1 = ['Three has' , 'Mail' , 'Done' , 'Game' , 'Time has come']
df['Col 3'] = df['Col 2'].str.extract("(" + "|".join(List1) +")", expand=False)
print (df)
   Col 1           Col 2      Col 3
0      1        The date        NaN
1      2  Three has come  Three has
2      3       Mail Sent       Mail
3      4       Done Deal       Done
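
If the list entries could contain regex metacharacters (such as ., + or parentheses), joining them as-is would change the meaning of the pattern. A minimal self-contained sketch of the same idea, with each entry passed through re.escape first, might look like this. The sample DataFrame is rebuilt from the question, and the variable name pattern is only illustrative:

```python
import re

import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Col 1": [1, 2, 3, 4],
    "Col 2": ["The date", "Three has come", "Mail Sent", "Done Deal"],
})
List1 = ["Three has", "Mail", "Done", "Game", "Time has come"]

# Escape each entry so it is matched literally, then join the entries
# with | (regex alternation) inside a single capture group.
pattern = "(" + "|".join(re.escape(s) for s in List1) + ")"

# expand=False returns a Series; rows without a match become NaN
df["Col 3"] = df["Col 2"].str.extract(pattern, expand=False)
print(df)
```

The alternation tries the entries from left to right at each position, so if one entry is a prefix of another, listing the longer one first is a possible tweak.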

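Another solution, using apply with a plain substring test (if several list entries occur in the same row, they are concatenated):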

List1 = ['Three has' , 'Mail' , 'Done' , 'Game' , 'Time has come']

df['Col 3'] = df['Col 2'].apply(lambda x: ''.join([L for L in List1 if L in x]))
df['Col 3'] = df['Col 3'].mask(df['Col 3'] == '')
print (df)
   Col 1           Col 2      Col 3
0      1        The date        NaN
1      2  Three has come  Three has
2      3       Mail Sent       Mail
3      4       Done Deal       Done
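
If a row may contain more than one list entry and only the first one (in list order) is wanted, a variant along these lines could be used instead of ''.join. This is a sketch under that assumption; the helper name first_match is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question
df = pd.DataFrame({
    "Col 1": [1, 2, 3, 4],
    "Col 2": ["The date", "Three has come", "Mail Sent", "Done Deal"],
})
List1 = ["Three has", "Mail", "Done", "Game", "Time has come"]


def first_match(text, candidates):
    """Return the first candidate found as a substring of text, else NaN."""
    return next((c for c in candidates if c in text), np.nan)


# Plain substring test per row; no regex involved, so no escaping is needed
df["Col 3"] = df["Col 2"].apply(lambda text: first_match(text, List1))
print(df)
```

For a string such as 'Mail Done', this returns 'Mail', whereas the ''.join version would return 'MailDone'.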
jezrael