I have a list of words and a string. I want to take each word in the list of words and check if it matches any of the words in the string.

wordList = ['i', 'love', 'this', 'phone', 'amazing', 'be']

string = "love amazing great best good nice"

I have the following code:

    import re

    for word in wordList:
        posMatch = re.search(word, string)
        if posMatch:
            print(posMatch.group())

So in this case, the output is:

i
love
amazing
be

But I need the output to be:

love
amazing

It is taking "i" and "be" as matches because they are parts of some of the words in the string (for example, "be" is inside "best"). I could place a space before and after the regex, but I am not sure how to do that. Does anyone have a good way to do this? Thanks in advance.

modarwish

1 Answer

One way to accomplish this would be to split your string into a list, then check if each word is in the list:

string_list = string.split(' ')  # split once, outside the loop
for word in wordList:
    if word in string_list:
        print(word)
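
As a variant of the split approach, the words from the string can be stored in a set, so each lookup is a hash check rather than a scan of a list. This is only a minimal sketch, assuming Python 3. The names `word_list`, `text`, `text_words`, and `matches` are illustrative and not from the original post (`text` stands in for `string`, which is also the name of a standard module):

```python
word_list = ['i', 'love', 'this', 'phone', 'amazing', 'be']
text = "love amazing great best good nice"

# Split once, outside the loop. split() with no argument splits on any
# run of whitespace, so repeated spaces do not produce empty strings.
text_words = set(text.split())

# Iterate over word_list so the results keep the order of the list.
matches = [word for word in word_list if word in text_words]

for word in matches:
    print(word)
```

For a short string the difference is negligible; the set mostly pays off when the string or the word list is large.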

To do this using the re module, you'll need the `\b` regex anchor, which marks word boundaries:

for word in wordList:
    posMatch = re.search(r'\b%s\b' % word, string)
    if posMatch:
        print (posMatch.group())
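
If the words might contain regex metacharacters such as `.` or `+`, it is probably safer to pass them through `re.escape` first. The sketch below also combines all the words into one compiled pattern so the string is scanned once. It assumes Python 3, and the names `word_list`, `text`, `pattern`, and `found` are illustrative rather than taken from the original post:

```python
import re

word_list = ['i', 'love', 'this', 'phone', 'amazing', 'be']
text = "love amazing great best good nice"

# Escape each word, join them into one alternation, and wrap the whole
# group in \b anchors so only whole words match.
alternation = '|'.join(re.escape(word) for word in word_list)
pattern = re.compile(r'\b(?:%s)\b' % alternation)

# findall returns the matches in the order they occur in text.
found = pattern.findall(text)

for word in found:
    print(word)
```

Note that the results follow the order of the words in the string, not the order of the list, which may or may not matter for your use case.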
brobas
  • That's quite a few splits and look ups going on there though... – Jon Clements Dec 15 '14 at 07:02
  • What JonClements said: re-splitting `string` each time through the loop is _very_ wasteful. It'd be better to do it outside the loop. And you could make it faster to look up if you store the words from `string` in a set. Also, using a module name like `string` as a variable name is a bit confusing, but I realise you just copied that from the OP. – PM 2Ring Dec 15 '14 at 07:33
  • thank you for explaining this, I changed my answer. – brobas Dec 15 '14 at 07:36