How do I check a string for certain words without getting matches for parts of the words in the string in Python?

Question

I have a list of words and a string. I want to take each word in the list of words and check if it matches any of the words in the string.

    wordList = ['i', 'love', 'this', 'phone', 'amazing', 'be']

    string = "love amazing great best good nice"

I have the following code:

    import re

    for word in wordList:
        posMatch = re.search(word, string)
        if posMatch:
            print(posMatch.group())

So in this case, the output is:

i
love
amazing
be

But I need the output to be:

love
amazing

It is taking "i" and "be" as matches because they are parts of some of the words in the string ("amazing" and "best", for example). I could place a space before and after the word in the regex, but I am not sure how to do that. Does anyone have a good way to do this? Thanks in advance.

If you're doing this for production code, you might want to take a look at [nltk](http://www.nltk.org/) — John La Rooy, Dec 15 '14 at 06:47
@Barmar thanks. But can you show me how to write up the code for that? — modarwish, Dec 15 '14 at 06:55
@Barmar. What about just in a comment? I would appreciate it. — modarwish, Dec 15 '14 at 06:59
Is there some problem with the answers in the other question? — Barmar, Dec 15 '14 at 07:01

brobas · Accepted Answer · 2014-12-15T07:34:05.987


One way to accomplish this would be to split your string into a list, then check if each word is in the list:

    # Split once, outside the loop, so the string is not re-split for every word.
    string_list = string.split(' ')
    for word in wordList:
        if word in string_list:
            print(word)

To do this using the re module, you'll need the `\b` regex anchor, which matches only at word boundaries:

    import re

    for word in wordList:
        # \b matches only at a word boundary, so 'be' no longer matches inside 'best'.
        posMatch = re.search(r'\b%s\b' % word, string)
        if posMatch:
            print(posMatch.group())
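If the words in the list could ever contain regex metacharacters such as `.` or `+`, it is probably safer to escape them before building the pattern. The following is only a sketch of that idea: `find_whole_words` is a hypothetical helper name, not something from the question, and the sample data is copied from the question.

```python
import re

def find_whole_words(words, text):
    """Return the entries of `words` that occur in `text` as whole words."""
    found = []
    for word in words:
        # re.escape keeps characters such as '.' or '+' literal;
        # \b on each side rejects matches inside longer words.
        pattern = r'\b' + re.escape(word) + r'\b'
        if re.search(pattern, text):
            found.append(word)
    return found

word_list = ['i', 'love', 'this', 'phone', 'amazing', 'be']
text = "love amazing great best good nice"
print(find_whole_words(word_list, text))  # ['love', 'amazing']
```

Note that `\b` only marks a transition between word and non-word characters, so a search word that begins or ends with punctuation may not behave as expected with this pattern.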

edited Dec 15 '14 at 07:34

answered Dec 15 '14 at 06:59


That's quite a few splits and lookups going on there though... – Jon Clements Dec 15 '14 at 07:02
What JonClements said: re-splitting `string` each time through the loop is _very_ wasteful. It'd be better to do it outside the loop. And you could make it faster to look up if you store the words from `string` in a set. Also, using a module name like `string` as a variable name is a bit confusing, but I realise you just copied that from the OP. – PM 2Ring Dec 15 '14 at 07:33
thank you for explaining this, I changed my answer. – brobas Dec 15 '14 at 07:36
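The suggestion in the comments above (split the string once and keep its words in a set) might look roughly like this. It is a minimal sketch: `words_present` is a hypothetical helper name, the variable is called `text` rather than `string` to avoid shadowing the module name, and it assumes the words are separated by whitespace with no punctuation attached.

```python
def words_present(words, text):
    """Return the entries of `words` that appear as whole words in `text`."""
    # Split once, outside the loop; split() with no argument also
    # collapses runs of whitespace.
    text_words = set(text.split())
    # Membership tests against a set are constant time on average.
    return [word for word in words if word in text_words]

word_list = ['i', 'love', 'this', 'phone', 'amazing', 'be']
text = "love amazing great best good nice"
print(words_present(word_list, text))  # ['love', 'amazing']
```

Because the result is built by iterating over `words`, the output keeps the order of the word list rather than the order in which the words appear in the string.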
