If you want to check that two sentences contain the same words (with the same number of occurrences), you could split the sentences into words and sort them:
>>> sorted("hello world my name is foobar".split())
['foobar', 'hello', 'is', 'my', 'name', 'world']
>>> sorted("my name is foobar world hello".split())
['foobar', 'hello', 'is', 'my', 'name', 'world']
You could define the check in a function:
def have_same_words(sentence1, sentence2):
    return sorted(sentence1.split()) == sorted(sentence2.split())
print(have_same_words("hello world my name is foobar", "my name is foobar world hello"))
# True
print(have_same_words("hello", "hello hello"))
# False
print(have_same_words("hello", "holle"))
# False
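To see why sorting (rather than a set) is what enforces the "same number of occurrences" requirement, here is a small sketch contrasting the two. The name have_same_word_set is hypothetical and only used for illustration:

```python
def have_same_word_set(sentence1, sentence2):
    # Hypothetical helper: compares distinct words only, so counts are ignored.
    return set(sentence1.split()) == set(sentence2.split())

def have_same_words(sentence1, sentence2):
    # Sorted lists keep duplicates, so the word counts must match too.
    return sorted(sentence1.split()) == sorted(sentence2.split())

print(have_same_word_set("hello", "hello hello"))
# True
print(have_same_words("hello", "hello hello"))
# False
```

The set version would only be appropriate if repeated words should not matter.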
If case isn't important, you could compare lowercase sentences:
def have_same_words(sentence1, sentence2):
    return sorted(sentence1.lower().split()) == sorted(sentence2.lower().split())
print(have_same_words("Hello world", "World hello"))
# True
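If the sentences might contain non-ASCII text, str.casefold() is a more aggressive alternative to lower() for caseless matching. The following is only a sketch, and the function name have_same_words_casefold is made up for this example:

```python
def have_same_words_casefold(sentence1, sentence2):
    # Hypothetical variant: casefold() instead of lower(), so that
    # e.g. the German "ß" is treated as equal to "ss".
    return sorted(sentence1.casefold().split()) == sorted(sentence2.casefold().split())

print(have_same_words_casefold("Hello world", "World hello"))
# True
print(have_same_words_casefold("Straße", "STRASSE"))
# True
```

For plain English sentences, lower() and casefold() should give the same result.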
Note: you could also use collections.Counter instead of sorted. The complexity would be O(n) instead of O(n log n), which isn't a big difference anyway. import collections might even take longer than sorting the strings:
from collections import Counter
def have_same_words(sentence1, sentence2):
    return Counter(sentence1.lower().split()) == Counter(sentence2.lower().split())
print(have_same_words("Hello world", "World hello"))
# True
print(have_same_words("hello world my name is foobar", "my name is foobar world hello"))
# True
print(have_same_words("hello", "hello hello"))
# False
print(have_same_words("hello", "holle"))
# False
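If you are curious which version is faster on your machine, a rough timing sketch with timeit could look like this. The function names with_sorted and with_counter and the repetition count are arbitrary choices for the example, and the timings will vary between systems, so no expected output is shown:

```python
import timeit
from collections import Counter

def with_sorted(s1, s2):
    # O(n log n): sort both word lists and compare them.
    return sorted(s1.lower().split()) == sorted(s2.lower().split())

def with_counter(s1, s2):
    # O(n): count the words in each sentence and compare the counts.
    return Counter(s1.lower().split()) == Counter(s2.lower().split())

s1 = "hello world my name is foobar"
s2 = "my name is foobar world hello"

# Both approaches should agree before it makes sense to time them.
assert with_sorted(s1, s2) == with_counter(s1, s2)

for func in (with_sorted, with_counter):
    seconds = timeit.timeit(lambda: func(s1, s2), number=10_000)
    print(f"{func.__name__}: {seconds:.4f} s for 10,000 calls")
```

For sentences this short, the constant overhead of building the Counter objects will likely matter more than the asymptotic complexity.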