1

Let's say I have a function that takes in some string, and then I need to return the set of words in this string that occur exactly once. What is the best way to go about doing this? Would using dict be helpful? I've tried some pseudocode like:

counter = {}
def FindWords(string):
    for word in string.split():
        if (word is unique): counter.append(word)  # pseudocode: "unique" = occurs exactly once
    return counter

Is there a better way to implement this? Thanks!

edit:

Say I have: "The boy jumped over the other boy". I want to return "jumped," "over," and "other."

Also, I'd like to return this as a set, and not a list.

J. P.

5 Answers

3

You can use the Counter from collections and return a set of the words that occur only once.

from collections import Counter

sent = 'this is my sentence string this is also my test string'

def find_single_words(s):
    c = Counter(s.split(' '))
    return set(k for k,v in c.items() if v==1)

find_single_words(sent)
# returns:
{'also', 'sentence', 'test'}

To do this with just the base Python utilities, you can use a dictionary to keep count of the occurrences, replicating the functionality of Counter.

sent = 'this is my sentence string this is also my test string'

def find_single_words(s):
    c = {}  # word -> number of occurrences
    for word in s.split(' '):
        if word not in c:
            c[word] = 1
        else:
            c[word] = c[word] + 1
    # keep only the words counted exactly once
    return {k for k, v in c.items() if v == 1}

find_single_words(sent)
# returns:
{'sentence', 'also', 'test'}
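For the sentence in the question, "The" and "the" need to be treated as the same word, so the input has to be lowercased before counting. A minimal sketch, assuming case-insensitive matching is what is wanted (the name `find_single_words_ci` is only illustrative and not part of the code above):

```python
from collections import Counter

def find_single_words_ci(s):
    # Lowercase first so 'The' and 'the' are counted as the same word.
    # split() with no argument also collapses runs of whitespace.
    counts = Counter(s.lower().split())
    return {word for word, n in counts.items() if n == 1}

result = find_single_words_ci("The boy jumped over the other boy")
print(result == {"jumped", "over", "other"})  # prints True
```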
James
  • Is there a way to do this without importing outside tools like Counter? – J. P. Oct 03 '17 at 22:24
  • @J.P. `collections` is part of the standard library, it is not really an outside tool – James Oct 03 '17 at 22:24
  • @J.P. I added an additional part to my answer, see above – James Oct 03 '17 at 22:31
  • Hi, thanks! Do you know how you would change this if you wanted to return a set instead of a list? Instead of c.items(), could you return a set instead? – J. P. Oct 03 '17 at 22:39
  • @J.P. Sure, I modified the second part of my answer to return a set – James Oct 04 '17 at 00:50
  • Great, thanks a lot! when you use: return[k for k,v in c.items() if v==1], is v being newly defined here as an index of c? – J. P. Oct 04 '17 at 03:20
  • @James, if you test it with the OP's input ("The boy jumped over the other boy"), your code returns `{'The', 'jumped', 'other', 'over', 'the'}`, which is not what the OP wanted. The words should be converted to lowercase, then look for their frequency. – srikavineehari Oct 15 '17 at 07:49
0

This might be what you have in mind.

>>> counts = {}
>>> sentence =  "The boy jumped over the other boy"
>>> for word in sentence.lower().split():
...     if word in counts:
...         counts[word]+=1
...     else:
...         counts[word]=1
...         
>>> [word for word in counts if counts[word]==1]
['other', 'jumped', 'over']
>>> set([word for word in counts if counts[word]==1])
{'other', 'jumped', 'over'}

But using defaultdict from collections, as someone else suggested, is nicer.
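A minimal sketch of the same counting loop with `defaultdict(int)`, which removes the need for the if/else because missing keys start at 0. The function name `count_words` is made up for this example:

```python
from collections import defaultdict

def count_words(sentence):
    counts = defaultdict(int)  # a missing key is created with value 0
    for word in sentence.lower().split():
        counts[word] += 1
    return counts

counts = count_words("The boy jumped over the other boy")
uniques = {word for word, n in counts.items() if n == 1}
print(uniques == {"jumped", "over", "other"})  # prints True
```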

Bill Bell
0
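s = 'The boy jumped over the other boy'
def func(s):
    l = []
    s = s.split(' ')  # edit for case-sensitivity here
    for i in range(len(s)):
        # keep s[i] only if it appears neither after nor before position i
        # (s[:i] is used because s[i-1::-1] wraps around to the whole list when i == 0)
        if s[i] not in s[i+1:] and s[i] not in s[:i]:
            l.append(s[i])
    return set(l)  # convert to set and return
print(func(s))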

This should work fine.

For each element, check whether any element before or after it in the list matches it; if none does, append it.

If you do not want case sensitivity, you can add s=s.lower() or s=s.upper() before splitting it.

TubbyStubby
  • Going through the entire word list for every word makes this an O(n^2) algorithm, which can get pretty slow as the input gets bigger. Using a dictionary to count the number of occurrences would scale to large inputs a lot better. – Bass Oct 03 '17 at 23:14
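As a rough illustration of the dictionary approach mentioned in the comment, the sketch below counts every word in one pass and then filters, so the work grows roughly linearly with the number of words. It keeps the case-sensitive behaviour of the answer above; the name `words_seen_once` is hypothetical:

```python
def words_seen_once(text):
    counts = {}
    for word in text.split(' '):
        # dict.get returns 0 for a word that has not been seen yet
        counts[word] = counts.get(word, 0) + 1
    return {word for word, n in counts.items() if n == 1}

once = words_seen_once('The boy jumped over the other boy')
print(once == {'The', 'jumped', 'over', 'the', 'other'})  # prints True
```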
0

You can try this:

s = "The boy jumped over the other boy"
s1 = {"jumped", "over", "other"}
final_counts = [s.count(i) for i in s1]

Output:

[1, 1, 1]
Ajax1234
0

Try this.

>>> sentence = "The boy jumped over the other boy"
>>> set(word for word in sentence.lower().split() if sentence.lower().split().count(word) == 1)
{'other', 'over', 'jumped'}
>>> 

Edit: This is easier to read:

>>> sentence = 'The boy jumped over the other boy'
>>> words = sentence.lower().split()
>>> uniques = {word for word in words if words.count(word) == 1}
>>> uniques
{'over', 'other', 'jumped'}
>>> type(uniques)
<class 'set'>
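One thing to watch with `count`: called on a string it counts substring matches, while called on a list it counts equal elements. A small illustration of the difference (the sample sentence is made up for this sketch):

```python
sentence = "the other boy"
words = sentence.split()

substring_hits = sentence.count("the")  # also matches the "the" inside "other"
word_hits = words.count("the")          # compares whole words only

print(substring_hits, word_hits)  # prints: 2 1
```

This is why counting on the split word list is the safer choice when whole words are what matter.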
srikavineehari