I have a corpus of text containing sentences.
I want to count the number of occurrences of each word without adding any word to the result more than once (e.g. multiple occurrences of ',' must be added once, returning something like ',': 2047).
Desired output: 'partner': 7, 'meetings': 7, '14': 7, 'going': 7, etc.
I realize that I need to use a set() to avoid duplicates, but I don't know how. Currently, I try to avoid adding elements that are already in the list by appending only if the word is not already in occurrences.
This, however, isn't working: I am getting ',': 2047 multiple times in the result.
I am avoiding list comprehensions in the sample code to increase the reader's comprehension! :P
# Counting occurrences of words[i] in words
occurrences = []
for i in range(1, words.__len__() - 1):
    if words[i-1] not in occurrences:
        occurrences.append((words[i - 1], words.count(words[i - 1])))
print(occurrences)
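A likely cause of the repeated entries: occurrences holds (word, count) tuples, so the test "words[i-1] not in occurrences" compares a plain string against tuples and is always true. The range(1, len - 1) bounds also appear to skip the last two words. Below is a minimal sketch of the set() idea mentioned above, assuming words is a flat list of string tokens; the sample list and the names seen and counts are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data; the real `words` list comes from the corpus.
words = ["going", ",", "partner", ",", "going", "meetings", ","]

# Approach 1: keep a separate set of words already counted, so the
# membership test compares strings with strings (not strings with tuples).
seen = set()
occurrences = []
for word in words:  # iterating directly also covers the final words
    if word not in seen:
        seen.add(word)
        occurrences.append((word, words.count(word)))

print(occurrences)

# Approach 2: collections.Counter does the same bookkeeping in one pass
# over the list instead of calling words.count() once per distinct word.
counts = Counter(words)
print(counts[","])
```

On a large corpus the Counter version is probably the better fit, since words.count() rescans the whole list for every distinct word, while Counter reads the list once.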