I was trying to write code that finds the 10 most frequent words in a text. I'm new to Python and am more used to languages like C#, Java, or even C++. Here is what I did:
f = open("bigtext.txt","r")
word_count = {}
Basically, my idea is to create a dictionary that holds the number of times each word appears in my text. If a word is not yet in the dictionary, I add it with a value of 1. If the word is already in the dictionary, I increment its value by 1.
for x in f.read().split():
    if x not in word_count:
        word_count[x] = 1
    else:
        word_count[x] += 1
sorted(word_count.values)
Here, I sort my dictionary by values (since I'm looking for the 10 most frequent words, I need the 10 words with the biggest values).
for keys,values in word_count.items():
    values = values + 1
    print(word_count[-values])
    if values == 10:
        break
Here is the part where it all fails. I am now sure (since I sorted my dictionary by its values) that my 10 most frequent words are the last 10 elements of my dictionary. I want to display those. So I decided to initialize values at 1 and to display my dictionary backward until values = 10, so that I won't display more than I need. But unfortunately, I get the following error:
  File "<ipython-input-19-f5241b4c239c>", line 13
    for keys,values in word_count.items()
                                         ^
SyntaxError: invalid syntax
I know that my mistake is that I'm not displaying my dictionary backwards correctly, but I don't know how else to proceed. If someone can tell me how to properly display the last 10 elements of my dictionary, I would very much appreciate it. Thank you.
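For reference, here is a minimal sketch of one way the goal described above could be reached. It assumes words are split on whitespace only, as in the code above. The names `top_words` and `sample` are hypothetical and used only for illustration. The sketch relies on two facts: `sorted()` returns a new list rather than reordering the dictionary, and a dictionary cannot be indexed by position, so the sorted list is sliced instead.

```python
def top_words(text, n=10):
    """Return the n most frequent whitespace-separated words as (word, count) pairs."""
    word_count = {}
    for word in text.split():
        # get() returns 0 for a word not seen yet, so no if/else is needed.
        word_count[word] = word_count.get(word, 0) + 1
    # sorted() returns a new list of (word, count) pairs; the dictionary is unchanged.
    # reverse=True puts the largest counts first.
    ranked = sorted(word_count.items(), key=lambda pair: pair[1], reverse=True)
    # A list can be sliced by position, unlike a dictionary.
    return ranked[:n]


# Hypothetical sample text; in the question this would be the contents of bigtext.txt.
sample = "the cat and the dog and the bird"
for word, count in top_words(sample, n=2):
    print(word, count)
```

The standard library's `collections.Counter` offers a `most_common(n)` method that should give the same kind of result with less code, if importing it is acceptable.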