I have this Python 3 script that uses the 'collections' module to simply print out common words in a block of text and the number of times those words appear in that text.
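import collections

# text is assumed to be a list of words (e.g. some_string.split()); a plain string would be counted per character
word_counter = collections.Counter(text)
for word, count in word_counter.most_common(10):
    print(word, ": ", count)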
For example, it may print this out:
- red : 5
- sun : 2
- planet : 10
- moon : 7
- hydrogen : 22
I have another script that uses the matplotlib library to generate a bar chart:
import numpy as np
import matplotlib.pyplot as plt

words = ('red', 'sun', 'planet', 'moon', 'hydrogen')
y_pos = np.arange(len(words))
wordCount = [5,2,10,7,22]
plt.bar(y_pos, wordCount, align='center', alpha=0.5)
plt.xticks(y_pos, words)
plt.ylabel('Count')
plt.title('Common Word Count')
plt.savefig('wordcount.png')
plotImage = "wordcount.png"
htmldata = """
<div>
<img src="{plotImage}" />
</div>""".format(plotImage = plotImage)
print(htmldata)
As you can see, I hard-coded the data for "words" and "wordCount" in that script.
Is there a way to combine the two scripts so that they work together, so that the "words" and "wordCount" variables are populated with data from the 'for word, count...' loop?
That way I could feed it whatever text or paragraph I wanted and not have to hard-code any values.
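One possible shape for this, as a minimal sketch rather than a definitive solution: collect the pairs returned by most_common() and unpack them into two parallel lists. The helper name top_words, the regex tokenizer, and the sample text are assumptions made for illustration; they are not part of either script above.

```python
import collections
import re


def top_words(text, n=10):
    """Return (words, counts) for the n most common words in text.

    Hypothetical helper: the name and the tokenizing regex are assumptions.
    """
    # Lowercase the text and pull out runs of letters/apostrophes as words.
    tokens = re.findall(r"[a-z']+", text.lower())
    most_common = collections.Counter(tokens).most_common(n)
    if not most_common:
        # zip(*[]) would yield nothing to unpack, so handle empty input here.
        return [], []
    # most_common() returns [(word, count), ...]; zip(*...) splits the pairs
    # into one tuple of words and one tuple of counts.
    words, counts = zip(*most_common)
    return list(words), list(counts)


sample = "red sun red planet moon red planet"
words, counts = top_words(sample, n=2)
print(words, counts)
```

The resulting words and counts lists always have the same length, so they could stand in for the hard-coded "words" tuple and "wordCount" list in the plotting script without any other changes.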
I tried to add these two lines under the 'for word, count ... ' loop:
myWords = myWords + word + ","
myCount += [count]
But that throws this error:
ValueError: shape mismatch: objects cannot be broadcast to a single shape
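The shape mismatch most likely comes from myWords being a single comma-joined string: len() of a string counts characters, so np.arange(len(words)) ends up with a different length than the list of counts passed to plt.bar(). The sketch below illustrates the difference on stand-in data; the names tokens, joined, my_words, and my_counts are hypothetical.

```python
import collections

# Stand-in sample data; the real script would use its own list of words.
tokens = ["red", "sun", "red", "moon", "red", "sun"]
word_counter = collections.Counter(tokens)

# String concatenation: the result is one string, so len() counts characters.
joined = ""
for word, count in word_counter.most_common(10):
    joined = joined + word + ","
print(len(joined), "characters for", len(word_counter), "words")

# Appending to lists instead gives one entry per word, so the label list
# and the count list always have matching lengths.
my_words = []
my_counts = []
for word, count in word_counter.most_common(10):
    my_words.append(word)
    my_counts.append(count)
print(len(my_words) == len(my_counts))
```

With lists of equal length, the bar positions, bar heights, and tick labels should all agree in size.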
Thanks!