I want to use this, but on the words in a text file, and I am not sure how to do that.
len(max(words, key=len))
Does anyone know how I can accomplish this?
Also, how do I find out how many times a word of length 6 or 7 appears in a text file?
All you have to do is identify the expected input, which I assume is mainly words, and think about how you can read a file so that it yields what words is expected to be.
My guess is that words could safely be a list of str. Now that we have identified the data structure of the input, let's read a sample file that produces this data structure as words.
Assume you have a plain file with content, named sample.txt:
a
bc
def
Your code to read it could be (very bare-bones):
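with open('sample.txt') as f:
    # splitlines() drops the trailing newline that readlines() would keep,
    # so the newline is not counted in the length
    words = f.read().splitlines()
print(len(max(words, key=len)))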
Keep in mind that you may run into obstacles such as different file formats or empty lines that need to be cleaned out of the text file; the official Python documentation is a good place to dive deeper. I hope this gives you a good starting point.
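As one illustration of the "empty lines" obstacle, here is a minimal sketch that skips blank lines before measuring. The helper name longest_line_length is hypothetical, and the sample list stands in for the lines of a file with one word per line, which is an assumption about the file layout.

```python
def longest_line_length(lines):
    """Return the length of the longest non-empty line, or 0 if there is none."""
    # strip() removes the trailing newline and surrounding whitespace;
    # lines that are empty after stripping are skipped.
    words = [line.strip() for line in lines if line.strip()]
    return max((len(word) for word in words), default=0)


# A file object iterates line by line, so with a real file one could write:
#     with open('sample.txt') as f:
#         print(longest_line_length(f))
sample = ["a\n", "\n", "bc\n", "   \n", "def\n"]
print(longest_line_length(sample))
```

Passing default=0 to max avoids a ValueError when the file turns out to be empty.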
Sounds like you need help opening and reading the text file:
with open('words.txt', 'r') as words_file:
    # read the whole file and split it on whitespace
    words = words_file.read().split()
print(len(max(words, key=len)))
First, you read the file. Then, you get a list of words from the text by splitting on whitespace, which works like this:
>>> "This is a test.".split()
['This', 'is', 'a', 'test.']
You should note that this doesn't handle punctuation (the longest word in "This is a test." would be "test.", or 5 chars), so if you need to filter out punctuation, that would be a separate step.
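That separate step could look something like the sketch below. It assumes that stripping leading and trailing ASCII punctuation (string.punctuation) is enough for your text; the helper name clean_words is made up for illustration.

```python
import string


def clean_words(text):
    """Split text on whitespace and strip punctuation from both ends of each word."""
    stripped = (word.strip(string.punctuation) for word in text.split())
    # Drop tokens that consisted only of punctuation and are now empty.
    return [word for word in stripped if word]


words = clean_words("This is a test.")
print(words)                     # ['This', 'is', 'a', 'test']
print(len(max(words, key=len)))  # 4
```

Punctuation inside a word, such as the apostrophe in "don't", is left alone, since strip only touches the ends.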
In response to your follow-up edit:
with open('textfile.txt') as f:
    words = f.read().split()
# one length per word, in file order
sizes = list(map(len, words))
print('Maximum word length: {}'.format(max(sizes)))
print('6 letter count: {}'.format(sizes.count(6)))
from itertools import chain

with open('somefile') as fin:
    # lazy generators: they must be consumed while the file is still open
    words = (line.split() for line in fin)
    all_words = chain.from_iterable(words)
    print(max(all_words, key=len))
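What this does is take the input file, build a generator that splits each line on whitespace, then chain the resulting lists into a single stream of words as input to max, which returns the longest word (wrap it in len to get its length).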
Given your edit, then:
from itertools import chain
from collections import Counter

with open('somefile') as fin:
    words = (line.split() for line in fin)
    all_words = chain.from_iterable(words)
    # maps each word length to the number of words with that length
    word_lengths = Counter(len(word) for word in all_words)
print(word_lengths)
And work from that...
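For instance, a rough sketch of working from that Counter to answer both parts of the question might be the following. The helper name length_counts and the sample lines are made up for illustration; with a real file you would pass the open file object instead of the list.

```python
from collections import Counter


def length_counts(lines):
    """Map each word length to the number of words of that length."""
    return Counter(len(word) for line in lines for word in line.split())


sample = ["python is simple\n", "reading files needs care\n"]
word_lengths = length_counts(sample)

# A Counter returns 0 for missing keys, so this is safe even if
# no word of length 6 or 7 occurs.
six_or_seven = word_lengths[6] + word_lengths[7]

print(max(word_lengths))  # longest word length: 7
print(six_or_seven)       # words of length 6 or 7: 3
```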