I would like to read a text file word by word, on demand, like ifstream in C++. That is, I want to open the file, read the next word from it whenever I need one, and then close it. How do I do that?

WhoCares
- @DerekO No. Anyway, thanks for ban. – WhoCares Jan 26 '22 at 22:24
- Welcome back to Stack Overflow. As a refresher, please read [ask] and https://meta.stackoverflow.com/questions/261592. – Karl Knechtel Jan 26 '22 at 22:56
1 Answer
You can write a generator function that will:
- Read the contents of the file as lines.
- Collect all the words from those lines in an iterator.
- Yield words from the iterator one by one.
Consider this file, foo.txt:
This is an example of speech synthesis in English.
This is an example of speech synthesis in Bangla.
The following code yields the words one by one. However, it still reads the entire file into memory at once rather than word by word. Reading truly word by word would mean tracking the cursor position line by line and then word by word, which can become even more expensive than reading the entire file at once or reading it chunk by chunk.
# In Python < 3.9, import Generator from the 'typing' module instead.
from collections.abc import Generator


def word_reader(file_path: str) -> Generator[str, None, None]:
    """Read a file from the file path and return a
    generator that yields the contents of the file
    as words.

    Parameters
    ----------
    file_path : str
        Path of the file.

    Yields
    ------
    str
        Words, one by one.
    """
    with open(file_path, "r") as f:
        # Read the entire file into memory. readlines() returns a list of lines.
        r = f.readlines()
        # Lazily collect the words from all the lines in a generator expression.
        words = (word for sentence in r for word in sentence.split(" ") if word)
        # This is equivalent to: 'for word in words: yield word'.
        yield from words


if __name__ == "__main__":
    wr = word_reader("./foo.txt")
    for word in wr:
        # Strip the trailing period and newline from the last word on a line.
        if word.endswith(".\n"):
            word = word.replace(".\n", "")
        print(word)
This prints:
This
is
an
example
of
speech
synthesis
in
English
...
To avoid loading the whole file, you can instead read it chunk by chunk and yield the words from each chunk one by one in the same way.
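The chunk-by-chunk idea could look roughly like the sketch below. It is only an illustration, not part of the original answer: the name iter_words and the chunk_size parameter are made up for this example. It splits on any whitespace, so punctuation stays attached to the words (for example "English."), and it holds back the last word of a chunk in case that word continues in the next chunk.

```python
from collections.abc import Iterator


def iter_words(file_path: str, chunk_size: int = 4096) -> Iterator[str]:
    """Yield whitespace-separated words, reading chunk_size characters at a time.

    Hypothetical helper: the name and the chunk_size parameter are
    assumptions made for this sketch.
    """
    leftover = ""  # Partial word carried over from the previous chunk.
    with open(file_path, "r") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # End of file.
            buffer = leftover + chunk
            words = buffer.split()
            # If the buffer does not end in whitespace, its last word may
            # continue in the next chunk, so hold it back for now.
            if words and not buffer[-1].isspace():
                leftover = words.pop()
            else:
                leftover = ""
            yield from words
    # Emit the final word if the file did not end in whitespace.
    if leftover:
        yield leftover
```

Because this is a generator, you can also pull words only when you need them, which is close to how ifstream is used: after words = iter_words("./foo.txt"), each call to next(words) reads just enough of the file to return the next word, and words.close() closes the file early.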

Redowan Delowar