In my situation, I have a main Python script that creates an instance of a class (FileIterator) whose job is to iterate through a large data file line by line.
class FileIterator:
    def read_data(self, input_data):
        with open(input_data, 'r') as infile:  # renamed from "input" to avoid shadowing the builtin
            for line in infile:
                <perform operation>
What I am trying to do is replace "perform operation" with a return statement (or some equivalent) that hands the line back to the main script, so that I can operate on each line outside of FileIterator.
main_process.py
from FileIterator import FileIterator
from Operations import Operations

def perform_operations():
    iterator = FileIterator()
    operator = Operations()
    line = iterator.read_data('largedata.txt')
    operator.do_something(line)
Is there a suitable replacement for read_data() that still lets me read line by line without loading the entire file into memory, AND either saves each line into the object attribute self.line or returns it to the calling script?
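For context, one common pattern that seems to fit this (a sketch, not necessarily the only approach) is to make read_data a generator: replacing the operation with a yield statement suspends the function at each line and hands that line to the caller, while the file is still read lazily, one line at a time:

    class FileIterator:
        def read_data(self, input_path):
            # "yield" turns this method into a generator: each iteration
            # produces one line; the whole file is never held in memory.
            with open(input_path, 'r') as infile:
                for line in infile:
                    yield line

The caller would then loop over read_data() (e.g. `for line in iterator.read_data('largedata.txt'):`) instead of expecting a single return value.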
Please let me know if more details about the design are necessary to reach a solution.
EDIT: What I'm looking for is to limit FileIterator's responsibility to reading large files. The script that manages FileIterator should be responsible for taking each line and feeding it to the class Operations (kept to one class here for simplicity; I will have multiple classes that need to act on each line).
Think of this design as an assembly line: FileIterator's job is to chop up the file, and other workers take the results from FileIterator and perform further tasks on them.
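That assembly-line split could be sketched like this (the worker class Upper and the module-level perform_operations function are hypothetical stand-ins for Operations and the managing script; FileIterator here implements __iter__ so the main script can loop over it directly):

    class FileIterator:
        """Sole responsibility: chop the file into lines, lazily."""
        def __init__(self, path):
            self.path = path

        def __iter__(self):
            with open(self.path, 'r') as infile:
                yield from infile  # one line at a time, never the whole file

    class Upper:
        """Hypothetical worker standing in for Operations."""
        def do_something(self, line):
            return line.upper()

    def perform_operations(path, workers):
        results = []
        for line in FileIterator(path):   # the manager feeds the assembly line
            for worker in workers:
                results.append(worker.do_something(line))
        return results

Because the managing script owns the loop, adding another worker class is just another entry in the workers list; FileIterator never needs to know about them.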
EDIT 2: Changing title because I feel it was misleading and people are upvoting the answer that was basically just a copy paste of my question.