
My goal is to stop users from constantly calling seek(0) to reset reading of a text file.

So instead I've tried to create a MyReader class that is a collections.abc.Iterator, with a .reset() method that replaces seek(0). Between resets it continues from where it last yielded, by keeping a self.iterable object.

import collections.abc


class MyReader(collections.abc.Iterator):
    def __init__(self, filename):
        self.filename = filename
        # The generator that actually walks the file.
        self.iterable = self.__iterate__()

    def __iterate__(self):
        with open(self.filename) as fin:
            for line in fin:
                yield line.strip()

    def __iter__(self):
        for line in self.iterable:
            yield line

    def __next__(self):
        return next(self.iterable)

    def reset(self):
        # Start over with a fresh generator instead of calling seek(0).
        self.iterable = self.__iterate__()

The usage would be something like:

$ cat english.txt
abc
def
ghi
jkl

$ python

>>> data = MyReader('english.txt')
>>> print(next(data))
abc
>>> print(next(data))
def
>>> data.reset()
>>> print(next(data))
abc

My question: does this already exist somewhere in the Python-verse? In particular, if there's already a native object that does something like this, I would like to avoid reinventing the wheel =)

If it doesn't exist, does the object look a little unpythonic? It claims to be an iterator, but the true iterator is actually self.iterable, and the other methods just wrap around it to do the "resets".

alvas
  • Did you even test your snippet? It is totally not executable. Missing `self` in `__init__`, `filename` in `__iterate__` is undefined. – Sraw Feb 12 '18 at 10:15
  • Sorry I was abstracting from a bigger class. Now it should work. – alvas Feb 12 '18 at 10:17
  • File-like objects already do something like this: you get the same output if you do `data = open('english.txt')` and `data.seek(0)` instead of `data.reset()`. What does using `data.reset()` instead of `data.seek(0)` buy you? – Stop harming Monica Feb 12 '18 at 10:37
  • Ah, so instead of `Iterator` I could have inherited from `File`? Hmmm, the object does many more things though =) – alvas Feb 12 '18 at 10:47
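
The point made in the comments about plain file objects can be checked with a short sketch. The helper name read_with_seek and the throwaway copy of english.txt are assumptions made only so the example runs on its own:

```python
import os
import tempfile


def read_with_seek(path):
    """Read two lines, rewind with seek(0), then read the first line again."""
    with open(path) as data:
        first = next(data).strip()
        second = next(data).strip()
        data.seek(0)  # rewind to the start of the file
        again = next(data).strip()
    return first, second, again


# Build a throwaway copy of english.txt so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "english.txt")
    with open(path, "w") as fout:
        fout.write("abc\ndef\nghi\njkl\n")
    print(read_with_seek(path))  # ('abc', 'def', 'abc')
```

A file object is already its own iterator, so next() works on it directly; the only thing MyReader adds here is the name reset() and the stripping.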

2 Answers

I think it depends on your actual use case. If you just want to get rid of file.seek(0), it can be this simple:

class MyReader:
    def __init__(self, filename, mode="r"):
        self.file = open(filename, mode)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()

    def __iter__(self):
        # Rewind at the start of every iteration, so callers never need seek(0).
        self.file.seek(0)
        for line in self.file:
            yield line.strip()

    def close(self):
        self.file.close()

You can even use it like a normal context manager:

with MyReader("a.txt") as a:
    for line in a:
        print(line)
    for line in a:
        print(line)

output:

sdfas
asdf
asd
fas
df
asd
f
sdfas
asdf
asd
fas
df
asd
f
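
One thing to be aware of: this class is an iterable rather than an iterator, so calling next() on the instance itself raises TypeError. If you still want next()-style access as in the question, a rough sketch would be to ask for a fresh iterator whenever you want to start over. The class below is a condensed copy so the snippet runs on its own, and first_lines is a hypothetical helper name:

```python
class MyReader:
    """Condensed copy of the class above, so this sketch is self-contained."""

    def __init__(self, filename, mode="r"):
        self.file = open(filename, mode)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.file.close()

    def __iter__(self):
        self.file.seek(0)
        for line in self.file:
            yield line.strip()


def first_lines(path):
    """Read two lines, restart, and read the first line again."""
    with MyReader(path) as reader:
        lines = iter(reader)   # a generator; seek(0) runs on the first next()
        first = next(lines)
        second = next(lines)
        lines = iter(reader)   # a new generator rewinds the file again
        again = next(lines)
    return first, second, again
```

With the english.txt from the question, first_lines would return the first line, the second line, and then the first line again.
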
Sraw

I have a couple of criticisms of your MyReader class. I was going to post an alternative that's a context manager but Sraw beat me to it. ;)

You shouldn't use names that start and end with double underscores, like __iterate__. Such names are essentially reserved for the language implementors, and if an official __iterate__ magic method is ever added to the language, your code could break. If you want a private method, you could name it _iterate.

There is a little problem with that __iterate__ method: its with block is only exited when the file has been completely read for the current self.iterable, or when the abandoned generator is finalized. So if the MyReader instance gets reset, the old open file is only closed once the garbage collector gets around to the old generator. CPython's reference counting normally does that right away, but other implementations may leave the file sitting around, consuming a file descriptor, for a while. Sure, it'll get closed eventually, but relying on that is messy IMHO.
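
One way to avoid that, sketched here rather than offered as the definitive fix, is to close the old generator explicitly before replacing it. This version also uses the _iterate name suggested above and drops the extra __iter__; the close() method is a hypothetical addition:

```python
from collections.abc import Iterator


class MyReader(Iterator):
    def __init__(self, filename):
        self.filename = filename
        self.iterable = self._iterate()

    def _iterate(self):
        # Single leading underscore: private by convention, not a dunder name.
        with open(self.filename) as fin:
            for line in fin:
                yield line.strip()

    def __next__(self):
        return next(self.iterable)

    def reset(self):
        # close() raises GeneratorExit at the paused yield, which unwinds
        # the with block and closes the old file before a new one is opened.
        self.iterable.close()
        self.iterable = self._iterate()

    def close(self):
        # Hypothetical helper: release the file without starting over.
        self.iterable.close()
```

Calling close() on a generator that has not started yet is harmless, since the file is only opened on the first next().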

Also, I'm not totally happy with the yield line.strip(). Sure, it's convenient most of the time when you're reading a text file, but in some cases the caller may want to look at the leading or trailing whitespace, and you've taken that option away from them.
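
If stripping should stay optional, one possible shape is a flag on the generator. This is only a sketch: the read_lines helper and its strip parameter are hypothetical names, not part of the original class:

```python
def read_lines(filename, strip=False):
    """Yield lines without their newline; strip other whitespace only on request."""
    with open(filename) as fin:
        for line in fin:
            line = line.rstrip("\n")  # always drop the line terminator
            yield line.strip() if strip else line
```

By default the caller still sees leading and trailing spaces, and passing strip=True restores the behaviour of the original code.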

BTW, that __iter__ method is redundant: the Iterator base class already supplies an __iter__ that returns self, so your class still does what it's supposed to do if you eliminate that method.
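
To illustrate the inherited __iter__, here is a small sketch with a hypothetical Countdown class that defines only __next__:

```python
from collections.abc import Iterator


class Countdown(Iterator):
    """Toy iterator that defines only __next__; __iter__ comes from the ABC."""

    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


counter = Countdown(3)
print(iter(counter) is counter)  # True
print(list(counter))             # [3, 2, 1]
```

The same applies to MyReader: with __next__ in place, for loops and list() work without a hand-written __iter__.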

PM 2Ring