
I'm storing a list of Task objects in a .dat file by writing out its repr. I'd like to read this file back and reconstruct the Task objects.

Here's the class that I'm using right now:

class Task:

    def __init__(self, name, timespent):
        self.name = name
        self.timespent = timespent

    def __repr__(self):
        return repr('Task("%s",%s)'%(self.name, self.timespent))

Here's the reading from the file:

task_list = []
with open("task_list2.dat", "r") as file:
    task_list = eval(file.readline())

Here's the writing to the file:

with open("task_list2.dat", "w") as outFile:
    print(repr(task_list), file = outFile)

And here are the contents of the file: ['Task("class",20)'], where "class" is the name of the task.

I understand that the problem has to do with the single quotes around 'Task("class",20)' but I have no clue how to get rid of them. The error message I get says something along the lines of: "str object has no attribute 'name'"

How can I remove those quotes so that I can reconstruct the classes the next time that I read the file?

Ivan Kelber
  • Remove the initial quotes in your `repr` function. – SevenBits Nov 07 '13 at 00:10
  • I'm curious: instead of evaling what is in the file (which is very dangerous if someone were to replace your file contents with malicious code), why not store the task to be executed/arguments as JSON and parse that? – Mike McMahon Nov 07 '13 at 00:10
  • @MikeMcMahon: The problem with JSON is that it isn't a complete serializer for anything but float, bool, unicode, NoneType, and lists and dicts made up recursively of the above. In particular, if you want to serialize objects of your own classes, you have to write a serializer on top of it. Using pickle, YAML, jsonpickle, or some other format that handles custom types means you don't need to do that. – abarnert Nov 07 '13 at 00:18
  • @JoranBeasley: He's not asking how to remove the double quotes around `class`, he's asking how to remove the single quotes around the whole thing. – abarnert Nov 07 '13 at 00:19
  • @abarnert thanks ... I was confused about that :P – Joran Beasley Nov 07 '13 at 00:22
  • @abarnert absolutely, but the point here is that calling `eval(filecontents)` is very unsafe and just about anything would be better/safer. – Mike McMahon Nov 07 '13 at 00:32
  • @MikeMcMahon: Yes, a "Why you don't want to use `repr`/`eval`, and what options you have instead" would be a great FAQ article somewhere. – abarnert Nov 07 '13 at 01:55
  • Since the same thing just came up again elsewhere, I wrote a [blog post](http://stupidpythonideas.blogspot.com/2013/11/repr-eval-bad-idea.html) about it. But I may have given short shrift to some of the other alternatives between JSON and pickle. – abarnert Nov 07 '13 at 23:28

1 Answer


You really, really, really don't want to try to use repr and eval as a serialization format.

If you just used, say, pickle, you wouldn't have this problem at all:

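import pickle

# Write the list of Task objects in binary mode.
with open("task_list2.dat", "wb") as outFile:
    pickle.dump(task_list, outFile)

# Read it back; pickle rebuilds the Task instances.
with open("task_list2.dat", "rb") as file:
    task_list = pickle.load(file)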

Much simpler, yes?


But if you want to know how to solve the immediate problem instead of making it irrelevant: You've got multiple problems in your __repr__ method, all of which need to be fixed if you want it to be round-trippable.

  • You generate a string representation… and then call repr on it. You want to return the string representation, not a string representation of the string representation. Just leave out the repr.
  • You should always delegate to the repr of sub-objects, not the str. If you're using %-formatting, that means using %r rather than %s.
  • Don't try to add quotes around things. That may happen to work if the object itself has no quotes, backslashes, invisible characters, etc. in it, but why rely on that? If you think you need quotes, it's pretty much always a sign that you broke the previous rule, and you should fix that instead.

Here's how you can write a round-trippable repr for this class:

def __repr__(self):
    return 'Task(%r, %r)' % (self.name, self.timespent)

And you can verify that it does what you want:

>>> t = Task('task name', 23.4)
>>> t
Task('task name', 23.4)
>>> eval(repr(t))
Task('task name', 23.4)
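
To see why the rules above matter, here's a rough sketch that compares a hand-quoted `__repr__` with the `%r` version on a name containing quotes and a backslash. `QuotedTask` and the `tricky` value are made up for illustration; they aren't from your code.

```python
class QuotedTask:
    """Hypothetical variant that adds its own quotes and uses %s."""

    def __init__(self, name, timespent):
        self.name = name
        self.timespent = timespent

    def __repr__(self):
        return 'QuotedTask("%s",%s)' % (self.name, self.timespent)


class Task:
    """The version that delegates to the repr of each attribute."""

    def __init__(self, name, timespent):
        self.name = name
        self.timespent = timespent

    def __repr__(self):
        return 'Task(%r, %r)' % (self.name, self.timespent)


tricky = 'say "hi"\\n'  # contains double quotes and a literal backslash

# %r escapes whatever needs escaping, so the name survives the round trip.
restored = eval(repr(Task(tricky, 5)))
print(restored.name == tricky)

# The hand-quoted form produces invalid source for the same name.
try:
    eval(repr(QuotedTask(tricky, 5)))
    hand_quoted_ok = True
except SyntaxError:
    hand_quoted_ok = False
print(hand_quoted_ok)
```

The hand-quoted version only works as long as the name never contains anything that needs escaping, which is exactly the kind of assumption that breaks later.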

Of course in your particular example, just fixing the first problem (removing the spurious call to repr) would have gotten rid of the single quotes and made that particular example work. You could also hack around that on the read side by calling eval twice. Or, for this particular example, even by calling eval(s[1:-1]) or eval(s.strip("'")). But any "fix" like that is just going to make it harder to debug the general problems you're going to run into once you have, e.g., a name that isn't as simple as a single all-ASCII-letter word.
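
If it helps to see where the single quotes come from, here's a small sketch of the original pattern. `BrokenTask` is a hypothetical copy of the question's class, renamed so it's obvious this is the version with the extra `repr`; the read side uses a string instead of a file to keep it self-contained.

```python
class BrokenTask:
    """Hypothetical copy of the question's class, renamed for the demo."""

    def __init__(self, name, timespent):
        self.name = name
        self.timespent = timespent

    def __repr__(self):
        # Same pattern as the question: repr() applied to the formatted string.
        return repr('BrokenTask("%s",%s)' % (self.name, self.timespent))


# What would be written to the file.
text = repr([BrokenTask("class", 20)])
print(text)

# One eval only undoes the outer repr, leaving a list of strings.
once = eval(text)
print(type(once[0]).__name__)

# A second eval on the element finally builds the object.
twice = eval(once[0])
print(type(twice).__name__, twice.name, twice.timespent)
```

This is why the first `eval` hands you a `str` with no `name` attribute, and why the "eval twice" hack appears to work for simple names.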

abarnert