Using Memoize in Python (2.7)

Question

I do not want to extract files to disk; I want to keep the final .txt in memory and parse it there. I can't find anything about doing this with memoize in Python 2.7.

.zip -> .gz -> .txt (data needs to be parsed)

My second choice is to unzip the archive and parse the .txt file data. Any thoughts?

I think you can unzip part of the zip; related question here: https://stackoverflow.com/questions/339053/how-do-you-unzip-very-large-files-in-python – scriptboy, Feb 09 '18 at 06:35
@scriptboy Unzipping is OK. Basically I want to avoid extracting to disk, keep the data in memory, and parse the text file. – Vikas Periyadath, Feb 09 '18 at 06:37
What about writing to io.BytesIO? https://docs.python.org/2/library/io.html#buffered-streams – Haochen Wu, Feb 09 '18 at 06:41
@HaochenWu I just want to know how we can solve it using memoize. In any case I will go through your link in depth, because I don't know much about buffering. Thanks – Vikas Periyadath, Feb 09 '18 at 06:46
Do you have more context? Memoize is mostly used to store the return value of a function for some fixed args. You will still need something to hold the return value, for which I suggest io.BytesIO here. – Haochen Wu, Feb 09 '18 at 06:50

Haochen Wu · Answer 1 · 2018-02-09T06:59:20.110


You can unzip the file and write it to an io.BytesIO object, which is essentially an in-memory file.

https://docs.python.org/2/library/io.html#buffered-streams

You can then use any function that works on a regular file, such as read, seek, etc.

This way you get a virtual file that works for any format. If you are certain that text is the only thing you are going to handle, the io module also provides pure text streams such as io.StringIO.
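As a rough sketch of the idea for the .zip -> .gz -> .txt layout in the question, something like the following should work. The function name `iter_text_lines`, the UTF-8 default, and the assumption that every inner member ends in `.gz` are mine, not from the question, so adjust them to the real archive.

```python
import gzip
import io
import zipfile


def iter_text_lines(zip_source, encoding="utf-8"):
    """Yield decoded text lines from each .gz member of a zip, without writing to disk.

    zip_source may be a path or a file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            # Assumption: the inner compressed text files end in .gz.
            if not name.endswith(".gz"):
                continue
            # archive.read() returns the member's bytes; BytesIO wraps them
            # so they can be used like an opened file.
            gz_buffer = io.BytesIO(archive.read(name))
            gz_file = gzip.GzipFile(fileobj=gz_buffer)
            try:
                for raw_line in gz_file:
                    # Lines come back as bytes; decode and drop the line ending.
                    yield raw_line.decode(encoding).rstrip("\r\n")
            finally:
                gz_file.close()
```

You would then loop over it, for example `for line in iter_text_lines("archive.zip")` with your own path, and parse each line as it arrives. Note that each inner .gz member is held in memory in full while it is being read, so this suits archives whose members fit comfortably in RAM.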

edited Feb 09 '18 at 06:59

answered Feb 09 '18 at 06:46

Haochen Wu


So when I extract the file, where is it kept? On disk? – Vikas Periyadath Feb 09 '18 at 06:47
No, it's an object in memory, but you can treat it as an opened file in the sense that you can use file functions on it. – Haochen Wu Feb 09 '18 at 06:49
Basically you can use what @scriptboy posted, but instead of decompressing to disk, decompress to this in-memory object. If you have the code that decompresses the zip, I can show you how to use this instead of a regular file. – Haochen Wu Feb 09 '18 at 06:54
So basically it will be a chunk of RAM, right? And it will be deleted after the task completes? – Vikas Periyadath Feb 09 '18 at 06:59
Thanks for your effort, I will try this one. Upvoted. – Vikas Periyadath Feb 09 '18 at 07:05
