0

I have a program that returns rows of allowed arrays (as follows) which are stored in a file (say output.txt), using python program.py>output.txt on the windows 7 command line:

0 , 0 , 1 , 1 , 6
0 , 0 , 1 , 2 , 1
0 , 0 , 1 , 4 , 1
0 , 0 , 1 , 6 , 1
0 , 0 , 2 , 0 , 0

I later use output.txt as input for another program, which reads it row by row into an array. The output.txt file is about 1.5 GB. I can use 7-zip to reduce it to 33 MB.

I risk running out of memory for the larger output data sizes. I'm aware that we can read zipped files in Python so is it possible to get zipped outputs as well?

My program has multiple nested loops, so I can make a small zip file after a cycle in the innermost loop and append it to another which gradually becomes larger as the loop traverses ahead.

DS R
  • 235
  • 2
  • 13
  • Are you simply asking how to compress the result before writing it to a file, or how to compress part of the result during your computations before running out of memory? – Reti43 Oct 12 '17 at 10:25
  • "Compress part of the result during computations before running out of memory" sums it up perfectly. I'm afraid of running out of hard disk space if I let the uncompressed output writing continue. – DS R Oct 12 '17 at 10:29
  • 2
    You should not burden your application with such things. The UNIX philosophy would be something along the lines of `python program.py | gzip > output.gz`. – deceze Oct 12 '17 at 10:30
  • 1
    It wasn't clear because I'd expect someone to run out of memory before running out of hdd space. And "is it possible to get zipped outputs" seems to ask how to simply compress an output, not compress chunks before running out of memory. Regardless, to compress your output `import gzip` and `gzip.compress(bytestream)`. – Reti43 Oct 12 '17 at 10:33
  • Thanks I'll give this a try. – DS R Oct 12 '17 at 10:38
  • 1
    [How to append to a gzip file](https://stackoverflow.com/questions/18097107/python-gzip-appending-to-file-on-the-fly) might also be of interest. If you show us exactly how your loops are structured and how you store your data in a python object, we could give explicits answer with code. – Reti43 Oct 12 '17 at 10:44
  • My current code: https://drive.google.com/file/d/0B4l-nvGOApEoZlNhczIwQ2VNUDg/view?usp=sharing Generates a sequence of 10 element arrays based on a certain rule and I run it on the WIndows command line – DS R Oct 12 '17 at 10:49
  • 1
    Zipping strings in Python is easy. Still, I tend to agree with deceze. Let the shell handle the compression. Doing compression / decompression within the application makes sense if the application's task requires manipulation of data stored in compressed files, eg an epub file editor. Or if you're processing image / media files, which generally used compressed file formats, but even then you should be using a library that handles the compression for you. – PM 2Ring Oct 12 '17 at 10:52

0 Answers0