
I have a large zip file, nearly 8 GB, which consists of many (~5.6 million) small files (~1–20 KB each). I tried extracting it with the Linux `unzip` command and it's too slow. Other answers suggested Linux packages that could do it faster, but I can't install any of those since I don't have sudo access on the machine.

I was wondering if there is a way to use multiple cores in Python to do this faster? I am a bit new to this, and I checked out another question on the same topic (python-parallel-processing-to-unzip-files), but its author mentioned it did not improve the speed.
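For context, one way such a parallel attempt is typically written is to split the archive's member list across worker processes, each extracting its own batch with the standard-library `zipfile` module. A minimal sketch (the helper names here are mine, not from the linked question); note that extraction is often I/O-bound, which is why this may yield little speedup:

```python
import zipfile
from concurrent.futures import ProcessPoolExecutor

def extract_batch(zip_path, names, dest):
    # Each worker opens its own handle: ZipFile objects cannot be
    # shared safely across processes.
    with zipfile.ZipFile(zip_path) as zf:
        for name in names:
            zf.extract(name, dest)

def parallel_unzip(zip_path, dest, workers=4):
    # Read the member list once, split it round-robin into one batch
    # per worker, and extract the batches concurrently.
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    batches = [names[i::workers] for i in range(workers)]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(extract_batch, zip_path, b, dest)
                   for b in batches]
        for future in futures:
            future.result()  # re-raise any worker exception
```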

Anyone have suggestions on what I should do?

1 Answer


Rolling your own unzip code is unlikely to improve performance compared to the libraries and programs typically used for that. If there are other programs that you think would unzip the file faster than the standard `unzip` command, you can install them from source without sudo: just build them, then skip the actual "install" step. The program's executable will be available locally in a build directory.

Will Da Silva