I have a large gzip file (11 GB) and I want to print a specific line from it as quickly as possible with Python. I tried linecache.getline(), but because that function opens the file itself, there is no way to have it open the file with gzip.
Viewed 740 times
-1

Sergi Aguiló
Let me restate your post in the form of a question, and you tell me if it matches what you want: "I have a very large text file, compressed with gzip. I expect there to be a line in the uncompressed file, and I need to verify that it's there as quickly as possible. I have tried [...], but this doesn't work because [...]. What am I doing wrong?" – Jordan Singer Feb 22 '19 at 13:56
1 Answer
0
linecache expects to get a text file. A file that has been compressed using gzip is not a text file. Doing what you want requires two steps: (1) decompress the file so that you have a text file, and (2) use linecache on the text file. You can do both of those things in Python, but only one after the other.
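The two steps above can be sketched roughly as follows. This is a minimal illustration, not a tuned solution: the function name get_line_via_linecache and the paths passed to it are hypothetical, and it assumes there is enough disk space for the decompressed copy.

```python
import gzip
import linecache
import shutil


def get_line_via_linecache(gz_path, txt_path, lineno):
    """Decompress gz_path to txt_path, then return line `lineno` (1-based).

    Returns an empty string if the line does not exist, which is what
    linecache.getline() does on any failure.
    """
    # Step 1: stream-decompress the gzip file to a plain text file on disk.
    with gzip.open(gz_path, "rb") as src, open(txt_path, "wb") as dst:
        shutil.copyfileobj(src, dst)

    # Step 2: drop any stale cache entry, then let linecache read the text file.
    linecache.checkcache(txt_path)
    return linecache.getline(txt_path, lineno)
```

Keep in mind that linecache reads every line of the file into memory on first access, so for an 11 GB file this approach may need a correspondingly large amount of RAM.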
I understand that you want to get at a specific line without having to decompress the entire file. But that is not how gzip compression works. There is unlikely to be anything in the compressed data that corresponds to the notion of a line of text.
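Since the data has to be decompressed sequentially anyway, one alternative worth considering is to skip linecache and read the gzip stream line by line, stopping at the line you need. The sketch below does that; the name get_line_streaming is hypothetical and the UTF-8 encoding is an assumption about the file. It still decompresses everything up to the requested line, so it does not give random access, but it avoids writing a decompressed copy to disk and holds only one line in memory at a time.

```python
import gzip
from itertools import islice


def get_line_streaming(gz_path, lineno, encoding="utf-8"):
    """Return line `lineno` (1-based) from a gzip text file, or None if absent."""
    if lineno < 1:
        raise ValueError("lineno must be >= 1")

    # "rt" makes gzip.open return a text-mode file object that yields lines.
    with gzip.open(gz_path, "rt", encoding=encoding) as f:
        # islice skips the first lineno - 1 lines and yields at most one line.
        return next(islice(f, lineno - 1, lineno), None)
```

For example, get_line_streaming("big.txt.gz", 1000000) would read and discard the first 999,999 lines and return the millionth, where "big.txt.gz" stands in for your own file.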

BoarGules