13

1) Extract from a large zip file

I want to extract files from a large zip file (30 GB+) on a Linux server. There is enough free disk space.

I've tried jar xf dataset.zip. However, there's an error that push button is full, and it failed to extract all of the files.

I tried unzip, but it reports that the zipfile is corrupt:

Archive:  dataset.zip 
warning [dataset.zip]:  35141564204 extra bytes at beginning or within zipfile
(attempting to process anyway)
error [dataset.zip]:  start of central directory not found;
zipfile corrupt.
 (please check that you have transferred or created the zipfile in the
appropriate BINARY mode and that you have compiled UnZip properly)

I also tried zip -FF dataset.zip --out data.zip, which fails with an "Entry too big" error:

zip error: Entry too big to split, read, or write (Poor compression resulted in unexpectedly large entry - try -fz)

Is there any way to efficiently extract files from a really large zip file?

2) Extract certain files from a large zip file

If I only want certain files from this large zip file, is there any way to extract only those? For example, data1.txt from dataset.zip? It seems that I can't use any zip or unzip command (I always get the "zipfile corrupt" error).

Thanks!

Irene W.
  • Do you have enough free disk space where the unzipped files are being placed? Are any of the files, once unzipped, large enough to exceed the maximum single file size for your file system? – Eric J. Jul 17 '15 at 17:47
  • There is enough free disk space. I don't need all of the files for now. Is there any way to extract only certain files from the zip file? – Irene W. Jul 17 '15 at 17:53

6 Answers

34

I've solved the problem. It turned out to be a corrupted zip file. I first repaired the archive with:

zip -FF filename1.zip --out filename2.zip -fz

then unzipped the repaired zipfile:

unzip filename2.zip

and have successfully extracted all the files!
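
For anyone who wants to script the repair-then-extract sequence above, here is a rough sketch. It is only an illustration: the function name `repair_and_extract` and its three arguments are made up for this example, and it assumes the Info-ZIP `zip` (3.x) and `unzip` tools are installed. It first checks the archive with `unzip -t` and only falls back to `zip -FF ... -fz` when that check fails. `-FF` rebuilds the archive by scanning the file for entries instead of trusting the central directory, and `-fz` forces Zip64 format, the extension that allows entries and archives larger than 4 GB.

```shell
#!/usr/bin/env bash
# Sketch only: test a zip archive, repair it if the test fails, then extract.
# The function name and argument order are hypothetical, chosen for this example.

repair_and_extract() {
    local src="$1"     # possibly damaged archive, e.g. filename1.zip
    local fixed="$2"   # where to write the repaired copy, e.g. filename2.zip
    local dest="$3"    # directory to extract into

    mkdir -p "$dest" || return 1

    # unzip -t reads every entry and checks it without writing any files.
    if unzip -tq "$src" >/dev/null 2>&1; then
        unzip -q -o "$src" -d "$dest"
        return
    fi

    # -FF scans the whole file for entries; -fz forces Zip64 for large entries.
    # The original archive is left untouched; the repaired copy goes to --out.
    zip -FF "$src" --out "$fixed" -fz || return 1
    unzip -q -o "$fixed" -d "$dest"
}
```

A call would look like `repair_and_extract filename1.zip filename2.zip ./extracted`. The repair step can ask questions on the terminal (for example about split archives), so it is probably safest to run it interactively, and it needs enough free space for a second copy of the archive.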

Many thanks to Fattaneh Talebi for the help!

Irene W.
7

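You can extract a specific file from a zip archive:

$ unzip -j "zipedfile.zip" "file.txt"

Here file.txt is the file you want to extract from zipedfile.zip. The -j option drops the directory path stored in the archive, so the file lands directly in the current directory.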
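
As a rough sketch of how selective extraction might look in practice (once the archive is readable, i.e. after any repair), the helpers below list the entry names, extract only the named entries, or stream a single entry to stdout. The function names `list_entries`, `extract_entries` and `stream_entry` are made up for this example; the `unzip` options used (`-Z1`, `-d`, `-p`, `-q`, `-o`) are from Info-ZIP `unzip`, which is assumed to be installed.

```shell
#!/usr/bin/env bash
# Sketch only: pull selected entries out of a zip without unpacking everything.
# All function names here are hypothetical helpers for illustration.

# Print the entry names stored in the archive, one per line (zipinfo mode).
list_entries() {
    unzip -Z1 "$1"
}

# Extract only the named entries into a directory, keeping their stored paths.
# Usage: extract_entries ARCHIVE DEST ENTRY...
extract_entries() {
    local archive="$1" dest="$2"
    shift 2
    # Entry names are treated as wildcard patterns by unzip, so quote them
    # at the call site to keep the shell from expanding them first.
    unzip -q -o "$archive" "$@" -d "$dest"
}

# Write one entry to stdout, e.g. to pipe a huge text file into head or grep.
stream_entry() {
    unzip -p "$1" "$2"
}
```

For example, `extract_entries dataset.zip ./out data1.txt` would write only `./out/data1.txt`, and `stream_entry dataset.zip data1.txt | head` would show the first lines without writing the file to disk at all.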

Fattaneh Talebi
  • Thanks for the answer. However, I still get this error: `Archive: msdata.zip warning [msdata.zip]: 35141564204 extra bytes at beginning or within zipfile (attempting to process anyway) error [msdata.zip]: start of central directory not found; zipfile corrupt. (please check that you have transferred or created the zipfile in the appropriate BINARY mode and that you have compiled UnZip properly)` I don't know if it's because of the file size. – Irene W. Jul 17 '15 at 18:28
  • You're welcome. First check whether your zip file is corrupt, and also run this command: "file yourzipfilename.zip" to see its type, then paste the output here. – Fattaneh Talebi Jul 17 '15 at 18:58
  • I looked at this URL: http://ubuntuforums.org/showthread.php?t=1517262 and there was a sentence that I think answers your question: "The computer doesn't know where the index of all the files start so thus it can't find the files inside the zip file." But you had better look at it yourself, because I'm not sure. – Fattaneh Talebi Jul 17 '15 at 19:04
  • `$ file msdata.zip` `msdata.zip: Zip archive data, at least v2.0 to extract` – Irene W. Jul 17 '15 at 19:05
  • So it's a zip file; I think your problem is corruption. Let's look at that URL. – Fattaneh Talebi Jul 17 '15 at 19:10
  • I tried to fix the zipfile with `zip -FF`, but it fails with: ` Central Directory found... zip warning: Entry too big:MicrosoftAcademicGraph/PaperAuthorAffiliations.txt zip error: Entry too big to split, read, or write (Poor compression resulted in unexpectedly large entry - try -fz)` – Irene W. Jul 17 '15 at 19:45
  • I also tried downloading the file again. That didn't work either. – Irene W. Jul 17 '15 at 19:47
  • It says to try -fz, so let's try it: zip -fz – Fattaneh Talebi Jul 17 '15 at 19:47
  • I'm not sure what -fz does, but it gives me this: `$ zip -fz msdata.zip` `Could not find: msdata.z01` ` Hit c (change path to where this split file is) ` `q (abort archive - quit)` `or ENTER (try reading this split again): ` – Irene W. Jul 17 '15 at 19:55
  • I tried `zip -FF msdata.zip --out outfile.zip -fz` and it worked!! Now I've successfully extracted all of the files! Thank you very much!!! – Irene W. Jul 17 '15 at 20:21
1

I had a similar problem, and it was solved by the unar command:

unar file.zip

Usman
0

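Try extracting one directory at a time, so you retain control and know where you left off. For example, with a tar archive: tar xv --wildcards -f siteRF.tar './Movies/*' (use t in place of x to list the matching entries without extracting them).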

jobeard
0

I tried all the steps mentioned above to unzip the file, but failed miserably.

My last resort was to copy my zip file (11.1 GB) onto a hard drive and unzip it using 7-Zip on Windows 8.

Worked like a charm :D

Vinu Joseph
-3

I solved it in the same way as Irene W. It was a corrupted zip. I first fixed the file with:

zip -FF original_corrupted.zip --out fixed_file.zip -fz

then unzipped the fixed zip file:

unzip fixed_file.zip
  • How is this different from the accepted answer? – GStav Jun 02 '21 at 10:43
  • Hi @GStav, I wasn't trying to take someone else's credit for that answer; I only mentioned that I resolved the issue in a similar way. I don't have enough reputation to vote or comment, so I posted it this way. – Bharat Balothia Jun 04 '21 at 15:39