I have a zip file that contains a tar.gz file. I would like to access the contents of the tar.gz file without unzipping it.

I can list the files in the zip file, but of course when I try to untar one of them, tar says "Cannot open: No such file or directory", since the file does not exist on disk.

for file in $archiveFiles; do
    # echo ${file: -4}
    if [[ $file == README.* ]]; then
        echo "skipping readme, not relevant"
    elif [[ $file == *.tar.gz ]]; then
        echo "this is a tar.gz, must extract"
        tarArchiveFiles=`tar -tzf $file`
        for tarArchiveFile in $tarArchiveFiles; do
            echo $tarArchiveFile
        done
    fi
done

Is it possible to extract it "on the fly", without storing it temporarily? I have the impression that this is doable in Python.

laloune
@kvantour in the answers given, you never store the whole uncompressed tar file in memory or anywhere. It *streams* through the processes, so only a small chunk needs to be in memory at a time. – slim May 24 '19 at 15:31

2 Answers

You can't do it without unzipping (obviously), but I assume what you mean is, without unzipping to the filesystem.

unzip has -c and -p options, which both extract to stdout. -c also prints the name of each file as it is extracted; -p writes only the raw file data to stdout.

So:

unzip -p zipfile.zip path/within/zip.tar.gz | tar zxf - 

Or if you want to list the contents of the tarfile:

unzip -p zipfile.zip path/within/zip.tar.gz | tar ztf - 

If you don't know the path of the tar file within the zip file, you'd need to write something more sophisticated that consumes the output of unzip -c and recognises the filename lines in it. In that case it may well be better to write something in a "proper" language. Python's zipfile module (the ZipFile class) is very flexible, and most mainstream languages have something similar.
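For the case where the path inside the zip isn't known in advance, here is a minimal Python sketch using only the standard zipfile and tarfile modules. The function name list_nested_tar_members is made up for illustration, and the ".tar.gz" suffix check is an assumption about how the inner archives are named. Nothing is written to disk: the zip member is decompressed as it is read, and tarfile consumes it as a forward-only stream.

```python
import tarfile
import zipfile


def list_nested_tar_members(zip_path):
    """Return {tar.gz member name: [names inside that tarball]} for a zip.

    zip_path may be a filename or a seekable binary file object.
    (Hypothetical helper, not part of any library.)
    """
    result = {}
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if not name.endswith(".tar.gz"):
                continue  # skip README files and anything else
            # zf.open() gives a file-like object that inflates on read;
            # mode "r|gz" tells tarfile to treat it as a non-seekable stream.
            with zf.open(name) as member:
                with tarfile.open(fileobj=member, mode="r|gz") as tf:
                    result[name] = [info.name for info in tf]
    return result
```

Calling list_nested_tar_members("zipfile.zip") would give the same information as running unzip -p ... | tar ztf - once per inner tarball. To extract instead of list, tf.extractall(some_directory) could replace the list comprehension, though only for archives you trust.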

slim

You can pipe an individual member of a zip file to stdout with unzip's -p option.

In your code, change

tarArchiveFiles=`tar -tzf $file`

to

tarArchiveFiles=`unzip -p zipfile $file | tar -tzf -`

Replace "zipfile" with the name of the zip archive that you got $archiveFiles from.

pmqs