102

I have a tar.gz file with the following structure:

folder1/img.gif
folder2/img2.gif
folder3/img3.gif

I want to extract the image files without the folder hierarchy so the extracted result looks like:

/img.gif
/img2.gif
/img3.gif

I need to do this with a combination of Unix and PHP. Here is what I have so far; it extracts the files to the specified directory but keeps the folder hierarchy:

exec('gtar --keep-newer-files -xzf images.tgz -C /home/user/public_html/images/',$ret);
Gary van der Merwe
Ben Jackson
  • I assume you're wanting something other than, manually iterate over each directory, move the files up to your path, and delete the empty folders? I don't know there's a `--flatten` option hidden anywhere, but I could be mistaken. – Jared Farrish Jan 12 '13 at 17:21

5 Answers

173

You can use the --strip-components option of tar.

 --strip-components count
         (x mode only) Remove the specified number of leading path
         elements.  Pathnames with fewer elements will be silently skipped.
         Note that the pathname is edited after checking inclusion/exclusion
         patterns but before security checks.

I created a tar file with a structure similar to yours:

$ tar -tf tarfolder.tar
tarfolder/
tarfolder/file.a
tarfolder/file.b

$ ls -la file.*
ls: file.*: No such file or directory

Then extracted by doing:

$ tar -xf tarfolder.tar --strip-components 1
$ ls -la file.*
-rw-r--r--  1 ericgorr  wheel  0 Jan 12 12:33 file.a
-rw-r--r--  1 ericgorr  wheel  0 Jan 12 12:33 file.b
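
Applied to the layout in the question, the same option might look like the sketch below. It builds a throwaway archive in a temporary directory so it can be run safely. The `work` and `dest` names are placeholders rather than anything from the question, and the sketch assumes a tar that supports `--strip-components`.

```shell
set -e
work=$(mktemp -d)            # scratch area; a stand-in for your real paths
dest="$work/images"
mkdir -p "$dest" "$work/src/folder1" "$work/src/folder2" "$work/src/folder3"
: > "$work/src/folder1/img.gif"
: > "$work/src/folder2/img2.gif"
: > "$work/src/folder3/img3.gif"

# Build an archive with the same shape as the one in the question.
tar -czf "$work/images.tgz" -C "$work/src" folder1 folder2 folder3

# Drop the single leading directory from every member while extracting.
tar -xzf "$work/images.tgz" -C "$dest" --strip-components 1

ls "$dest"
```

Because every image sits exactly one folder deep, a count of 1 is enough here.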
ericg
  • 3
    Does strip-components have a maximum number you can use? And what happens if the .tar only contains a hierarchy of one folder but strip-components is 2? Additionally, does strip-components change the names of those image files or just remove the folders? – Ben Jackson Jan 12 '13 at 18:25
  • 2
    I would suggest playing with it and finding out for your specific situation whether it will work for you. – ericg Jan 12 '13 at 19:05
  • 4
    I tried using a higher number than the directories structure contained and it deleted the files too. So you must know the exact number of directories to strip. – Weston Ganger Sep 03 '16 at 19:04
  • 1
    This is excellent! Just one thing though, what if we do not know how many components are to be stripped? and we only want to get the files and no folders at all? – Dhiraj May 20 '19 at 09:27
  • Caution: If there is a folder with no subfolder (already flat), this will mean nothing is extracted --> so be careful when applying it to a mixture of folders with and without structure – user1725306 Nov 28 '21 at 10:22
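
The behaviour described in these comments can be reproduced with a small experiment. This is only a sketch with made-up file names (`flat.gif`, `mixed.tgz`): a member that has fewer path elements than the strip count is skipped during extraction, and nothing already on disk is removed.

```shell
set -e
work=$(mktemp -d)
mkdir -p "$work/src/folder1" "$work/out"
: > "$work/src/flat.gif"              # top-level member: one path element
: > "$work/src/folder1/img.gif"       # nested member: two path elements
tar -czf "$work/mixed.tgz" -C "$work/src" flat.gif folder1

# With a count of 1, flat.gif has nothing left after stripping and is skipped.
tar -xzf "$work/mixed.tgz" -C "$work/out" --strip-components 1

ls "$work/out"
```

Only the nested file should show up in the listing, which is why a mixed archive needs a different approach than a fixed strip count.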
33

This is almost possible with tar alone, using the --transform flag, except that, as far as I can tell, there is no way to delete the leftover directories.

This will flatten the entire archive:

tar xzf images.tgz --transform='s/.*\///'

The output will be

folder1/
folder2/
folder3/
img.gif
img2.gif
img3.gif

You will then need to delete the directories with another command, unfortunately.
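
One thing worth checking before flattening: if two folders contain a file with the same name, both members end up at the same path and the later one replaces the earlier. The sketch below is one possible pre-check that lists colliding base names. The sample archive and the `dupes` variable are invented for illustration.

```shell
set -e
work=$(mktemp -d)
mkdir -p "$work/src/folder1" "$work/src/folder2"
: > "$work/src/folder1/img.gif"
: > "$work/src/folder2/img.gif"       # same base name on purpose
: > "$work/src/folder2/img2.gif"
tar -czf "$work/images.tgz" -C "$work/src" folder1 folder2

# List members, drop directory entries (they end in "/"), keep only the base
# name, and print every base name that occurs more than once.
dupes=$(tar -tzf "$work/images.tgz" | grep -v '/$' | sed 's:.*/::' | sort | uniq -d)

echo "colliding names: ${dupes:-none}"
```

If the list is empty, flattening should not lose any files to overwriting.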

ford
  • 2
    On RHEL 6.2 the [accepted answer](http://stackoverflow.com/a/14295994/86263) doesn't work, but this answer does (even when _creating_ an archive). :) yay! – bitcycle Mar 20 '14 at 18:12
  • I had been hunting for this for a while now. Great job! What if I don't want any folder at all and just the files to be extracted? – Dhiraj May 20 '19 at 09:31
  • 1
    This is great. As of now (version 1.29) it doesn't even create the directories during extraction. – Gerald Schneider Mar 24 '20 at 08:29
  • tar 1.23 is not creating the dirs either. – beluchin Jan 29 '21 at 19:54
22

Check the tar version, e.g.:

$ tar --version

If the version is tar 1.14.90 or later, use --strip-components:

tar xvzf web.dirs.tar.gz -C /srv/www --strip-components 2

Otherwise, use --strip-path:

tar xvzf web.dirs.tar.gz -C /srv/www --strip-path 2
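
If parsing the version string is inconvenient in a script, one alternative is to try the newer option and fall back to the older one when tar rejects it. This is a rough sketch with a made-up archive layout. Note that any other extraction failure would also trigger the fallback, so it is not a precise version test.

```shell
set -e
work=$(mktemp -d)
mkdir -p "$work/src/web/dirs" "$work/www"
: > "$work/src/web/dirs/index.html"
tar -czf "$work/web.dirs.tar.gz" -C "$work/src" web

# Try the newer spelling first; fall back to the older one if tar rejects it.
if ! tar -xzf "$work/web.dirs.tar.gz" -C "$work/www" --strip-components 2 2>/dev/null; then
    tar -xzf "$work/web.dirs.tar.gz" -C "$work/www" --strip-path 2
fi

ls "$work/www"
```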
Minimul
3

Find img*.gif in any sub folder of mytar.tar.gz and extract to ./

tar -zxf mytar.tar.gz --absolute-names --no-anchored 'img*.gif' --transform='s:.*/::'

Find img*.gif in any of the 3 folders listed in this specific question in mytar.tar.gz and extract to ./

tar -zxf mytar.tar.gz --absolute-names --no-anchored 'img*.gif' --transform='s:^folder[1-3]/::'

Pancho
  • Thanks! fyi the `--absolute-names` option is not doing anything when extracting afaict. I also chose to use `--wildcards '*/filename-pattern'` rather than `--no-anchored 'filename-pattern'` to ensure I did not match foldernames in a deep hierarchy. Thanks for pointing me in this direction! – Barumpus Dec 12 '22 at 23:49
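
The `--wildcards` variant mentioned in the comment above could look roughly like this. It is a sketch that assumes GNU tar (both `--wildcards` and `--transform` are GNU options), and the sample files, including `notes.txt`, are invented to show that non-matching members are left alone.

```shell
set -e
work=$(mktemp -d)
mkdir -p "$work/src/folder1" "$work/src/folder2" "$work/out"
: > "$work/src/folder1/img.gif"
: > "$work/src/folder2/img2.gif"
: > "$work/src/folder2/notes.txt"     # does not match the pattern
tar -czf "$work/mytar.tar.gz" -C "$work/src" folder1 folder2

# The pattern is quoted so the shell passes it to tar unchanged.
# --wildcards makes tar treat it as a glob; --transform drops the directory part.
tar -xzf "$work/mytar.tar.gz" -C "$work/out" --wildcards '*/img*.gif' --transform='s:.*/::'

ls "$work/out"
```

Since the directory entries themselves do not match the pattern, they are never extracted, so there is nothing to clean up afterwards.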
2

Based on @ford's answer, this variant extracts into the my_dirname folder so that the leftover empty folders can be removed without affecting files that already exist elsewhere.

tar xzf images.tgz --transform='s/.*\///' -C my_dirname
find my_dirname -type d -empty -delete
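
Where `--transform` is not available (it is a GNU tar option), a different technique is to extract into a staging directory with the hierarchy intact and then move only the regular files. The following is a sketch of that idea, not the method used above. The `stage` directory and sample archive are placeholders.

```shell
set -e
work=$(mktemp -d)
mkdir -p "$work/src/folder1" "$work/src/folder2" "$work/my_dirname"
: > "$work/src/folder1/img.gif"
: > "$work/src/folder2/img2.gif"
tar -czf "$work/images.tgz" -C "$work/src" folder1 folder2

stage=$(mktemp -d)                        # extract here, hierarchy intact
tar -xzf "$work/images.tgz" -C "$stage"

# Move regular files only; the directories stay behind in the staging area.
find "$stage" -type f -exec mv {} "$work/my_dirname/" \;

rm -rf "$stage"                           # discard the leftover directories
ls "$work/my_dirname"
```

Because the cleanup only touches the staging directory, pre-existing folders in the target are never at risk.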
Weston Ganger