80

When I invoke add() on a tarfile object with a file path, the file is added to the tarball along with its directory hierarchy. In other words, if I untar the archive, the original directory hierarchy is reproduced.

Is there a way to add just the plain file, without directory info, so that untarring the resulting tarball produces a flat list of files?

Martin Thoma
theactiveactor

6 Answers

103

The arcname argument of the TarFile.add() method is a convenient way to control the name under which an entry is stored in the archive.

Example: you want to archive the directory repo/a.git/ to a tar.gz file, but you want the tree root in the archive to begin with a.git/ rather than repo/a.git/. You can do the following:

import tarfile

# Store the repo/a.git tree in the archive under the root name a.git
with tarfile.open("a.git.tar.gz", "w|gz") as archive:
    archive.add("repo/a.git", arcname="a.git")
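To check what arcname does, the sketch below builds a throwaway tree in a temporary directory, archives it the same way, and lists the stored member names. The helper name `archive_with_root` and the `HEAD` file with its contents are made up for illustration.

```python
import os
import tarfile
import tempfile


def archive_with_root(src_dir, tar_path, root_name):
    """Archive src_dir so that its tree is stored under root_name."""
    with tarfile.open(tar_path, "w:gz") as archive:
        archive.add(src_dir, arcname=root_name)
    # Reopen the archive and report the names it actually contains.
    with tarfile.open(tar_path, "r:gz") as archive:
        return sorted(archive.getnames())


# Hypothetical layout, created only for this demonstration.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "repo", "a.git")
os.makedirs(src)
with open(os.path.join(src, "HEAD"), "w") as f:
    f.write("ref: refs/heads/main\n")

names = archive_with_root(src, os.path.join(workdir, "a.git.tar.gz"), "a.git")
print(names)
```

The leading `repo/` component, and the temporary directory above it, should not appear in any member name.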
orodbhen
diabloneo
  • This is a better approach since the accepted answer will not work if you are trying to add directories. – Ganesh Hegde Jun 08 '16 at 09:59
  • 4
    `arcname ="a.git"` will create a folder `a.git` inside the archive. You can use `arcname =""` to archive files in the `repo/a.git` directory without creating a folder. – Comrade Che Jul 24 '19 at 10:56
  • @ComradeChe's suggestion gives the wrong result: the resulting tar then contains just a single file. Instead, provide a plain filename (without a path) as arcname for each file added to the tar. – Alexey Antonenko Oct 12 '22 at 15:08
61

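You can use TarFile.addfile(). Its first parameter is a TarInfo object, and the name stored in that object can differ from the name of the file you're adding. Build the TarInfo with gettarinfo() so that its size is filled in; a bare tarfile.TarInfo(name) has size 0 and produces an empty member.

This piece of code adds /path/to/filename.txt to the TAR file, but it will be extracted as myfilename.txt:

with open("/path/to/filename.txt", "rb") as f:
    tar.addfile(tar.gettarinfo(arcname="myfilename.txt", fileobj=f), f)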
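If the data does not come from a file on disk, a hand-built TarInfo can still be used, as long as its size is set explicitly. The following is a minimal sketch under that assumption; `add_bytes` is a hypothetical helper, and the archive is kept in memory only to make the example self-contained.

```python
import io
import tarfile


def add_bytes(tar, name, data):
    """Add in-memory bytes to an open TarFile under the given member name."""
    info = tarfile.TarInfo(name)
    info.size = len(data)  # size defaults to 0, which would store an empty member
    tar.addfile(info, io.BytesIO(data))


# Write a one-member archive into an in-memory buffer.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as tar:
    add_bytes(tar, "myfilename.txt", b"hello\n")

# Read it back to confirm the name and the content.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as tar:
    names = tar.getnames()
    content = tar.extractfile("myfilename.txt").read()
```

Leaving the size unset is a likely cause of archives whose members are present but empty.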
S.A.
Wim
  • 45
    It also works for `tar.add()`! To add a whole tree under a different name, just do: `tar.add('/path/to/dir/to/add/', arcname='newdirname')`. The tarfile will then contain a directory named 'newdirname', with all its contents untouched. – Armando Pérez Marqués Oct 31 '10 at 01:22
  • 21
    And if you want to save the files without all the directory structure, do `arcname='.'` – Giacomo Tagliabue Jun 20 '16 at 18:01
  • What is this file() function? How to import it? – Amith Chinthaka Jun 14 '17 at 12:06
  • `file()` is Python 2 only, `open()` is equivalent and works in both Python 2 and 3. I edited my answer to use `open` instead. – Wim Jun 15 '17 at 12:48
  • 3
    For some reason, on my machine this creates only a tar archive with empty files (the files are there, but empty). – Roland Pihlakas Jul 24 '18 at 13:23
  • 2
    Using `arcname='.'` gave me a `IsADirectoryError` when I tried to unzip and extract the content. Using the answer by @diabloneo below worked though. – rer Sep 24 '18 at 19:05
  • @RolandPihlakas You might have forgotten the second argument to `addfile`, like me. You'll need to do `tar.addfile( tar.gettarinfo( filename ), open( filename, 'rb' ) )` or `tar.add( filename )` like in the other answer. – mxmlnkn Apr 12 '20 at 18:20
8

Maybe you can use the "arcname" argument to TarFile.add(name, arcname). It takes an alternate name that the file will have inside the archive.
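As a rough sketch of how this gives a flat archive: pass each file's base name as arcname. The helper `tar_flat` and the sample files are hypothetical, and raising an error on duplicate base names is only one possible policy.

```python
import os
import tarfile
import tempfile


def tar_flat(paths, tar_path):
    """Add each file under its base name so the archive has no directories."""
    seen = set()
    with tarfile.open(tar_path, "w") as tar:
        for path in paths:
            name = os.path.basename(path)
            # Two files with the same base name would collide once flattened.
            if name in seen:
                raise ValueError(f"duplicate member name: {name}")
            seen.add(name)
            tar.add(path, arcname=name)


# Hypothetical files in nested directories, created only for this demo.
workdir = tempfile.mkdtemp()
paths = []
for rel in ("a/one.txt", "b/c/two.txt"):
    full = os.path.join(workdir, *rel.split("/"))
    os.makedirs(os.path.dirname(full))
    with open(full, "w") as f:
        f.write(rel)
    paths.append(full)

tar_path = os.path.join(workdir, "flat.tar")
tar_flat(paths, tar_path)
with tarfile.open(tar_path) as tar:
    flat_names = tar.getnames()
```

Extracting such an archive places every file directly in the target directory.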

Lauro Moura
3

Thanks to @diabloneo, here is a function that creates a tarball from selected items of a directory:

import os
import tarfile


def compress(output_file="archive.tar.gz", output_dir="", root_dir=".", items=None):
    """Create a gzipped tarball from selected items under root_dir.

    Parameters
    ----------
    output_file : str, default "archive.tar.gz"
        Name of the archive to create.
    output_dir : str, default ""
        Absolute path to the output directory.
    root_dir : str, default "."
        Absolute path to the input root directory.
    items : list of str, optional
        Directories/items to add, relative to root_dir.
    """
    old_cwd = os.getcwd()
    os.chdir(root_dir)
    try:
        with tarfile.open(os.path.join(output_dir, output_file), "w:gz") as tar:
            for item in items or []:
                tar.add(item, arcname=item)
    finally:
        # Restore the caller's working directory
        os.chdir(old_cwd)


>>> root_dir = "/abs/pth/to/dir/"
>>> compress(output_file="archive.tar.gz", output_dir=root_dir,
...          root_dir=root_dir, items=["logs", "output"])
muon
  • You should always guard an os.chdir with try/finally that goes back to the old working directory, as library code isn't expected to change the working directory. – schlamar Feb 23 '21 at 09:36
0

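Here is a code sample that tars the files in a folder without adding the folder itself:

    import os
    import tarfile

    # tar_path and folder are supplied by the caller
    with tarfile.open(tar_path, 'w') as tar:
        for filename in os.listdir(folder):
            fpath = os.path.join(folder, filename)
            # Store each entry under its bare name, without the folder prefix
            tar.add(fpath, arcname=filename)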
Alexey Antonenko
-3

If you want to add the directory name but not its contents inside a tarfile, you can do the following:

(1) Create an empty directory called empty.
(2) Call tf.add("empty", arcname=path_you_want_to_add).

That creates an empty directory with the name path_you_want_to_add.
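An alternative that may avoid creating a placeholder directory is the recursive parameter of TarFile.add(): with recursive=False, the directory entry is stored without its contents. This is a sketch only; the `data` directory and the file inside it are made up for the demonstration.

```python
import os
import tarfile
import tempfile

# Hypothetical non-empty directory, created only for this demo.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "data")
os.makedirs(src)
with open(os.path.join(src, "ignored.txt"), "w") as f:
    f.write("not archived")

tar_path = os.path.join(workdir, "dir_only.tar")
with tarfile.open(tar_path, "w") as tf:
    # recursive=False stores the directory entry but not its contents
    tf.add(src, arcname="path_you_want_to_add", recursive=False)

# Read the archive back to see which members were stored.
with tarfile.open(tar_path) as tf:
    members = tf.getmembers()
```

The archive should then hold a single directory member and none of the files below it.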

Allen M
  • The original post asked to include file(s) with no directory. Your answer answers a different question. Please modify your answer to answer the original post’s question. Or add this as a comment instead of an answer. – Allen M Feb 27 '21 at 04:51