I would like to filter out (skip) subdirectories while creating a tar(.gz) file with tarfile (Python 3.4).
Files on disk:
- /home/myuser/temp/test1/
- /home/myuser/temp/test1/home/foo.txt
- /home/myuser/temp/test1/thing/bar.jpg
- /home/myuser/temp/test1/lemon/juice.png
I tried to compress /home/myuser/temp/test1/ with tarfile.add().
I archive in two modes: with the full path and without it (short path). With the full path everything works, but with the short path directory exclusion fails, because tarfile.add() hands the filter method a TarInfo whose name is built from the arcname parameter, not from the name parameter:
archive.add(entry, arcname=os.path.basename(entry), filter=self.filter_general)
Example:
file: /home/myuser/temp/test1/thing/bar.jpg
-> arcname = test1/thing/bar.jpg
Since /home/myuser/temp/test1/thing is an element of exclude_dir_fullpath, the filter method should exclude this file, but it cannot, because the filter only sees test1/thing/bar.jpg.
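The behaviour can be reproduced with a small self-contained sketch. It builds a throwaway directory tree (the temporary paths and the helper name names_seen_by_filter are made up for illustration, they are not part of my real code) and records what the filter actually receives:

```python
import os
import tarfile
import tempfile


def names_seen_by_filter(src_dir, arcname):
    """Return the TarInfo.name values that tarfile.add() hands to the filter."""
    seen = []

    def record(tarinfo):
        # Record the name and keep the member in the archive.
        seen.append(tarinfo.name)
        return tarinfo

    with tempfile.TemporaryDirectory() as out_dir:
        out_path = os.path.join(out_dir, "demo.tar.gz")
        with tarfile.open(out_path, "w:gz") as archive:
            archive.add(src_dir, arcname=arcname, filter=record)
    return seen


# Hypothetical stand-in for /home/myuser/temp/test1
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "test1")
    os.makedirs(os.path.join(src, "thing"))
    with open(os.path.join(src, "thing", "bar.jpg"), "w") as fh:
        fh.write("x")
    seen = names_seen_by_filter(src, "test1")
    print(seen)
```

Every recorded name starts with test1, and none of them contains the on-disk prefix, so a comparison against full paths can never match.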
How can I access the name parameter of tarfile.add() inside the filter method?
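import os
import tarfile

def filter_general(item):
    # item is a TarInfo; item.name is derived from arcname, not from the on-disk path
    exclude_dir_fullpath = ['/home/myuser/temp/test1/thing', '/home/myuser/temp/test1/lemon']
    if any(dirname in item.name for dirname in exclude_dir_fullpath):
        print("Exclude fullpath dir matched at: %s" % item.name)  # DEBUG
        return None
    return item

def compress_tar():
    filepath = '/tmp/test.tar.gz'
    include_dir = '/home/myuser/temp/test1/'
    # normpath drops the trailing slash; otherwise basename() returns ''
    arcname = os.path.basename(os.path.normpath(include_dir))
    # the with-block closes the archive so the gzip stream is flushed
    with tarfile.open(name=filepath, mode="w:gz") as archive:
        archive.add(include_dir, arcname=arcname, filter=filter_general)

compress_tar()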
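One possible direction, offered as a sketch rather than a confirmed solution: if the source root and the arcname root are both known when the filter is built, a closure can translate each archive name back to its on-disk path before comparing. The names make_exclude_filter and compress are hypothetical, and the sketch assumes that every member name starts with the arcname root.

```python
import os
import tarfile


def make_exclude_filter(source_root, arc_root, exclude_dirs):
    """Build a tarfile filter that compares on-disk paths, not archive names."""
    source_root = os.path.normpath(source_root)
    excluded = [os.path.normpath(d) for d in exclude_dirs]

    def _filter(tarinfo):
        # Swap the arcname prefix for the real source root.
        rel = os.path.relpath(tarinfo.name, arc_root)
        full = os.path.normpath(os.path.join(source_root, rel))
        for d in excluded:
            # Match the directory itself or anything below it.
            if full == d or full.startswith(d + os.sep):
                return None
        return tarinfo

    return _filter


def compress(include_dir, out_path, exclude_dirs):
    src = os.path.normpath(include_dir)  # also drops a trailing slash
    arc_root = os.path.basename(src)
    with tarfile.open(out_path, "w:gz") as archive:
        archive.add(src, arcname=arc_root,
                    filter=make_exclude_filter(src, arc_root, exclude_dirs))
```

Because tarfile.add() stops recursing into a directory once the filter returns None for it, excluding the directory entry is enough to skip everything beneath it.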