I have the files below in a directory:

file001
file002
.
.
file009

I need to compress them into one archive and remove the original source files (file001 .. file009) to free up some disk space.

This is what I did. First, I archived all the files into one using the command below:

tar -cvf file00.tar file00*

Then I compressed the archive using the command below:

xz file00.tar

Previously, I had archived and compressed in a single command:

tar -cJvf file00.tar.xz file00*

xz has done a fine job, compressing a 10GB file to less than 400MB, but I have two problems with these methods:

  1. The old/source files are not removed
  2. xz takes a very long time

My question is: is there a way to archive and compress multiple files into one with a single command that also removes the source files? And is there another compression tool that is as efficient as xz but works faster?

I've seen on some other sites that I can use multiple cores/threads to speed up xz, but I haven't tried it myself.

Thanks in advance.

osgx
Hasan Rumman
  • You may try [`pixz`](https://github.com/vasi/pixz), which is a parallel variant of the `xz` tool (there is also `pxz`, another parallel variant of xz): `tar -Ipixz -cf foo.tpxz foo/` (tar's `-I compressor` option selects a non-standard compressor). – osgx Mar 29 '17 at 00:45
  • You could try zstdmt. For 4 cores at high compression, it looks something like this: `tar -cvf - file00* | zstdmt -19 -T4 > file00.tar.zst` – Cyan Apr 09 '17 at 01:04

1 Answer

You may try pixz, which is a parallel variant of the xz tool (pxz is another parallel variant of xz):

tar -Ipixz -cf foo.tpxz foo/

tar's -I compressor option selects a non-standard compressor. It should come before the f option, because f takes the next word (the archive name) as its argument.
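Since the question also asks about using several cores, the same -I option can carry arguments for plain xz. The following is only a sketch under assumptions: it needs GNU tar 1.27 or later (which accepts a full command line after -I) and xz 5.2 or later (where -T0 starts one thread per core). The scratch directory, the foo/ contents, and the gzip fallback are made up for the demo.

```shell
#!/bin/sh
# Sketch: multi-threaded xz through GNU tar's -I option.
# Assumes GNU tar >= 1.27 and xz >= 5.2; all paths below are hypothetical.
set -eu

demo_dir=$(mktemp -d)   # scratch directory so nothing real is touched
cd "$demo_dir"
mkdir foo
printf 'hello\n' > foo/a.txt
printf 'world\n' > foo/b.txt

# -T0 asks xz to use as many threads as there are CPU cores.
# Fall back to gzip when xz is not installed, so the sketch still runs.
if command -v xz >/dev/null 2>&1; then
    compressor='xz -T0'
    archive=foo.tar.xz
else
    compressor=gzip
    archive=foo.tar.gz
fi

# -I is kept apart from the cluster ending in f, which must be
# followed directly by the archive name.
tar -I "$compressor" -cf "$archive" foo/

# List the contents to confirm the archive is readable.
tar -tf "$archive"
```

On small inputs xz may still use a single thread, because it splits the data into large blocks before distributing the work.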

For deletion, GNU tar has the --remove-files option: https://www.gnu.org/software/tar/manual/html_node/remove-files.html (why is it in the "4.4 Options Used by --extract" section??)

Removing Files

See The section is too terse. Something more to add? An example, maybe?

--remove-files

Remove files after adding them to the archive.

I didn't test it, but it is listed in https://serverfault.com/questions/283355/correct-way-of-using-the-remove-files-option-with-tar (and https://superuser.com/questions/96860), again placed before the -f option. Note that f must be the last letter of its option cluster, so that the archive name follows it directly:

tar --remove-files -cjvf archive.tar.bz2 archive/

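So, combining both options (after installing pixz; the result can still be opened by the classic xz tool, and can also be named archive.tar.xz). -J is omitted here because -I already selects the compressor, and GNU tar rejects conflicting compression options:

tar -Ipixz --remove-files -cvf file00.tpxz file00*

or with plain xz, where -J alone is enough:

tar --remove-files -cJvf file00.tar.xz file00*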
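To see how --remove-files behaves before pointing it at real data, a throwaway run like the one below may help. It is a minimal sketch assuming GNU tar (other tar implementations may lack --remove-files). The scratch directory and sample files are invented for the demo, and the glob is narrowed to file00[0-9] so that it cannot match an archive whose name also starts with file00.

```shell
#!/bin/sh
# Sketch: archive, compress, and delete the sources in one tar call.
# Assumes GNU tar; runs entirely inside a hypothetical scratch directory.
set -eu

work=$(mktemp -d)
cd "$work"
for n in 1 2 3; do
    printf 'sample payload %s\n' "$n" > "file00$n"
done

# Prefer xz, but fall back to gzip so the demo runs without xz installed.
if command -v xz >/dev/null 2>&1; then
    comp=xz
    out=file00.tar.xz
else
    comp=gzip
    out=file00.tar.gz
fi

# file00[0-9] matches file001..file009 but not file00.tar.xz itself.
tar -I "$comp" --remove-files -cvf "$out" file00[0-9]

# Only the archive should be left; the members are listed from inside it.
ls
tar -tf "$out"
```

The sources are deleted as they are added, so for important data it may be safer to create the archive first, test it with tar -tf, and remove the originals only afterwards.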

PS: There are other parallel compressors: pigz (gzip format), and pbzip2 and lbzip2 (bzip2 format, which is less compact than xz and slower than gzip). There are also fast compressors such as lz4/lz5 or Facebook's zstd. And there is lrzip, which can use several threads for compression and can find repeated data over long distances, then compress it with lzma (the default, an xz-like method), classic gz, the faster LZO, or the very slow but effective ZPAQ.
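Which of these tools gives the best trade-off depends on the data, so a rough comparison on a sample may be worth a minute. The loop below is only a sketch: the sample file is synthetic and tiny, the timing has one-second resolution, and it tries just the compressors from this list that happen to be installed, each at its default level.

```shell
#!/bin/sh
# Sketch: compare size and wall time of whichever compressors are installed.
# The sample data is synthetic; swap in a slice of the real files for
# meaningful numbers.
set -eu

bench_dir=$(mktemp -d)   # hypothetical scratch directory
cd "$bench_dir"

# Build a small, repetitive text file and wrap it in a tar archive.
i=0
while [ "$i" -lt 2000 ]; do
    printf 'line %s: the quick brown fox jumps over the lazy dog\n' "$i"
    i=$((i + 1))
done > sample.txt
tar -cf sample.tar sample.txt

for c in gzip bzip2 xz zstd; do
    # Skip tools that are not available on this machine.
    command -v "$c" >/dev/null 2>&1 || continue
    start=$(date +%s)
    "$c" -c < sample.tar > "sample.tar.$c"
    end=$(date +%s)
    size=$(wc -c < "sample.tar.$c")
    printf '%-6s %s bytes, %s s\n' "$c" "$size" "$((end - start))"
done
```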

osgx