
The problem: I have a back-end process that at some point collects and builds a big tar file. The tar command receives a few directories and an exclude file. The process can take up to a few minutes, and I want to report the progress of the tarring in my front-end (GUI) process. (This is a big issue for a user who presses the download button and it seems like nothing is happening...)

I know I can use -v -R in the tar command and count files and size progress, but I am looking for some kind of tar pre-run / dry-run mode to help me evaluate either the expected number of files or the expected tar size.

The command I am using: `tar -jcf 'FILE.tgz' 'exclude_files' 'include_dirs_and_files'`

Thanks to everyone who is willing to assist.

yanger

3 Answers


You can pipe the output to the wc tool instead of actually making a file.

With file listing (verbose):

[git@server]$ tar czvf - ./test-dir | wc -c
./test-dir/
./test-dir/test.pdf
./test-dir/test2.pdf
2734080

Without:

[git@server]$ tar czf - ./test-dir | wc -c
2734080
doub1ejack
  • Good catch. For info to future readers, the `-` tells tar to output to stdout (where it'll be piped to wc). – dr_ Oct 14 '16 at 07:35
  • To confirm: Is this solution a "tar pre-run mode / dry run to help me evaluate either the expected number of files or the expected tar size" (without actually zipping/creating the tar)? – GuSuku Jul 17 '19 at 19:45

Why don't you run a

DIRS=("./test-dir" "./other-dir-to-test")
find "${DIRS[@]}" -type f | wc -l

beforehand? This lists all regular files (-type f), one per line, and counts them. DIRS is a bash array, so you can store the folders in a variable.

If you want to know the total size of all the stored files, you can use du:

DIRS=("./test-dir" "./other-dir-to-test")
du -c -d 0 "${DIRS[@]}" | tail -1 | awk '{print $1}'

This prints the disk usage with du, calculates a grand total (-c flag), takes the last line (e.g. `4378921 total`), and keeps just the first column with awk. Note that du reports disk usage in blocks, not the exact sum of file sizes.

yunzen
  • I note that tar and find don't always agree on the number of files; e.g., on my Mac, tar seems to skip files indexed for Finder preview whereas find does not. – wcochran Aug 14 '17 at 18:41

You can use a specific trick of GNU tar: if the archive file is /dev/null, it doesn't actually read the file contents, so the run is very fast:

root@grigio:/bidone# time tar cpf /dev/null --totals Amerighuccio/
Total bytes written: 26924871680 (26GiB, 11GiB/s)

real    0m2,320s
user    0m0,914s
sys     0m1,401s
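For the asker's use case, the total from this dry run can be captured and parsed into a number. A sketch, assuming GNU tar (which prints the `--totals` summary on stderr) and a placeholder SRC path; note this is the uncompressed archive size, not the final compressed size:

```shell
#!/bin/sh
# Sketch: capture the expected (uncompressed) archive size from a
# /dev/null dry run. SRC is a placeholder; GNU tar's --totals
# summary goes to stderr, hence the 2>&1.
SRC=./test-dir
BYTES=$(tar cpf /dev/null --totals "$SRC" 2>&1 \
        | sed -n 's/^Total bytes written: \([0-9][0-9]*\).*/\1/p')
echo "expected archive size: $BYTES bytes"
```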