
I have a directory with 27K files that I would like to tar into multiple independent tar files, each containing 5,000 files; the last one will obviously hold only about 2K (27K is not evenly divisible by 5K).

What is the fastest/easiest way to do this?

soulSurfer2010

2 Answers


First create files with the filenames for each archive:

find <directory> | split -l 5000 - files.

Then create the tars:

for f in files.*; do tar -cf "$f.tar" --files-from "$f"; done

Untested but the basic idea should work.
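
For instance, with GNU coreutils defaults split names its chunks files.aa, files.ab, …, so for roughly 27K paths a run might look like this (the directory name ./photos is only a placeholder):

find ./photos | split -l 5000 - files.
ls files.*            # files.aa .. files.af, 5000 names each (the last one shorter)
for f in files.*; do tar -cf "$f.tar" --files-from "$f"; done
ls files.*.tar        # files.aa.tar .. files.af.tar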

Rob Wouters
    This is probably the quickest (in terms of typing) / most portable (should work on any unix - `find`, `split` and `for` loops in Bourne shells are everywhere) solution :) – voretaq7 Jan 26 '12 at 07:47
  • Use "find -type f" so that the pathnames passed into tar don't include directories. Alternative: for small split sizes, you could could use xargs insteads of split and avoid all the temp files. – Liudvikas Bukys Jan 26 '12 at 14:25
#!/bin/bash
files=( dir/* )

n=1
for ((i = 0; i < ${#files[@]}; i+=5000)); do 
  IFS=$'\n' tar cvzf foo_$((n++)).tgz --files-from - <<<"${files[*]:i:5000}"
done

Explanation

Create an array called files containing everything in directory dir. Iterate over the array, advancing the index 5000 at a time, and use the offset/length parameter expansion "${files[*]:i:5000}" to take a window of up to 5000 entries per iteration. The IFS=$'\n' prefix is there so that the [*] expansion joins the window one name per line for --files-from -, and the archive name is incremented with $((n++)) on each call.
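
If you would rather not rely on the IFS-prefixed here-string, here is a sketch of an equivalent loop (not from the original answer) that feeds tar the same one-name-per-line input through printf, still assuming no filename contains a newline:

#!/bin/bash
files=( dir/* )
n=1
for ((i = 0; i < ${#files[@]}; i += 5000)); do
  # print the current window one path per line and stream it to tar
  printf '%s\n' "${files[@]:i:5000}" | tar czf "foo_$((n++)).tgz" --files-from -
done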

SiegeX