I'm having trouble getting my data off my shared hosting account. I have a folder of approximately 20 GB that contains about 40,000 images. I tried to archive that folder, splitting the archive into multiple files:

tar -cvpj 'home/public_html/images/'/ | split -d -b 100m - images.tar.bz2.

It works, but the process takes too long, and I suspect my hosting provider kills it partway through archiving.

As a result I can't use the archive files: extraction fails with a corruption error, and when I run the command again it starts over, archiving every file and overwriting the parts already written.

So I have decided to archive the files by date instead, for example with each archive containing only the files uploaded in one month. I've tried a couple of commands, but I could not find a way to do this.

I found the question "How do you only tar files in a directory based on a specific file name?", but I need to select files by a specific date range.
How can I archive/compress files filtered by their date? Or is there another way to get my files off the server? I tried using cPanel, but it skipped that folder.

Jama

1 Answer

I think I would actually do this using find and then pass its output to tar. Using your example, let's assume you want files that are between 60 and 90 days old.

find /home/public_html/images -type f -daystart -mtime -90 -and -mtime +60 -print0 | tar -cjf images_60-90.tar.bz2 --null -T -

This lists all the files that were last modified more than 60 days ago and less than 90 days ago and places them in the tarball named images_60-90.tar.bz2. The -print0 and --null -T - options (GNU find and GNU tar) protect you from file names containing spaces, and because tar reads the list from standard input, it does not matter if there are so many files that they would exceed the command-line maximum length (which can be found by running the command getconf ARG_MAX). Piping through xargs into tar's append mode is not an option here: tar cannot append to a compressed archive, and -A concatenates archives rather than adding files. I haven't tested that command, so you may have to do more tweaking.
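Since the question asks for one archive per month, fixed calendar boundaries may be easier to work with than day offsets. The following is a minimal, untested sketch, assuming GNU find (for -newermt) and GNU tar (for -a, --null and -T). The name archive_range is a hypothetical helper, and the dates and file names are only examples.

```shell
#!/bin/sh
# archive_range (hypothetical helper): archive the files under a directory
# whose modification time is after START and not after END.
# Relies on GNU extensions: find -newermt, tar -a / --null / -T.
archive_range() {
    src=$1      # directory to scan
    start=$2    # lower bound, e.g. 2013-01-01 (midnight, exclusive)
    end=$3      # upper bound, e.g. 2013-02-01 (midnight, inclusive)
    out=$4      # archive to create, e.g. images_2013-01.tar.bz2

    # -print0 / --null keep names with spaces or newlines intact;
    # -T - makes tar read the file list from standard input;
    # -a picks the compression from the suffix of "$out".
    find "$src" -type f -newermt "$start" ! -newermt "$end" -print0 |
        tar -caf "$out" --null -T -
}

# Example use (paths are illustrative):
# archive_range /home/public_html/images 2013-01-01 2013-02-01 images_2013-01.tar.bz2
```

Running it once per month, with a separate output file each time, keeps every run short, and a run that gets killed only costs that one month's archive.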

If, however, you know that there are no spaces in any file names, and the combined length of the file names stays below the value of ARG_MAX, you can simplify your command a bit.

tar -cjf images_60-90.tar.bz2 $(find /home/public_html/images -type f -daystart -mtime -90 -and -mtime +60)
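Whether the file list fits on one command line can be estimated before relying on it. This is only a rough sketch: fits_in_arg_max is a hypothetical helper, and comparing against half of ARG_MAX is an arbitrary safety margin, since the environment and other overhead also count toward the limit.

```shell
#!/bin/sh
# fits_in_arg_max (hypothetical helper): succeed if the paths printed by
# `find "$@"` add up to less than half of ARG_MAX bytes.
fits_in_arg_max() {
    # Each path is followed by one NUL byte, like its terminator in argv.
    bytes=$(find "$@" -print0 | wc -c)
    limit=$(getconf ARG_MAX)
    [ "$bytes" -lt "$((limit / 2))" ]
}

# Example use (paths are illustrative):
# if fits_in_arg_max /home/public_html/images -type f -daystart -mtime -90 -mtime +60; then
#     echo "short enough for a single tar command line"
# fi
```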

Scott Pack