
This is the structure of my backup:

  • Backups are stored to a directory named cron_hourly_backup
  • Inside that directory, a new directory is created each day, named in ddmmyyyy format.
  • Each of these directories holds backups of 5-6 dbs, dumped every hour by a cron job. Each hour's backup files get unique names from a timestamp (e.g. db1_000000.zip ... db5_000000.zip up to db1_230000.zip ... db5_230000.zip).

Now I want to programmatically delete all backup files older than 1 day (OR, keep today's and yesterday's all backup), But keep one latest db (of all 5 dbs) for each day. How can I achieve this?

Currently I'm doing this:

find . -type f \( -name "*_00*" \
-o -name "*_01*" -o -name "*_02*" \
-o -name "*_03*" -o -name "*_04*" \
-o -name "*_05*" -o -name "*_06*" \
-o -name "*_07*" -o -name "*_08*" \
-o -name "*_09*" -o -name "*_10*" \
-o -name "*_11*" -o -name "*_12*" \
-o -name "*_13*" -o -name "*_14*" \
-o -name "*_15*" \
-o -name "*_16*" -o -name "*_17*" \
-o -name "*_18*" -o -name "*_19*" \
-o -name "*_20*" -o -name "*_21*" \
-o -name "*_22*"  \) -delete

This works, but there are two problems:

  1. If the 23rd-hour backup is missing for any day, I lose all files for that day.
  2. It also deletes today's and yesterday's backups.

Any suggestions on how to solve these two issues would be much appreciated.

Ash
  • Anything modifying them after creation? `find . -type f -name "db*" -mtime +1 -delete` ? – arco444 Sep 12 '17 at 11:37
  • No, but there are 5-6 db so I'm using * to include them all – Ash Sep 12 '17 at 11:42
  • Okay, so I can use -mtime for my 2nd problem. Any solution for the 1st? I want to find the latest DB backup of each day and delete all remaining backups. – Ash Sep 12 '17 at 11:45
  • I think this will be tricky without writing a script to do it. Might be possible using some trickery with an `-exec` in your find command but will become difficult to read – arco444 Sep 12 '17 at 11:50

1 Answer


Not sure what "But keep one latest db (of all 5 dbs) for each day" means. If it means "for each day keep only the last (in lexicographic order) file", and if you have the coreutils date utility, a bash script like this could work (not tested):

#!/usr/bin/env bash

declare -a l                         # array of backup files
bd=cron_hourly_backup                # backup dir
td=$( date +"%d%m%Y" )               # today
yd=$( date -d yesterday +"%d%m%Y" )  # yesterday
for n in "$bd"/*; do
    if [ ! -d "$n" ]; then
        continue # skip if not a directory
    fi
    if [[ "$n" == "$bd/$td" || "$n" == "$bd/$yd" ]]; then
        continue # skip if today or yesterday
    fi
    l=( "$n"/* ) # populate array (glob results are sorted, full paths)
    # loop over all backup files except the last one
    for (( i = 0; i < ${#l[@]} - 1; i += 1 )); do
        echo "rm -f ${l[i]}" # comment when OK
#       rm -f "${l[i]}"      # uncomment when OK
    done
done
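
The `date -d yesterday` call above is specific to GNU date. As a rough sketch only, assuming the alternative is a BSD/macOS `date` that accepts `-v-1d`, a small helper could try both forms. The name `yesterday_ddmmyyyy` is made up for illustration, and other `date` implementations may support neither form:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print yesterday's date as ddmmyyyy.
# Tries the GNU syntax first, then falls back to the BSD/macOS -v syntax.
yesterday_ddmmyyyy() {
    date -d yesterday +"%d%m%Y" 2>/dev/null ||
        date -v-1d +"%d%m%Y"
}

yd=$(yesterday_ddmmyyyy)
```

The resulting `yd` could then replace the `yd=...` assignment in the script.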

If you want to keep the last of each dbN_* with 1<=N<=6, then you can use one more loop level (not tested):

#!/usr/bin/env bash

declare -a l                         # array of backup files
bd=cron_hourly_backup                # backup dir
td=$( date +"%d%m%Y" )               # today
yd=$( date -d yesterday +"%d%m%Y" )  # yesterday
for n in "$bd"/*; do
    if [ ! -d "$n" ]; then
        continue # skip if not a directory
    fi
    if [[ "$n" == "$bd/$td" || "$n" == "$bd/$yd" ]]; then
        continue # skip if today or yesterday
    fi
    for (( j = 1; j <= 6; j += 1 )); do
        l=( "$n/db${j}_"* ) # populate array (glob results are sorted)
        # loop over all backup files except the last one
        for (( i = 0; i < ${#l[@]} - 1; i += 1 )); do
            echo "rm -f ${l[i]}" # comment when OK
#           rm -f "${l[i]}"      # uncomment when OK
        done
    done
done
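
If the number of databases is not fixed, the inner loop over `j` could be replaced by grouping on whatever precedes the last underscore in each file name. This is an untested-in-production sketch, assuming names of the form `<db>_<HHMMSS>.zip` so that sorted glob order within one database is also time order. The function name `list_stale_backups` is made up for illustration:

```shell
#!/usr/bin/env bash

# Hypothetical helper: print every file in directory $1 except the newest
# one per database prefix (the part of the name before the last "_").
list_stale_backups() {
    local dir=$1 f group
    for f in "$dir"/*_*; do
        [ -f "$f" ] || continue      # also skips an unmatched glob
        group=( "${f%_*}"_* )        # all backups sharing this prefix, sorted
        # Anything that is not the last entry of its group is stale.
        if [ "$f" != "${group[${#group[@]} - 1]}" ]; then
            printf '%s\n' "$f"
        fi
    done
}

# Possible use inside the per-day loop, once the printed list looks right:
#   list_stale_backups "$n" | while IFS= read -r f; do rm -f -- "$f"; done
```

Because the function only prints paths, it behaves like the `echo "rm -f ..."` dry run above until its output is actually passed to `rm`.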
Renaud Pacalet