
I need to automate the clean-up of a Linux-based FTP server that only holds backup files.

Our "/var/DATA" directory contains a collection of directories. Any directory here that is used for backups begins with "DEV". Each "DEVxxx*" directory holds the actual backup files, plus any user files that may have been needed in the course of maintenance on these devices.

We only want to retain the following files - anything else found in these "DEVxxx*" directories is to be deleted:

The newest two backups:  ls -t1 | grep -E -m2 '^[[:digit:]]{8}_Config$'
The newest backup done on the first of the month:  ls -t1 | grep -E -m1 '^[[:digit:]]{6}01_Config$'
Any file that was modified less than 30 days ago:  find . -mtime -30
Our good configuration file:  ls verification_cfg

Anything that doesn't match the above should be deleted.

How can we script this?

I'm guessing a BASH script can do this, and that we can create a cron job to run daily to perform the task.

Brian Tompsett - 汤莱恩
Calab
2 Answers


Something like this perhaps?

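cd /var/DATA || exit 1
for d in DEV*/ ; do                       # d is e.g. "DEV001/"
  ( cd "$d" || exit
    { ls -t1 | grep -E -m2 '^[[:digit:]]{8}_Config$' ;
      ls -t1 | grep -E -m1 '^[[:digit:]]{6}01_Config$' ;
      find . -maxdepth 1 -type f -mtime -30 -printf '%P\n' ;
      ls -1 verification_cfg 2>/dev/null ;
    } | sed "s|^|/$d|" )                  # anchor each name, e.g. /DEV001/name
# rsync uses the first matching rule, so the includes must come before the exclude
done | rsync -a --include-from=- --exclude='/DEV*/*' /var/DATA/ /var/DATA.bak/
rm -rf /var/DATA
mv /var/DATA.bak /var/DATA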
Ansgar Wiechers
  • Looks good, except that during the process, there is a slight chance that we try to access one of our wanted files and it will only exist in /var/DATA.bak. Isn't there a way to rm all files, except those listed? – Calab Sep 18 '12 at 11:12
  • Try `rsync -a --delete /var/DATA.bak/ /var/DATA/` instead of `rm -rf /var/DATA`. – Ansgar Wiechers Sep 18 '12 at 11:21

For what it's worth, here is the bash script I created to accomplish my task. Comments are welcome.

#!/bin/bash

# This script follows these rules:
#
#  - Only process directories beginning with "DEV"
#  - Do not process directories within the device directory
#  - Keep files that match the following criteria:
#     - Keep the two newest automated backups
#     - Keep the six newest automated backups generated on the first of the month
#     - Keep any file that is less than 30 days old
#     - Keep the file "verification_cfg"
#
#  - An automated backup file is identified as eight digits, followed by "_Config"
#    e.g.  20120329_Config


# Remember the current directory
CurDir=$(pwd)

# FTP home directory
DatDir='/var/DATA/'
cd "$DatDir" || exit 1

# Only process directories beginning with "DEV"
for i in DEV*/ ; do
 # Skip anything that is not a directory (including an unmatched glob)
 [ -d "$DatDir$i" ] || continue

 # Never carry on in the wrong directory: the rm below acts on the current one
 cd "$DatDir$i" || continue
 echo "Doing $i"

 # Set the GROUP EXECUTE bit on all files (top level only, as the rules require)
 find . -maxdepth 1 -type f -exec chmod g+x {} +

 # Find the two newest automated config backups
 for j in $(ls -t1 | grep -E -m2 '^[0-9]{8}_Config$') ; do
  chmod g-x "$j"
 done

 # Find the six newest automated config backups generated on the first of the month
 for j in $(ls -t1 | grep -E -m6 '^[0-9]{6}01_Config$') ; do
  chmod g-x "$j"
 done

 # Find all files that are less than 30 days old
 find . -maxdepth 1 -type f -mtime -30 -exec chmod g-x {} +

 # Find the "verification_cfg" file
 find . -maxdepth 1 -type f -name verification_cfg -exec chmod g-x {} +

 # Remove any files that still have the GROUP EXECUTE bit set
 find . -maxdepth 1 -type f -perm -g=x -exec rm -f {} +

done

# Back to the user's current directory
cd "$CurDir"
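
Before letting a script like this delete anything, it may help to see what it would remove. The following is only a sketch of a dry run that applies the same keep rules to a single device directory and prints the files that match none of them. The function name list_deletable is hypothetical and not part of the script above, and the sketch assumes GNU find and grep, as found on a typical Linux system.

```shell
#!/bin/bash
# Dry run: print the files in one device directory that the keep rules
# would NOT protect. Nothing is modified or deleted.
# list_deletable is a hypothetical helper name used for illustration only.
list_deletable() (
  # The function body runs in a subshell, so this cd does not leak out.
  cd "$1" || exit 1

  # Build the keep list, one file name per line.
  keep=$(
    # The two newest automated backups
    ls -t1 | grep -E -m2 '^[0-9]{8}_Config$'
    # The six newest automated backups made on the first of the month
    ls -t1 | grep -E -m6 '^[0-9]{6}01_Config$'
    # Any top-level file modified less than 30 days ago
    find . -maxdepth 1 -type f -mtime -30 -printf '%P\n'
    # The known-good configuration file
    if [ -f verification_cfg ]; then echo verification_cfg; fi
  )

  # Print every top-level file whose exact name is not in the keep list.
  # grep exits with 1 when nothing is left to print, hence the "|| true".
  find . -maxdepth 1 -type f -printf '%P\n' | sort |
    grep -Fxv -e "$keep" || true
)
```

Running `list_deletable /var/DATA/DEV001` (the directory name is only an example) prints one file name per line. File names containing newlines are not handled, which should not matter for names of the form 20120329_Config.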
Calab