3

I'm creating a simple database backup solution for a client using web hosting at DataFlame.

The web hosting account provides access to cron but not a shell.

I have a database backup script creating regular backups and I want to automatically remove those more than N days old.

I attempted to use

find $backup_dir -mtime +$keep_days -name "*db.tar.gz" -delete

however the user executing the script does not have permission to run find.

Can you suggest how to implement this without using the find command?

  • 2
    If it is a web host, you must have access to one of the popular web app programming languages like PHP? Why not just hack up a quick PHP script that recursively walks a directory and performs actions on each object based on whatever rules you define. – Zoredache Mar 27 '12 at 23:58
  • Do you have access to modify your db backup script? If so, which language is it in? – Yanick Girouard Mar 28 '12 at 00:41

4 Answers

1

One hack-ish idea might be to incorporate the days since the epoch (i.e., date +%s divided by 86400), mod $keep_days, into the name of the file.

In that case, you won't have to remove older backup files. You would just overwrite the old ones, once the days-since-epoch modulo $keep_days number comes up again.

Something like this:

#!/bin/bash

# Number of days of backups to keep before the names repeat.
keepdays=60
# Days since the Unix epoch (epoch seconds / 86400).
epochdays=$(expr $(date +%s) / 86400)

# Index that cycles through 0..(keepdays - 1), repeating every $keepdays days.
backupindx=$(expr $epochdays % $keepdays)

# Writing to this name overwrites the backup from $keepdays days ago.
backupfile=/path/to/backup/file.${backupindx}.db.tgz

So, today, 27 March 2012, you'd have the backup file file.6.db.tgz, which will be overwritten in 60 days.
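You can check the cycle by hand (values shown are for 27 March 2012, UTC; expr is assumed available since the script above already uses it):

$ expr $(date +%s) / 86400   # days since the epoch
15426
$ expr 15426 % 60            # position in the 60-day cycle
6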

cjc
1

Here's an abridged version of the script I used in the end, based on @cjc's answer.

#!/bin/sh

# Script to backup ... database.
#
# A rolling backup is used. The size of the period backed up is configurable.
# The period size is expressed in terms of an arbitrary time unit, "timeunit".
#
# Files are saved with format:
# <date>-...-<index within period>.sql.tar.gz
#
# Author: Calum J. Eadie

### Configuration

backup_dir=/home/..../backups
# The size of a timeunit. E.g. 300 for a timeunit that is 5 minutes long.
seconds_per_timeunit=$(expr 60 \* 60 \* 6) # 6 hours
# The size of backup period in timeunits.
keep_timeunits=$(expr 4 \* 30) # 30 days

### Script

# Form files names

date_string=`date +%Y-%m-%d-%H-%M-%S`
# Time since unix epoch in timeunits
epoch_timeunits=$(expr $(date +%s) / $seconds_per_timeunit)
# Index unique to each time period within the backup window.
backup_index=$(expr $epoch_timeunits % $keep_timeunits)
raw=$backup_dir/$date_string-...-$backup_index.sql
compressed=$raw.tar.gz

# Remove old backup (there may be nothing to remove until the cycle wraps around)

rm -v $backup_dir/*-...-$backup_index.sql.tar.gz

# Create new backups

mysqldump -u ... -p... --databases ... --add-drop-database --add-drop-table > $raw
tar czf $compressed $raw
rm $raw
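Since the host only exposes cron, an entry along these lines schedules the script once per timeunit; the script path here is hypothetical, and the schedule matches the 6-hour timeunit above:

# m h dom mon dow command
0 */6 * * * /home/..../backup.sh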
0

Are you sure you don't have access to find? Or is it just not in the path? Try running it as /usr/bin/find, see if that works for you.

If that doesn't work, you're in a tough spot. If find isn't available and you don't have access to a shell, you're left with no idea what is available. If Perl were available, it would be a fairly trivial Perl script to perform the above task. Using just shell... I'd probably try something like this:

ls -t /backup_dir/*.db.tar.gz | sed "1,5d" | xargs rm -f

Where 5 is the number of backups you want to keep; with one backup per day, that's 5 days (modify as needed).

Short explanation: ls -t lists the directory contents sorted by modification time (newest first). The sed "1,5d" deletes the first 5 lines from the stream (our 5 newest files), so those names never reach rm and the files are kept. Finally, xargs passes the remaining, older file names to rm to remove them.
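If GNU stat happens to be installed (an assumption; nothing in the question confirms it), a variant of the same idea that avoids parsing ls output, though it is still unsafe for file names containing whitespace:

stat --format='%Y %n' /backup_dir/*.db.tar.gz | sort -rn | sed "1,5d" | cut -d' ' -f2- | xargs rm -f

Here %Y is the modification time in seconds since the epoch, so sort -rn orders the backups newest first just as ls -t does.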

Christopher Cashell
  • Never, ever, ever, ever attempt to parse the output of `ls`. Use `stat` if available. http://mywiki.wooledge.org/ParsingLs Different flavors of `stat` use different switches, but most of them have `printf`-like formatting codes available. – JeffG Mar 28 '12 at 00:18
  • 1
    @JeffG Sorry, gonna have to call BS on you on that one. It's very true that there are concerns and risks when using `ls`. Particularly, there is a definite risk running it on untrusted or uncontrolled files. However, that does not mean you should *never* do it. Sometimes you don't have a choice, and sometimes it *is* the best option. In this case, it's a Database backup directory, and it's *relatively* safe to assume, particularly when you are selecting what files to delete based on name, that you can trust the names of the files (including trusting them to be free of "bad" characters). – Christopher Cashell Mar 28 '12 at 01:37
  • I see you didn't read the link... `find` works recursively (without `-maxdepth 1`) so it's safe to assume *nothing*. – JeffG Mar 28 '12 at 22:16
  • I did read the link. I agree with the general sentiment expressed in it. I absolutely disagree with your **thou shalt never** take on it. Best practices are not absolutes, and there's always exceptions. As for `find`'s recursion, the original request specifies a backup directory and DB backup tarballs. He's using find to get files older than N days, not for the recursive filesystem scan. – Christopher Cashell Mar 28 '12 at 23:23
  • "I absolutely disagree with your thou shalt never take on it. Best practices are not absolutes," I agree, absolutely! Except for the part about the `find` recursion. Can't assume that there were no folders, because it's unspecified; and can't guarantee the structure of the backups won't change in the future to add folders. The request was a replacement for the `find` statement, so I provided one which more closely emulates this. This isn't important in a one-off situation, but say he has many servers, and needs a `find` replacement for one of them. Gotta keep it same as possible. – JeffG Mar 29 '12 at 11:28
0

You said you had no shell access. Assuming you meant no interactive shell access, this Bash script recursively deletes db tarballs older than 3 days, calculated by comparing each tarball's modification time against the current time minus 3 days' worth of seconds (259200).

#!/bin/bash

# Usage: pass files and/or directories as arguments;
# directories are recursed into.
for linkname; do
  if [[ -d "$linkname" ]]; then
    # Re-invoke this script on the directory's contents.
    "$0" "$linkname"/*
  elif [[ "$linkname" =~ db\.tar\.gz$ ]]; then
    # %Y is the modification time in seconds since the epoch;
    # anything modified more than 259200 seconds (3 days) ago goes.
    if (( $(stat --format=%Y "$linkname") < ( $(date +%s) - 259200 ) )); then
      rm -f "$linkname"
    fi
  fi
done
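A sketch of how it might be invoked (the script name and paths are hypothetical):

# either form works; the script recurses into directories itself
$ ./cleanup.sh /home/..../backups
$ ./cleanup.sh /home/..../backups/*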
JeffG