
I've created a bash script to scan the whole server for viruses with ClamAV. The script runs via cron every night, so I want to scan only the files that have been added in the last 24 hours. For now I am using this command in my script:

find /home -type f -mmin -1440  -print0 | xargs -0 -r clamscan --infected

But it's too slow. Is the find command the reason it's slow? If so, what is a better way to scan only the files from the last 24 hours with clamscan? Does ClamAV have an option for doing this?

Any Help would be much appreciated.

Ehsan
  • Is it `find` that's being slow or `clamav`? How long does the command take without the `xargs/clam` pipe? And what does "too slow" mean anyway? – Sven Feb 18 '13 at 12:50
  • The `find` command without the `xargs/clam` takes about 10 minutes and the whole command takes about 2 hours on my server. I think maybe if clamav has an option for my purpose, it would be faster than this. – Ehsan Feb 18 '13 at 13:11
  • You need to use clamdscan. clamscan is initialising the engine for every single file. – Patrick Dec 17 '15 at 00:04
  • Not trying to resurrect anything, but why only scan the files in the last 24 hours? Virus definitions are being updated ALL the time, and a file that is clean now, may not necessarily be clean -later-. – Barry Chapman Nov 26 '16 at 07:13
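The clamdscan suggestion from the comments can be sketched as a drop-in replacement for the original pipeline. This assumes the `clamav-daemon` service is installed and running; unlike clamscan, clamdscan hands each file to the already-running daemon, so the signature database is loaded once rather than per invocation:

```shell
# Same file selection as the original command, but scanning through clamd.
# --fdpass passes open file descriptors to the daemon, so it can scan
# files the daemon's own user could not read directly.
find /home -type f -mmin -1440 -print0 \
  | xargs -0 -r clamdscan --fdpass --infected
```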

3 Answers


I stumbled onto this page when I was looking for a clamscan script. I followed the advice above and got it working with:

#!/usr/bin/bash
# Create Hourly Cron Job With Clamscan

# Directories to scan
scan_dir="/home"

# Temporary file
list_file=$(mktemp -t clamscan.XXXXXX) || exit 1

# Location of log file
log_file="/var/log/clamav/hourly_clamscan.log"

# Make list of new files
if [ -f "$log_file" ]
then
        # scan files newer than the log file
        find "$scan_dir" -type f -cnewer "$log_file" -fprint "$list_file"
else
        # scan last 60 minutes
        find "$scan_dir" -type f -cmin -60 -fprint "$list_file"
fi

if [ -s "$list_file" ]
then
        # Scan files and remove (--remove) infected
        clamscan -i -f "$list_file" --remove=yes > "$log_file"

        # If any infected files were detected, send an email alert.
        # The clamscan summary line reads "Infected files: N"; matching a
        # non-zero leading digit avoids miscounting values like 10.
        if grep -q "^Infected files: [1-9]" "$log_file"
        then
                HOSTNAME=$(hostname)
                grep "FOUND" "$log_file" | mail -s "VIRUS PROBLEM on $HOSTNAME" -r clam@nas.local you@yourhost.com
        fi
else
        # remove the empty file, contains no info
        rm -f "$list_file"
fi
exit

It was an hourly script in my case, but it should also work for a daily run (modify the second `find`).
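For a daily run, only the fallback branch needs to change: 60 minutes becomes 1440. A sketch of that branch, reusing the same variable names as the script above:

```shell
# Hypothetical daily variant of the fallback branch
# (same variable names as the script above)
scan_dir="/home"
list_file=$(mktemp -t clamscan.XXXXXX) || exit 1

# -cmin -1440: inode status changed within the last 24 hours
find "$scan_dir" -type f -cmin -1440 -fprint "$list_file"
```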

Royco
  • The speed improvement is nothing short of shocking. I was averaging 10s per PDF file uploaded to the server by users. Running clamscan as an -exec of find was causing */30 min cron jobs to stack up during the day. Now it's taking ~30sec to scan 250 odd files at a time. – danielcraigie Aug 12 '16 at 15:34
  • I'd suggest using mmin instead of cmin. This way if a file is created clean, and later modified as a virus, it should be rescanned. – Wranorn May 02 '17 at 21:29

Depending on how many files are actually involved, I don't think that 2 hours is too long for a virus scan. Anyway, you could try to improve the speed in the following way:

Output the find result into a file instead of piping it into xargs, and then run clamscan with the --file-list=FILE option. This would likely improve the run time, because clamav would only need to start and initialize once instead of multiple times. Please leave a comment and tell me how much this sped things up, if at all.
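A minimal sketch of that two-step approach (paths are examples; note that --file-list reads one path per line, so file names containing newlines would be mishandled):

```shell
# Build the list once, then let clamscan initialize a single time
# for the whole batch
list=$(mktemp -t clamlist.XXXXXX) || exit 1
find /home -type f -mmin -1440 > "$list"

# --infected: only report infected files
clamscan --infected --file-list="$list"
rm -f "$list"
```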

Another option (or an additional one) would be to limit your scan to certain vulnerable file types, but personally, I don't like this approach.
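If you did want to try the file-type filter, find can apply it before clamscan ever runs; the extension list below is purely illustrative:

```shell
# Restrict the scan to a few file extensions (illustrative list only)
find /home -type f -mmin -1440 \
    \( -name '*.php' -o -name '*.pl' -o -name '*.exe' \) -print0 \
  | xargs -0 -r clamscan --infected
```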

Sven

I haven't tested this out yet, but I'm planning on integrating my clamscan run with my backup run. My backup tool already produces a list of modified files in order to perform an incremental backup, so why compute the same file list twice?

I use dirvish to create my backups, which uses rsync underneath. At the end of each run, I get a log.bz2 report that includes the list of files that were backed up.

This genclamfilelist.sh script will extract the file list from the log.bz2 of the latest backup and print it out:

#!/bin/sh

AWK=/usr/bin/awk
BUNZIP2=/bin/bunzip2
HEAD=/usr/bin/head
HOSTNAME=/bin/hostname
LS=/bin/ls
SED=/bin/sed

SNAPSHOT_HOME=/path/to/dirvish/snapshots

   for vaultHome in ${SNAPSHOT_HOME}/*; do

      # vault naming convention: <hostname>-<sharename>
      vaultName="`echo ${vaultHome} | ${SED} -e 's/^.*\/\([^\/]\+\)$/\1/'`"
      vaultHost="`echo ${vaultName} | ${SED} -e 's/\([^\-]\+\)\-.*$/\1/'`"

      # only proceed if vault being considered is for the same host
      if [ "${vaultHost}" = "`${HOSTNAME}`" ]; then
         # -d: list the snapshot directories themselves (newest first),
         # not their contents, so head -1 always yields a directory name
         logfile="`${LS} -1dt ${vaultHome}/20??????-???? \
                      | ${HEAD} -1 \
                      | ${SED} -e 's/^\(.*\)\:$/\1/'`/log.bz2"

         if [ -f ${logfile} ]; then
            ${BUNZIP2} -c ${logfile} | ${AWK} '
               /^$/ {
                  if (start) {
                     start=0
                  }
               }

               {
                  if (start) {
                     print $0
                  }
               }

               /^receiving\ file\ list\ \.\.\.\ done$/ {
                  start=1
               }' | ${SED} -e "s/^\(.*\)$/\/\1/"
         fi
         # else skip - no log file found, probably backup didn't run or failed
      fi
      # else skip - another vault
   done

exit 0

This /etc/cron.d/clamav cron script will use the file list:

# /etc/cron.d/clamav: crontab fragment for clamav
# (cron does not run command substitutions in crontab variable
#  assignments, so the hostname is expanded inside the command itself)

# run every night
0 19 * * *     root      FILELIST=/tmp/clamav_filelist_$(/bin/hostname).txt; /usr/bin/test -f "$FILELIST" && /usr/bin/clamscan --any-desired-options --file-list="$FILELIST" && /bin/rm "$FILELIST"

Since I use dirvish, I modified its /etc/dirvish/dirvish-cronjob to call the first script to generate the file list for use by the last script:

# ...
/usr/sbin/dirvish-expire --quiet && /usr/sbin/dirvish-runall --quiet
rc=$?

# v--- BEGIN ADDING NEW LINES
touch /tmp/clamav_filelist_`hostname`.txt
chmod 400 /tmp/clamav_filelist_`hostname`.txt
/usr/local/bin/genclamfilelist.sh >> /tmp/clamav_filelist_`hostname`.txt
# ^--- END ADDING NEW LINES

umount /mnt/backup0 || rc=$?
# ...
jia103