1

I have several cron tasks, each of which leaves a separate log file. Successful tasks generate no output, so I'm getting a lot of empty logs.

I'd like to clean them up automatically every day. Asking find for files of size 0 is easy, but I want to be sure I'm not removing a log that has just been created by a running task and not closed yet.

Is there a way to tell find to skip open files, or do I need to resort to lsof?

emesik
  • 113
  • 4
  • 1
    Assuming your cron tasks complete in less than 24 hours, one simple approach might be to wait and only delete zero-byte files that are more than a day old: `find folderName -empty -mtime +1 -exec rm {} \;` – Jim L. Nov 20 '19 at 20:16

1 Answer

2

As far as I know, there is no straightforward way to do this with find alone.

Solution One

Generate a list of the files currently open in the target folder (lsof.lst), generate a find listing of the same folder, and then display the entries from the find listing that are not in lsof.lst.

To generate lsof.lst, use the following command:

lsof +D folderName | awk '{ if(NR>1)print $9 }' | sort | uniq > lsof.lst

Then use the following command to show the files in the same folder that are not currently open:

find folderName | grep -v -f lsof.lst
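Since the question is only about empty logs, the same idea can be narrowed to zero-byte regular files. The following is a minimal sketch, not a tested cleanup script: `list_closed_empty` is a hypothetical helper name, and it assumes the paths in the list of open files are spelled exactly the way find prints them (for example, both absolute) and contain no newlines.

```shell
# list_closed_empty DIR OPEN_LIST  (hypothetical helper, for illustration only)
# Prints the empty regular files under DIR that are not named in OPEN_LIST.
list_closed_empty() {
    dir=$1
    open_list=$2
    # -type f -empty : regular files of size zero
    # grep -v -x -F  : drop lines that are exactly equal to a line of OPEN_LIST,
    #                  treating each entry as a fixed string, not a regex
    find "$dir" -type f -empty | grep -v -x -F -f "$open_list"
}

# Possible usage (assumes lsof is installed and /path/to/logs is absolute):
#   lsof +D /path/to/logs | awk '{ if(NR>1)print $9 }' | sort | uniq > lsof.lst
#   list_closed_empty /path/to/logs lsof.lst
```

Printing the candidates first and only deleting after reviewing the list seems the safer order. A file can still be opened between the lsof call and the deletion, so this narrows the race rather than removing it.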

Solution Two

You could also do it in one go like this:

find folderName | grep -v -E `lsof +D folderName | awk '{ if(NR>1)print $9 }' | sort | uniq | awk '{print $0}' ORS='|' | sed 's/.$//'`

Explanation

I will now explain the command so that you can adapt or improve it, and reuse the individual command-line tools in the future.

find folderName generates a list of all files in that folder and its subfolders. That output is piped to grep, which is used here with the -v switch to exclude every line that matches the pattern given after -E. The result is the output of find minus the items named in that pattern.

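The trick here is to generate the list of open files and put it in the format that grep -v -E expects. With -E, grep treats the pattern as an extended regular expression, in which '|' separates alternatives, so a list of file names joined by '|' matches any one of them.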
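As a small illustration of that filtering step on its own, the sketch below feeds three made-up file names to grep in place of the find output and excludes two of them. The names are placeholders, not taken from the question.

```shell
# Three sample names stand in for the output of find.
# The pattern lists two "open" files as alternatives separated by '|'.
remaining=$(printf '%s\n' logs/a.log logs/b.log logs/c.log |
    grep -v -E 'logs/a\.log|logs/c\.log')
echo "$remaining"   # prints: logs/b.log
```

The dots are escaped here because an unescaped '.' in a regular expression matches any character. The one-liner above does not escape them, which is usually harmless for log names but can exclude more than intended.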

lsof +D folderName generates the list of open files in that folder, but the list includes a header line and many columns, only one of which is the file name, and it may contain duplicates. So we use awk '{ if(NR>1)print $9 }' to do two things: skip the first line with if(NR>1), and print only the column containing the file name with print $9. The result is a list of the names of the open files in that folder, without the header.
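To see what the awk step does in isolation, here is a sketch that runs it on a made-up nine-column table. The table is only a stand-in with the same shape (one header line, file name in the ninth field); it is not real lsof output.

```shell
# Placeholder input: a header line plus two rows, nine whitespace-separated fields each.
sample='C1 C2 C3 C4 C5 C6 C7 C8 NAME
x x x x x x x x /var/log/job/a.log
x x x x x x x x /var/log/job/b.log'

# Skip the header (NR>1) and keep only the ninth field.
names=$(printf '%s\n' "$sample" | awk '{ if(NR>1)print $9 }')
echo "$names"
```

Because awk splits fields on whitespace, a path containing spaces would be cut at the first space, so this approach assumes log names without spaces.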

To remove duplicates, the output is piped to sort and then uniq. The next command, awk '{print $0}' ORS='|', joins the list into a single line with the entries separated by '|', and the final sed 's/.$//' removes the trailing '|', which would otherwise add an empty alternative to the pattern.
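The de-duplication and joining stages can be tried on their own. In this sketch, three sample names (one repeated) stand in for the awk output described above.

```shell
# sort | uniq removes the duplicate, awk joins the lines with '|',
# and sed strips the final '|' that awk leaves behind.
joined=$(printf '%s\n' b.log a.log b.log |
    sort | uniq | awk '{print $0}' ORS='|' | sed 's/.$//')
echo "$joined"   # prints: a.log|b.log
```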

Enclosing that pipeline in backquotes makes the shell execute it in place (command substitution) and pass its output as the pattern to the `grep -v -E` command.

Ouss
  • 158
  • 7