
I want to use a cron job that cleans and sorts maillog once every three days.

My job looks like this:

 /bin/sed -i /status=/!d /var/log/maillog | 
    (/bin/grep "status=bounced" /var/log/maillog | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" | /bin/sort -u >> /root/unsent.log) | 
    (/bin/grep "status=deferred" /var/log/maillog | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" | /bin/sort -u >> /root/deferred.log) | 
    (/bin/grep "status=sent" /var/log/maillog | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" | /bin/sort -u >> /root/sent.log) | 
/bin/sed -i "/status=/d" /var/log/maillog

The job works and performs three steps:

  1. Delete from maillog all lines that don't contain "status="
  2. Sort sent, bounced, and deferred entries into separate logs.
  3. Delete from maillog all lines that contain "status="

After this job runs, my maillog is fully cleaned and sorted into 3 logs.

But afterwards Postfix no longer writes new records to maillog.

If I remove the sed commands, Postfix writes new records fine.

Why does the sed command block maillog after the cron job runs?

Jason
  • I don't get why you have such a long pipeline. You're processing '/var/log/maillog' in each section connected from the first, which, because it has `-i`, will produce no output, right? So why all the connecting pipes? I would think each of these `(..)` sections can stand on its own; they don't need the `( )` wrappers and don't need connecting pipes, because any output they produce is going into a file. AND your first step is redundant, because your middle 3 steps trap individual `status=STR` sets. But I'm just getting up and a little blurry, so maybe I'm missing something. – shellter Apr 05 '12 at 14:15
  • As I understand it, it would be smarter to make 3 jobs instead of one long one. – Jason Apr 06 '12 at 04:10
  • I may be missing a key element (reason) in your design, but I can't see any good reason to have 1 long one, especially as you're creating a bunch of extra sub-processes that serve no purpose. Having all the extra sub-processes doesn't really matter, except that you're teaching yourself bad habits. If you get to the point that you're trying to service really large systems, and time is of the essence, then you're 'spending' system resources (the extra sub-processes) for no good reason. Best to really understand what the minimal solution is and then add features because they are really needed. – shellter Apr 06 '12 at 04:18

2 Answers


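sed -i does not really edit the file in place: it writes a new file and renames it over the original, which unlinks the old one. syslog/Postfix still holds the old file open, so it keeps writing to a file that no longer has a name in /var/log.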

From http://en.wikipedia.org/wiki/Sed:

Note: "sed -i" overwrites the original file with a new one, breaking any links the original may have had
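The effect described in that note can be reproduced on a scratch file. This is only a sketch: it assumes GNU sed (where -i takes no mandatory argument), the paths are throwaway temp files rather than the real /var/log/maillog, and file descriptor 3 merely stands in for a logging daemon that keeps the log open.

```shell
#!/bin/sh
# Scratch file standing in for /var/log/maillog (hypothetical path).
tmpdir=$(mktemp -d)
log="$tmpdir/demo.log"
printf 'status=sent\nsome other line\n' > "$log"

# Hold the file open for appending on fd 3, the way a logging daemon would.
exec 3>>"$log"

inode_before=$(ls -i "$log" | awk '{print $1}')
sed -i '/status=/d' "$log"   # GNU sed: writes a temp file, then renames it over $log
inode_after=$(ls -i "$log" | awk '{print $1}')

# This write lands in the old, now-unlinked inode, not in the new file.
echo 'written after sed -i' >&3
exec 3>&-

# Count how many of the late lines are visible under the original name.
late_lines=$(grep -c 'written after' "$log" || true)

echo "inode before: $inode_before"
echo "inode after:  $inode_after"
echo "late lines visible in the log: $late_lines"

rm -rf "$tmpdir"
```

The two inode numbers differ, and the line written through the already-open descriptor never shows up under the original filename, which matches what the question describes.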

It is more common to process log files after rotating them out of place with a tool like logrotate or savelog, so that syslog can continue writing uninterrupted.

If you must edit /var/log/maillog in place, you can add a line to the end of your cron job to reload syslog when you are done. Note that if you do this, you can lose log lines written to the file while your script is running. The command will depend on what distribution / operating system you are running. On Ubuntu, which uses rsyslog, it would be reload rsyslog >/dev/null 2>&1.

A B

I've reformatted your original code to highlight the pipe-lines you added

 /bin/sed -i /status=/!d /var/log/maillog \
 | (/bin/grep "status=bounced" /var/log/maillog \
     | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" \
     | /bin/sort -u >> /root/unsent.log\
   ) \
 | (/bin/grep "status=deferred" /var/log/maillog \
   | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" \
   | /bin/sort -u >> /root/deferred.log\
   ) \
 | (/bin/grep "status=sent" /var/log/maillog \
   | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" \
   | /bin/sort -u >> /root/sent.log \
   ) \
 | /bin/sed -i "/status=/d" /var/log/maillog

As @alberge noted, you could very likely lose log messages with all of this sed -i processing on the same file.

I propose a different approach:

I would move the maillog to a dated filename (the assumption here is that Postfix will create a new file with the standard name that it 'likes' to use, /var/log/maillog).

Then your real goal seems to be to extract various categories of messages to separately named files, i.e. unsent.log, deferred.log, sent.log AND then you're discarding any lines that don't contain the string status= (although you do that first).

Here's my alternative (please read the whole message, don't copy/paste/execute right away!).

 logDate=$(/bin/date +%Y%m%d.%H%M%S)
 /bin/mv /var/log/maillog /var/log/maillog.${logDate}

 /bin/grep "status=bounced" /var/log/maillog.${logDate} \
 | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" \
 | /bin/sort -u \
 >> /root/unsent.log.${logDate} 

 /bin/grep "status=deferred" /var/log/maillog.${logDate} \
 | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" \
 | /bin/sort -u \
 >> /root/deferred.log.${logDate}

 /bin/grep "status=sent" /var/log/maillog.${logDate} \
 | /bin/grep -E -o --color "\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b" \
 | /bin/sort -u \
 >> /root/sent.log.${logDate}

To test that this code is working, replace the 2nd line ( /bin/mv .... ) with

  /bin/cp /var/log/maillog /var/log/maillog.${logDate}

Copy/paste that into a terminal window, confirm that the /var/log/maillog.${logDate} was copied correctly, then copy/paste each section, 1 at a time and check that the expected output is created in each of the /root logfiles.
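Before touching the real log, the extraction pipeline itself can be tried on a throwaway file. The sketch below is only an illustration: the sample log lines are made up and simplified (real Postfix lines carry more fields), the extract function name is hypothetical, and the \b word boundary assumes GNU grep.

```shell
#!/bin/sh
# Throwaway sample log; the lines are invented for illustration only.
tmpdir=$(mktemp -d)
sample="$tmpdir/maillog.sample"
cat > "$sample" <<'EOF'
Apr  6 09:00:01 mail postfix/smtp[101]: A1: to=<alice@example.com>, status=sent (250 ok)
Apr  6 09:00:02 mail postfix/smtp[102]: A2: to=<bob@example.com>, status=bounced (user unknown)
Apr  6 09:00:03 mail postfix/smtp[103]: A3: to=<carol@example.com>, status=deferred (connection timed out)
Apr  6 09:00:04 mail postfix/smtp[104]: A4: to=<alice@example.com>, status=sent (250 ok)
Apr  6 09:00:05 mail postfix/qmgr[105]: A5: removed
EOF

# extract STATUS FILE: print the unique addresses on lines with status=STATUS.
# (Hypothetical helper; same grep | grep | sort pipeline as in the answer.)
extract() {
    grep "status=$1" "$2" \
    | grep -E -o '\b[a-zA-Z0-9.-]+@[a-zA-Z0-9.-]+\.[a-zA-Z0-9.-]+\b' \
    | sort -u
}

sent_out=$(extract sent "$sample")
bounced_out=$(extract bounced "$sample")
deferred_out=$(extract deferred "$sample")

echo "sent:     $sent_out"
echo "bounced:  $bounced_out"
echo "deferred: $deferred_out"

rm -rf "$tmpdir"
```

With this sample, each category yields exactly one address, and the duplicate "sent" entry for the same recipient is collapsed by sort -u.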

(If you get error messages for any of these blocks, make sure there are NO space/tab chars after the last '\' char on each of the continued lines. OR you can fold each of those 3 pipelines back into one line, removing the '\' chars as you go.)

(Note that to create each of the /root logfiles, I don't use any connecting sections via pipes surrounded by sub-processes. But, in other situations, I do use this sort of technique for advanced problems, so don't throw the technique away, just use it when it is really required ;-)!

After you confirm that all of this is working as you needed, then you extend the script to do a final cleaning up :

/bin/rm  /var/log/maillog.${logDate}

I've added ${logDate} to each of your output files, but as I see you're using sort -u >> you may want to remove that 'extension' from your sub-logfile names (unsent.log, deferred.log, sent.log) and just let those files grow naturally. In either case, you'll have to come back at some point and determine how far back you want to keep this data, and develop a plan and method for how you'll clean up these logfiles when they're no longer useful. I think someone mentioned the logrotate package. You might want to look into that as your long-term solution.


This solution avoids a lot of extra processes being created, and it eliminates (mostly) the possibility of lost log records. I think you might lose all or part of a record if Postfix is writing to the logfile in the same split-second as you are moving the file. But your solution would have similar problems AND more opportunities for that to happen.
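One caveat about the move step: a rename keeps the same inode, so a process that already has the log open will keep writing to the renamed file until it is told to reopen its log. A minimal sketch with scratch files (the paths are throwaway, and fd 3 only stands in for the logging daemon):

```shell
#!/bin/sh
# Scratch files standing in for maillog and its dated copy (hypothetical paths).
tmpdir=$(mktemp -d)
log="$tmpdir/demo.log"
rotated="$tmpdir/demo.log.rotated"
: > "$log"

# Hold the log open for appending on fd 3, the way a logging daemon would.
exec 3>>"$log"

mv "$log" "$rotated"

# The writer still holds the same inode, so this line ends up in the renamed file.
echo 'written after mv' >&3
exec 3>&-

lines_in_rotated=$(grep -c 'written after mv' "$rotated" || true)
if [ -e "$log" ]; then new_log_exists=yes; else new_log_exists=no; fi

echo "late lines in rotated file: $lines_in_rotated"
echo "new log created by itself:  $new_log_exists"

rm -rf "$tmpdir"
```

So after the mv, the logging daemon probably needs a reload (the exact command depends on the system) before the dated file stops changing and a fresh /var/log/maillog appears.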

If I have misunderstood the intention of your design, using the nested ( .... ) | ( .... ) sub-processes, sorry! Consider updating your post to explain why you are using that technique.

I hope this helps.

shellter
  • I saw that Postfix doesn't mind when I do mv /var/log/maillog /var/log/maillog.20120406.090745. It doesn't create another maillog file, but simply continues writing the log to maillog.20120406.090745. I think I'll go with your scenario of separate jobs and rotating maillog from time to time. Thanks a lot, Shellter. – Jason Apr 06 '12 at 05:19
  • ( .... ) | ( .... ) - I was sure that it would let me put all three steps into one process with sub-processes and reduce the time of this operation. But now I see that I may have been wrong. – Jason Apr 06 '12 at 05:35