I am trying to extract conversations from a Postfix log file based on the client that initiated them. This is the awk script that extracts the matching message IDs:

awk '/client.host.name/ && !(/timeout/||/disconnect/) { sub(":","",$6);print $6}' maillog

This is using a standard Postfix maillog as input (see below for sample data). What I think I'd like to do is a multi-pass search of the file using the results of the first search, but I'm not sure if this is the right approach. Something similar to:

awk '/client.host.name/ && !(/timeout/||/disconnect/) {sub(":","",$6);msgid=$6} $0 ~ msgid {print $0}' maillog

But, naturally, this doesn't work as expected. I'm assuming I need to do one of the following things:

  1. Pipe the output from the first awk into a second awk or grep (not sure how to use piped input as a regex).
  2. Assign the first result set to an array and use the array as a search set. Something like:
    awk '/app02/ && !(/timeout/ || /connect/) { sub(":","",$6);msgid[$6]=$6; } END { for(x in msgid) { print x; } }' maillog
    I'm not sure how I'd proceed inside the for loop though. Is there a way in awk to "rewind" the file and then grab all lines that match any element within an array?
  3. Scrap the whole deal and try it using Perl.
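A minimal sketch of option 1, assuming a grep that supports -F (fixed strings) and -f (read patterns from a file). The file names sample_maillog and matched_lines and the second queue ID 1A2B3C4D5E6 are made up for illustration; the IDs go through a temporary file rather than a regex:

```shell
# Small sample log; the 1A2B3C4D5E6 message is a made-up entry from another client.
cat > sample_maillog <<'EOF'
Jul 19 05:07:57 relay postfix/smtpd[5462]: C48F6CE83FA: client=client.dom.lcl[1.2.3.4]
Jul 19 05:07:57 relay postfix/qmgr[12345]: C48F6CE83FA: from=<root@dom.lcl>, size=69261, nrcpt=6 (queue active)
Jul 19 05:08:01 relay postfix/smtpd[5463]: 1A2B3C4D5E6: client=other.dom.lcl[1.2.3.5]
Jul 19 05:08:02 relay postfix/qmgr[12345]: 1A2B3C4D5E6: from=<user@dom.lcl>, size=1024, nrcpt=1 (queue active)
Jul 19 05:14:08 relay postfix/qmgr[12345]: C48F6CE83FA: removed
EOF

# Step 1: write the client's queue IDs to a temporary file, one per line.
ids=$(mktemp)
awk '/client=client\.dom\.lcl\[/ && !/timeout|disconnect/ {
         id = $6; sub(/:$/, "", id); print id
     }' sample_maillog | sort -u > "$ids"

# Step 2: print every log line that contains any of those IDs as a fixed string.
grep -F -f "$ids" sample_maillog > matched_lines
cat matched_lines
rm -f "$ids"
```

Note that grep -F matches an ID anywhere on the line, not only in the sixth field.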

So, for the awk gurus... is there any way to accomplish what I'm looking for using awk?

Sample data:

Jul 19 05:07:57 relay postfix/smtpd[5462]: C48F6CE83FA: client=client.dom.lcl[1.2.3.4]
Jul 19 05:07:57 relay postfix/cleanup[54]: C48F6CE83FA: message-id=<20100719100757.C48F6CE83FA@relay.dom.lcl>
Jul 19 05:07:57 relay postfix/qmgr[12345]: C48F6CE83FA: from=<root@dom.lcl>, size=69261, nrcpt=6 (queue active)
Jul 19 05:08:04 relay postfix/smtp[54205]: C48F6CE83FA: to=<recip1@example.org>, relay=in.example.org[12.23.34.5]:25, delay=0.7, delays=0.05/0/0.13/0.51, dsn=2.0.0, status=sent (250 ok: Message 200012345 accepted)
Jul 19 05:14:08 relay postfix/qmgr[12345]: C48F6CE83FA: removed
Justin ᚅᚔᚈᚄᚒᚔ
  • I'm unclear on your goal: To find the IDs from postfix in the first query then pull the entire conversation with the second? If you are looking for the Msgid, you could output that in your original awk script. – AlG Jul 20 '10 at 14:39
  • Yes, I'm basically trying to pull all of the conversations for a certain client from the logs. So, find the msgid for all messages from the client (postfix/smtpd[123]: 123456789AB: client=client.host.name[10.20.30.40]), then use that ID to pull the entire conversation (i.e. grep '123456789AB' maillog). I hope that makes sense, and I edited the question to be more clear. – Justin ᚅᚔᚈᚄᚒᚔ Jul 20 '10 at 14:45
  • Could you please put up an example of what the output looks like, meaning the postfix mail log, since I have no idea. – Anders Jul 20 '10 at 14:49
  • @Anders, sample data added to the main question. Hope that helps. – Justin ᚅᚔᚈᚄᚒᚔ Jul 20 '10 at 15:25

2 Answers

You can use an array. Something roughly like this:

awk '/client.host.name/ && !(/timeout/||/disconnect/) {sub(":","",$6);msgid[$6]=1} {if ($FIELD in msgid) print}' maillog

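Replace $FIELD with the field that holds the message ID, since I don't know which one it is. As written, awk treats FIELD as an uninitialized variable, so $FIELD evaluates to $0 (the whole line).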

Edit: Moved a left brace.

Edit2:

Here's a version specific to your sample data:

awk '/client.dom.lcl/ && !(/timeout/||/disconnect/) {sub(":","",$6); msgid[$6] = 1} {if (gensub(":", "", 1, $6) in msgid) print}' sampledata

Edit3:

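Here's a simplified version (gensub() requires gawk). Because it never modifies $6, matching lines are printed exactly as they appear in the log: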

awk '{id = gensub(":", "", 1, $6)} /client.dom.lcl/ && !(/timeout/||/disconnect/) {msgid[id] = 1} {if (id in msgid) print}' sampledata
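The gawk-only gensub() call can be avoided, and the "rewind" from the question emulated, by naming the log file twice on the command line. This is a minimal sketch, assuming a POSIX awk; the helper name conversations_for, the file name sample_maillog, and the second queue ID 1A2B3C4D5E6 are made up for illustration:

```shell
# Small sample log; the 1A2B3C4D5E6 message is a made-up entry from another client.
cat > sample_maillog <<'EOF'
Jul 19 05:07:57 relay postfix/smtpd[5462]: C48F6CE83FA: client=client.dom.lcl[1.2.3.4]
Jul 19 05:07:57 relay postfix/qmgr[12345]: C48F6CE83FA: from=<root@dom.lcl>, size=69261, nrcpt=6 (queue active)
Jul 19 05:08:01 relay postfix/smtpd[5463]: 1A2B3C4D5E6: client=other.dom.lcl[1.2.3.5]
Jul 19 05:08:02 relay postfix/qmgr[12345]: 1A2B3C4D5E6: from=<user@dom.lcl>, size=1024, nrcpt=1 (queue active)
Jul 19 05:14:08 relay postfix/qmgr[12345]: C48F6CE83FA: removed
EOF

# conversations_for CLIENT LOGFILE (hypothetical helper)
# Reads LOGFILE twice: the first pass collects queue IDs, the second prints lines.
conversations_for() {
    awk -v client="$1" '
        { id = $6; sub(/:$/, "", id) }   # copy of the queue ID, so $0 stays untouched
        NR == FNR {                      # true only while reading the first copy
            # index() does a literal match, so dots in the host name are not wildcards
            if (index($0, "client=" client "[") && !/timeout|disconnect/)
                msgid[id] = 1
            next
        }
        id in msgid                      # second copy: print lines for collected IDs
    ' "$2" "$2"
}

conversations_for client.dom.lcl sample_maillog
```

Since all IDs are known before the second pass starts, this does not depend on the client= line being the first line logged for a message.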
Dennis Williamson

You asked for awk, but I have a Perl script that is a little more robust: https://github.com/brablc/postfix-tools/blob/master/pflogrep

You can use it like grep:

pflogrep infractor@example.com /var/log/maillog

Or you can pipe the output to pflogsumm to get summary statistics:

pflogrep infractor@example.com /var/log/maillog | pflogsumm
brablc