I have an Internet-facing Postfix mail filter (Debian Lenny) that sits in front of all the other mail servers on our network and scans our mail using amavisd-new, ClamAV, SpamAssassin, and policyd-weight.
This server was set up and configured following the guide at http://www200.pair.com/mecham/spam/spamfilter20090215.html (I also set up the Bayesian and AWL databases in MySQL, and installed policyd-weight as described on the same site).
I have two of these servers. They ran well for a couple of years on Debian Etch, but this latest install locks up about once a day, at varying times, and I can't figure out why.
Details of the problem:
- Mail queues up on the server, and running mailq lists many messages with the status (delivery temporarily suspended: conversation with 127.0.0.1[127.0.0.1] timed out while receiving the initial server greeting).
- Running amavisd-nanny freezes, and I have to log out of the SSH session. On a working system, amavisd-nanny shows me the state of each amavisd process and occasionally finds and terminates stuck processes (what causes these stuck processes?). I have set up a cron job to run amavisd-nanny hourly to clear them, but even that isn't enough to keep things running.
- ps -ef | grep amavisd lists all 12 of my amavisd processes with (ch#-accept) after them. On a working system these say either (virgin child) or (ch#-avail).
- Memory, disk space, and the number of Postfix processes do not appear to be the problem.
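To catch the lockup as it develops, the two symptoms above (every child sitting in the accept state, and no greeting on the loopback connection) could be checked periodically from a script. The following is only a minimal sketch in Python 3. The function names are made up for illustration, the regular expression assumes the process titles look exactly like the ones quoted above, and the port to probe is an assumption (amavisd-new conventionally listens on 10024, but that is not confirmed for this setup).

```python
import re
import socket
from collections import Counter

# Matches the state shown in the amavisd process title, e.g. "(ch3-accept)",
# "(ch12-avail)" or "(virgin child)". Assumes "#" in "ch#" is a number.
STATE_RE = re.compile(r"amavisd.*\((virgin child|ch\d+-[\w-]+)\)")


def tally_amavisd_states(ps_output):
    """Count amavisd children per state from the text output of `ps -ef`."""
    counts = Counter()
    for line in ps_output.splitlines():
        match = STATE_RE.search(line)
        if not match:
            continue  # not an amavisd child line (e.g. the grep itself)
        state = match.group(1)
        if state.startswith("ch"):
            # Collapse "ch3-accept" to "accept" so children are grouped.
            state = state.split("-", 1)[1]
        counts[state] += 1
    return counts


def probe_greeting(host, port, timeout=10.0):
    """Return the first data the server sends, or None if it stays silent.

    None covers both a refused connection and a timeout while waiting
    for the initial greeting, which is the symptom Postfix reports.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            data = sock.recv(512)
    except OSError:  # includes socket.timeout and ConnectionRefusedError
        return None
    return data.decode("ascii", "replace").strip() or None
```

Run from cron every minute or so, with the `ps -ef` text passed to `tally_amavisd_states` and `probe_greeting("127.0.0.1", 10024)` called alongside it, logging both results with a timestamp would show whether the children drift into the accept state gradually or all at once, and what else was in the mail log at that moment.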
What should I do to diagnose this further? I am not looking for a workaround; I want to determine what is going wrong and fix it.