
Long ago I set up site-wide Bayes filtering, following the instructions at http://wiki.apache.org/spamassassin/SiteWideBayesSetup.

I don’t think it ever worked: my spam scores are always negative, and the BAYES_00 result suggests that Bayes wasn’t used at all.

Here is what I have in my local.cf file:

bayes_path /etc/mail/spamassassin/bayes/bayes
bayes_file_mode 0777

When I run sa-learn I find instead that the tokens are stored in individual home directories.

What is the correct method to get this working?

Supplementary Question: if I can get this working, can I combine the various bayes_tok and other files?

Manngo

1 Answer


If you get BAYES_00 results, then Bayes is indeed working: it has classified the email as ham. A neutral result would be BAYES_50. You just need to train the Bayes database properly.

If sa-learn creates or updates Bayes files under your home directory, then either it is not reading the desired local.cf file, or the bayes_path setting is being overridden by a user-specific configuration file (e.g. /root/.spamassassin/user_prefs).

You could try one of the following:

  • run sa-learn under the same user account that SpamAssassin runs as
  • pass an explicit database path to sa-learn, i.e.

    sa-learn --dbpath /etc/mail/spamassassin/bayes/bayes
    
  • use the -D option to see what is really going on, i.e. which configuration files are being read, etc.
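The last two suggestions could be combined into a quick check, sketched below. It assumes the database path from the question's local.cf. The grep pattern is only a guess at which debug lines are useful, since the exact wording of the debug output may differ between SpamAssassin versions.

```shell
# Sketch only: DBPATH is the bayes_path value from the question's local.cf.
DBPATH="/etc/mail/spamassassin/bayes/bayes"

if command -v sa-learn >/dev/null 2>&1; then
    # Debug output goes to stderr; keep the lines that mention
    # configuration files being read and Bayes database files being opened.
    sa-learn -D --dump magic 2>&1 | grep -E 'config: |bayes: '

    # Same summary dump, but forcing the site-wide database location.
    sa-learn --dbpath "$DBPATH" --dump magic
else
    echo "sa-learn not found in PATH" >&2
fi
```

If the first command shows files under a home directory while the second reports different token counts, sa-learn is most likely not picking up the site-wide bayes_path on its own.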

If/when you get it working, you generally cannot combine the various database files. There are at least a bayes_toks and a bayes_seen file: the first contains the learned tokens, and the second holds the Message-IDs of trained emails together with their training status (spam/ham). There may also be a bayes_journal file if you use deferred syncing.

Further details available in the manpage for sa-learn:
https://spamassassin.apache.org/full/3.4.x/doc/sa-learn.html

krisku