I would like to build yet another spam filter for my CMS. Currently I see three options:
- use a simple PHP class and store tokens in MySQL
- install SpamAssassin and use a PHP connector
- something big like Mahout
I do not like the MySQL approach, because I fear the token store will grow very large over time and degrade the performance of the whole system. The SpamAssassin approach seems more attractive, but many people online write that SA's rules are focused on emails and their headers, so it is not an ideal fit here. Last but not least, I am aware of Mahout, but I fear it might be too big and create a lot of administration overhead.
Is there something nice, small and efficient that can run on a Linux server and be accessed from PHP?