I was wondering if there is any good and clean object-oriented programming (OOP) implementation of Bayesian filtering for spam and text classification? This is just for learning purposes.
6 Answers
I definitely recommend Weka which is an Open Source Data Mining Software written in Java:
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
As mentioned above, it ships with a bunch of different classifiers like SVM, Winnow, C4.5, Naive Bayes (of course) and many more (see the API doc). Note that a lot of classifiers are known to have much better perfomance than Naive Bayes in the field of spam detection or text classification.
Furthermore Weka brings you a very powerful GUI…

- 12,406
- 8
- 49
- 61
Maybe https://ci-bayes.dev.java.net/ or http://www.cs.cmu.edu/~javabayes/Home/node2.html?
I never played with it either.

- 36,908
- 70
- 97
- 130

- 7,042
- 7
- 44
- 67
Here is an implementation of Bayesian filtering in C#: A Naive Bayesian Spam Filter for C# (hosted on CodeProject).

- 40,752
- 27
- 129
- 174
In French, but you should be able to find the download link :) PHP Naive Bayesian Filter

- 35,564
- 14
- 82
- 119