LingPipe's EM tutorial says that the algorithm can be run with no supervised data:
It is possible to train a classifier in a completely unsupervised fashion by having the initial classifier assign categories at random. Only the number of categories must be fixed. The algorithm is exactly the same, and the result after convergence or the maximum number of epochs is a classifier.
But their class, TradNaiveBayesClassifier, requires both a labeled and an unlabeled corpus to run. How can I modify it to run with no labeled data?
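
To make clear what I mean by starting EM from random assignments, here is a minimal sketch that does not depend on LingPipe. The class name UnsupervisedNaiveBayesEm and the method fit are hypothetical and not part of LingPipe's API. It assumes documents are already tokenized, uses a multinomial naive Bayes model with add-one smoothing, and runs for a fixed number of epochs rather than testing for convergence.

```java
import java.util.*;

// Hypothetical, LingPipe-independent sketch of unsupervised naive Bayes EM.
final class UnsupervisedNaiveBayesEm {

    /** Runs EM and returns resp[d][k], the posterior of category k for document d. */
    static double[][] fit(List<String[]> docs, int numCategories, int epochs, long seed) {
        // Map each distinct token to a column index.
        Map<String, Integer> vocab = new HashMap<>();
        for (String[] doc : docs)
            for (String tok : doc)
                vocab.putIfAbsent(tok, vocab.size());
        int numDocs = docs.size();
        int numWords = vocab.size();

        // Random initialization: each document is assigned one category at random.
        Random random = new Random(seed);
        double[][] resp = new double[numDocs][numCategories];
        for (int d = 0; d < numDocs; d++)
            resp[d][random.nextInt(numCategories)] = 1.0;

        for (int epoch = 0; epoch < epochs; epoch++) {
            // M-step: re-estimate smoothed priors and word probabilities
            // from the current (soft) category assignments.
            double[] logPrior = new double[numCategories];
            double[][] logWordProb = new double[numCategories][numWords];
            for (int k = 0; k < numCategories; k++) {
                double catMass = 1.0;                 // add-one smoothing on the prior
                double[] wordMass = new double[numWords];
                Arrays.fill(wordMass, 1.0);           // add-one smoothing on words
                double total = numWords;
                for (int d = 0; d < numDocs; d++) {
                    catMass += resp[d][k];
                    for (String tok : docs.get(d)) {
                        wordMass[vocab.get(tok)] += resp[d][k];
                        total += resp[d][k];
                    }
                }
                logPrior[k] = Math.log(catMass / (numDocs + numCategories));
                for (int w = 0; w < numWords; w++)
                    logWordProb[k][w] = Math.log(wordMass[w] / total);
            }

            // E-step: recompute each document's posterior over categories.
            for (int d = 0; d < numDocs; d++) {
                double[] logJoint = new double[numCategories];
                double max = Double.NEGATIVE_INFINITY;
                for (int k = 0; k < numCategories; k++) {
                    logJoint[k] = logPrior[k];
                    for (String tok : docs.get(d))
                        logJoint[k] += logWordProb[k][vocab.get(tok)];
                    max = Math.max(max, logJoint[k]);
                }
                double sum = 0.0;
                for (int k = 0; k < numCategories; k++) {
                    resp[d][k] = Math.exp(logJoint[k] - max); // subtract max for stability
                    sum += resp[d][k];
                }
                for (int k = 0; k < numCategories; k++)
                    resp[d][k] /= sum;
            }
        }
        return resp;
    }
}
```

Only the number of categories is fixed in advance here, as in the quoted passage. What I am looking for is the equivalent of this using TradNaiveBayesClassifier itself.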