
I'm working with the Mallet library for a project in Java.

I have 15,000 documents with 400 tokens each. I tried ParallelTopicModel, but I would like the topics to contain both single tokens and sequences of tokens (e.g. "Java" as well as "Java Developer").
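To make the goal concrete, here is a minimal plain-Java sketch of one possible workaround: merging frequent adjacent token pairs into a single token before topic modeling. This is a simple preprocessing heuristic, not LDA-HMM and not part of Mallet's API. The class name `PhraseMerger`, the underscore separator, and the `minCount` threshold are all hypothetical choices made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Mallet class.
final class PhraseMerger {

    // Count how often each adjacent token pair occurs across all documents.
    static Map<String, Integer> countBigrams(List<List<String>> docs) {
        Map<String, Integer> counts = new HashMap<>();
        for (List<String> doc : docs) {
            for (int i = 0; i + 1 < doc.size(); i++) {
                counts.merge(doc.get(i) + "_" + doc.get(i + 1), 1, Integer::sum);
            }
        }
        return counts;
    }

    // Greedy left-to-right pass: a pair seen at least minCount times
    // becomes one token ("java_developer"); other tokens are kept as-is.
    static List<String> mergeFrequent(List<String> doc,
                                      Map<String, Integer> counts,
                                      int minCount) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < doc.size()) {
            if (i + 1 < doc.size()) {
                String pair = doc.get(i) + "_" + doc.get(i + 1);
                if (counts.getOrDefault(pair, 0) >= minCount) {
                    out.add(pair);
                    i += 2; // skip both tokens of the merged pair
                    continue;
                }
            }
            out.add(doc.get(i));
            i++;
        }
        return out;
    }
}
```

After this pass the vocabulary would hold both "java" (where it occurs alone) and "java_developer" (where the pair is frequent), so a bag-of-words topic model could treat the phrase as one token, assuming the tokenizer used afterwards keeps the underscore intact.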

I am considering LDA-HMM. Which Mallet class can I use for that?

Next, I want to turn each topic into a node of a Bayesian network that receives a token or a sequence of tokens as evidence and performs inference. Which Java library can I use for that?
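The kind of inference meant here can be sketched in its smallest form as a two-node network, topic → token, where observing a token updates the belief over topics by Bayes' rule: P(topic | token) ∝ P(topic) · P(token | topic). The code below is plain Java with no Bayesian network library. The class name `TopicPosterior` and the probability tables passed to it are hypothetical; a real network would have more nodes and would normally rely on a dedicated library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: exact inference in a two-node network topic -> token.
final class TopicPosterior {

    // prior:      P(topic)
    // likelihood: P(token | topic), keyed by topic, then by token
    // evidence:   the observed token or merged token sequence
    static Map<String, Double> posterior(Map<String, Double> prior,
                                         Map<String, Map<String, Double>> likelihood,
                                         String evidence) {
        Map<String, Double> post = new LinkedHashMap<>();
        double total = 0.0;
        for (Map.Entry<String, Double> e : prior.entrySet()) {
            Map<String, Double> tokenProbs = likelihood.get(e.getKey());
            // A token missing from a topic's table is treated as probability 0.
            double pTokenGivenTopic =
                (tokenProbs == null) ? 0.0 : tokenProbs.getOrDefault(evidence, 0.0);
            double joint = e.getValue() * pTokenGivenTopic;
            post.put(e.getKey(), joint);
            total += joint;
        }
        if (total == 0.0) {
            return post; // evidence impossible under every topic: all zeros
        }
        // Normalize the joint probabilities so they sum to 1.
        for (Map.Entry<String, Double> e : post.entrySet()) {
            e.setValue(e.getValue() / total);
        }
        return post;
    }
}
```

In this sketch a token sequence such as "Java Developer" is handled as a single evidence value, which assumes it has already been merged into one vocabulary entry.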

Thanks in advance. Francesco

erisco
