I'm using the MALLET topic-modeling tool and am having difficulty getting stable results (the topics I get do not seem very logical).
I worked through your tutorial and this one: https://programminghistorian.org/en/lessons/topic-modeling-and-mallet#getting-your-own-texts-into-mallet and I have some questions:
- Are there any best practices for getting the model to work well, apart from using the optimize option (and what is a good value for it)? What is a good value for the number of iterations?
- I import my data with the import-dir command, pointing it at the directory that contains my files. Does it matter whether each file contains text with line breaks or just one very long line?
- I read about the hLDA model. When I tried to run it, the only output I got was state.txt, which is not very clear. I expected output like that of the regular topic model (topic_keys.txt, doc_topics.txt). How can I get those files?
- When should I use hLDA rather than the regular topic model?
Thanks a lot for your help!