Assign more weight to certain documents within the corpus - LDA - Gensim

Asked Dec 19 '18 at 13:15

Active Dec 19 '18 at 13:15

Viewed 203 times

I am using LDA for topic modelling, but unfortunately my data is heavily skewed. I have documents from 10 different categories and would like each category to contribute equally to the LDA topics.

However, the number of documents varies widely between categories (one category, for example, holds more than 50% of all documents, while several categories hold only 1-2% each).

What would be the best approach to assign weights to these categories so that they contribute equally to my topics? If I run LDA without doing so, my topics will be based largely on the category that holds over 50% of the documents in the corpus. I am exploring up-sampling but would prefer a solution that assigns weights directly in LDA.
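For reference, here is a minimal sketch of the up-sampling option I am exploring, assuming the documents are already tokenized and each has a category label. The function name `balance_by_category` and the toy data are made up for illustration; this is not a built-in Gensim feature, just a preprocessing step done before building the corpus.

```python
import random
from collections import defaultdict


def balance_by_category(docs, labels, seed=0):
    """Up-sample so every category has as many documents as the largest one.

    docs   -- list of tokenized documents (lists of strings)
    labels -- list of category labels, same length as docs
    """
    rng = random.Random(seed)

    # Group documents by their category label.
    by_cat = defaultdict(list)
    for doc, label in zip(docs, labels):
        by_cat[label].append(doc)

    # Every category is grown to the size of the largest category.
    target = max(len(group) for group in by_cat.values())

    balanced = []
    for group in by_cat.values():
        balanced.extend(group)  # keep every original document once
        # Fill the shortfall by sampling with replacement from the same category.
        balanced.extend(rng.choices(group, k=target - len(group)))

    rng.shuffle(balanced)  # avoid feeding the model one category at a time
    return balanced


# Toy example (hypothetical data): one dominant category, one rare category.
docs = [["a", "b"], ["a", "c"], ["a", "d"], ["x", "y"]]
labels = ["big", "big", "big", "small"]
balanced_docs = balance_by_category(docs, labels)
```

The balanced token lists could then go through `gensim.corpora.Dictionary` and `doc2bow` into `LdaModel` as usual. The obvious drawback is that rare categories are represented by repeated copies of the same few documents, which is why I would prefer real weighting.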

asked Dec 19 '18 at 13:15

Mia

Did you find any workaround to assign different weights to the documents? – Yahia Jul 31 '20 at 19:43

0 Answers