I ran LDA Mallet topic modeling on the texts of 37,000 articles.
Each article was categorized and its dominant topic was determined.
Now I want to create a DataFrame in Python that shows, for each article, the percentage contribution of every topic.
I want the data frame to look like this:
no | Text | Topic_Num_1 | Topic_Num_2 | .... | Topic_Num_25
01 | article text1 | 0.7529 | 0.0034 | .... | 0.0011
02 | article text2 | 0.3529 | 0.0124 | .... | 0.0001
.... (37,000 rows x 27 columns)
How would I do this?
Addendum:
All of the code I have written so far is based on the following tutorial:
http://machinelearningplus.com/nlp/topic-modeling-gensim-python
How can I see the full list of topic probabilities for every single article?
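
One possible way to get there, sketched under some assumptions: the model can hand back, for each article, a list of (topic_id, probability) pairs, and those pairs can be spread into one column per topic. The function name `topic_distribution_frame` and the toy data below are hypothetical and only illustrate the reshaping step; they are not taken from the tutorial.

```python
import pandas as pd


def topic_distribution_frame(texts, doc_topics, num_topics):
    """Build one row per article with one probability column per topic.

    texts      -- list of article strings
    doc_topics -- iterable yielding, per article, a list of
                  (topic_id, probability) pairs (0-based topic ids)
    num_topics -- total number of topics in the model (25 in the question)
    """
    rows = []
    for pairs in doc_topics:
        row = [0.0] * num_topics      # topics missing from `pairs` stay at 0
        for topic_id, prob in pairs:
            row[topic_id] = prob
        rows.append(row)

    columns = [f"Topic_Num_{i + 1}" for i in range(num_topics)]
    df = pd.DataFrame(rows, columns=columns).round(4)
    df.insert(0, "Text", list(texts))
    df.insert(0, "no", list(range(1, len(df) + 1)))
    return df


# Toy stand-in for the model output (an assumption): 2 articles, 3 topics.
sample_texts = ["article text1", "article text2"]
sample_doc_topics = [
    [(0, 0.75), (1, 0.20), (2, 0.05)],
    [(0, 0.10), (2, 0.90)],           # topic 1 was filtered out by the model
]
sample_df = topic_distribution_frame(sample_texts, sample_doc_topics, num_topics=3)
print(sample_df)
```

With the real model, `doc_topics` would come from the model itself. Assuming variable names like those in the tutorial (`ldamallet`, `lda_model`, `corpus`), iterating over `ldamallet[corpus]` should yield the per-article pairs for the Mallet wrapper (available in gensim versions before 4.0). For a plain gensim `LdaModel`, a generator such as `(lda_model.get_document_topics(bow, minimum_probability=0.0) for bow in corpus)` should return every topic rather than only the ones above the default threshold.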