I am using Gensim's Mallet wrapper for topic modeling:
LdaMallet(path_to_mallet_binary, corpus=corpus, num_topics=100, id2word=words, workers=6, random_seed=2)
While training the model was surprisingly fast, the step below, which obtains the topic distribution for each document (n=40,000), is taking a very long time.
from tqdm import tqdm

# Store the topic distribution for all documents
all_topics = []
for x in tqdm(range(0, len(doc_list))):
    all_topics.append(lda_model[corpus[x]])
It has taken ~18 hours to complete 30,000 documents. Not sure what I am doing incorrectly. Is there a way to get topic distribution for all documents much faster?
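One thing that may be worth trying, as a minimal sketch rather than a confirmed fix: the loop above indexes the model once per document, and, as far as I understand the wrapper, each such call starts a separate Mallet inference run. Indexing the model with the whole corpus should let the wrapper handle all documents in a single run. The helper name `infer_all_topics` is hypothetical, and the sketch assumes `lda_model` accepts a full bag-of-words corpus when indexed, as Gensim models generally do.

```python
def infer_all_topics(lda_model, corpus):
    """Return one topic distribution per document from a single model call.

    Assumption: `lda_model[corpus]` accepts a whole bag-of-words corpus and
    yields one result per document, instead of being called per document.
    """
    # One indexing operation for the entire corpus, not one per document.
    return list(lda_model[corpus])


# Usage with the objects from the question (not executed here):
# all_topics = infer_all_topics(lda_model, corpus)
```

If the documents are the same ones the model was trained on, the wrapper's `load_document_topics()` method may also avoid re-running inference, since it reads the document-topic file that Mallet wrote during training. Whether it is available depends on the Gensim version in use.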