I want to try document classification based on LDA topics, similar to what is described here: https://www.quora.com/How-do-I-use-LDA-Latent-Dirichlet-Allocation-for-document-classification-preferably-with-solutions-that-can-be-implemented-in-R
I think I will have to merge my raw data with the doc_topic_distr table, using the doc_id as the unique identifier, but I don't know how.
Edit: Does the doc_id persist, or does it become obsolete after the Corpus creation / data frame transformation?
I've tried the following R code, but I don't know how to add the doc_id to it.
test <- doc_topic_distr
Any clues?
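To make the intended merge concrete, here is a minimal sketch in base R. It is not tied to any particular LDA package: `raw_data`, its `label` column, and the toy values in `doc_topic_distr` are made up for illustration. It assumes that `doc_topic_distr` is a matrix with one row per document and that its row names carry the document ids; whether that holds for a real model output would need to be checked with `rownames(doc_topic_distr)`.

```r
# Hypothetical raw data; in practice this would be your own data frame.
raw_data <- data.frame(
  doc_id = c("d1", "d2", "d3"),
  label  = c("a", "b", "a"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the LDA output: one row per document, one column per topic.
# Assumption: the row names hold the document ids.
doc_topic_distr <- matrix(
  c(0.7, 0.2, 0.1,
    0.1, 0.8, 0.1,
    0.3, 0.3, 0.4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), paste0("topic_", 1:3))
)

# Convert the matrix to a data frame and turn the row names into a doc_id column.
test <- as.data.frame(doc_topic_distr)
test$doc_id <- rownames(doc_topic_distr)

# Join the raw data and the topic proportions on the shared identifier.
merged <- merge(raw_data, test, by = "doc_id")
print(merged)
```

If the matrix turns out to have no row names but its rows are still in the same order as the raw data, assigning `test$doc_id <- raw_data$doc_id` before the merge might work instead, though that relies on the order being preserved.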