Join doc_topic_distr with DTM raw data using doc_id

Question

I think I need to merge my raw data with the doc_topic_distr table, using doc_id as the unique identifier, but I don't know how.

Edit: Will the doc_id persist, or does it become obsolete after the corpus creation / data frame transformation?

I've tried the following R code, but I don't know how to add the doc_id to it.

test <- doc_topic_distr

Any clues?

score 0 · Answer 1 · answered Aug 30 '19 at 12:11


Solved it like this:

newDF <- merge(x = df_old, y = df_additions, by = "doc_id", all = TRUE)

Here df_old is the raw data and df_additions is doc_topic_distr converted to a data frame. Both need a doc_id column for the merge to work.
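The merge only works if the doc-topic table carries a doc_id column, which the one-liner above takes for granted. Here is a minimal sketch of that conversion step, assuming doc_topic_distr is a numeric matrix with one row per document and the document IDs stored as row names. The toy data, the text column, and the topic_1 / topic_2 column names are made up for illustration; check rownames(doc_topic_distr) on your own object first.

```r
# Toy stand-ins (hypothetical data); in practice these come from your own pipeline.
df_old <- data.frame(
  doc_id = c("d1", "d2", "d3"),
  text   = c("first text", "second text", "third text"),
  stringsAsFactors = FALSE
)

# Assumed shape: one row per document, one column per topic,
# with the document IDs stored as row names.
doc_topic_distr <- matrix(
  c(0.7, 0.3,
    0.2, 0.8,
    0.5, 0.5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("d1", "d2", "d3"), c("topic_1", "topic_2"))
)

# Convert the matrix to a data frame and expose the row names as a doc_id column.
df_additions <- data.frame(
  doc_id = rownames(doc_topic_distr),
  doc_topic_distr,
  row.names = NULL,
  stringsAsFactors = FALSE
)

# all = TRUE keeps documents that appear in only one of the two tables.
newDF <- merge(x = df_old, y = df_additions, by = "doc_id", all = TRUE)
print(newDF)
```

If rownames(doc_topic_distr) turns out to be NULL, the IDs were not carried through, and you would have to rely on row order matching the raw data instead, which is more fragile.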


Flocke Haus
