I want to use LDA (latent Dirichlet allocation) to generate topics from my text at the level of phrases rather than individual words. How can I do that in R?
LDA treats each document as a bag of words and produces topics made up of individual words. For example, for the text "Arsenal won FA cup in two consecutive years in 2014 and 2015. They are the kings of North London.", it could yield a topic such as [Arsenal - 50%, FA - 20%, cup - 10%, london - 10%, king - 10%].
Instead, I want the topic to be returned at the level of phrases, e.g., [Arsenal, fa cup, north london].
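To make the goal concrete, here is a minimal base R sketch of the kind of preprocessing I imagine: joining known phrases into single tokens before the bag-of-words step, so that any LDA implementation would see "fa_cup" as one term. The `phrases` vector is a hypothetical, hand-written list that I am assuming for illustration; how to obtain such phrases automatically is part of the question, and this may not be the best approach.

```r
# Hypothetical, hand-picked phrase list (an assumption for illustration only)
phrases <- c("fa cup", "north london")

text <- "Arsenal won FA cup in two consecutive years in 2014 and 2015. They are the kings of North London."

# Lowercase the text and strip punctuation
clean <- gsub("[[:punct:]]", "", tolower(text))

# Join each known phrase into a single token, e.g. "fa cup" -> "fa_cup".
# fixed = TRUE matches the literal substring, so it ignores word boundaries.
for (p in phrases) {
  clean <- gsub(p, gsub(" ", "_", p), clean, fixed = TRUE)
}

# Bag of "words" in which each joined phrase counts as a single term
tokens <- strsplit(clean, "\\s+")[[1]]
term_counts <- table(tokens)
```

The resulting `term_counts` could then be turned into a document-term matrix and passed to an LDA routine, so the topics would list "fa_cup" and "north_london" as single terms instead of their constituent words.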