I am having problem with getting the right text after stemming in R. Eg. 'papper' should show as 'papper' but instead shows up as 'papp', 'projekt' becomes 'projek'.
The frequency cloud generated thus shows these shortened versions which loses the actual meaning or becomes incomprehensible.
What can I do to get rid of this problem? I am using the latest version of snowball(0.6.0).
R Code:
library(tm)
library(SnowballC)
text_example <- c("projekt", "papper", "arbete")
stem_doc <- stemDocument(text_example, language="sv")
stem_doc
Expected:
stem_doc
[1] "projekt" "papper" "arbete"
Actual:
stem_doc
[1] "projek" "papp" "arbet"