I am trying to do some text mining in R with the tm package without removing any special characters. For example, in the data below "LAC" and "LAC_" should be counted as different terms. Instead, the _ is dropped and the two are treated as the same word. How can I keep special characters when building the document-term matrix?
library(tm)
special <- c("OLAC_ LA LAC LAC_ LAC_E AC AC_ AC_E AC_ET",
             ")LK )LKC )LKC- LK LKC LKC-",
             "LAC_ LAC_E LKC LKC-")
bagOfWords <- Corpus(VectorSource(special))
mydocsDTM <- DocumentTermMatrix(bagOfWords, control = list(
  removePunctuation = FALSE,
  preserve_intra_word_contractions = FALSE,
  preserve_intra_word_dashes = FALSE,
  removeNumbers = FALSE,
  stopwords = FALSE,
  stemming = FALSE
))
inspect(mydocsDTM)
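
For reference, here is a minimal sketch of the behaviour I am after, written in base R so it runs without tm. The helper name whitespace_tokenizer is made up for illustration; it splits on whitespace only, so "_", "-" and ")" stay attached to their tokens. The commented-out part at the end is an assumption about how such a function might be handed to tm through the tokenize control option. As far as I understand, that would need a VCorpus rather than the corpus built above, plus tolower = FALSE and wordLengths = c(1, Inf) so that case and short terms such as "LK" survive.

```r
# Hypothetical helper (not part of tm): split on runs of whitespace only,
# leaving punctuation such as "_", "-" and ")" attached to each token.
whitespace_tokenizer <- function(x) {
  tokens <- unlist(strsplit(as.character(x), "[[:space:]]+"))
  tokens[nzchar(tokens)]  # drop empty strings caused by leading whitespace
}

special <- c("OLAC_ LA LAC LAC_ LAC_E AC AC_ AC_E AC_ET",
             ")LK )LKC )LKC- LK LKC LKC-",
             "LAC_ LAC_E LKC LKC-")

# Per-document term counts, the information a document-term matrix would hold.
term_counts <- lapply(special, function(doc) table(whitespace_tokenizer(doc)))
print(term_counts[[1]])

# Assumed tm usage (untested sketch; requires the tm package):
# library(tm)
# dtm <- DocumentTermMatrix(
#   VCorpus(VectorSource(special)),
#   control = list(tokenize = whitespace_tokenizer,
#                  tolower = FALSE,
#                  wordLengths = c(1, Inf)))
# inspect(dtm)
```

With this tokenizer "LAC" and "LAC_" end up as separate entries in the counts for the first document, which is the result I would like to see in the matrix.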