I've been trying to preserve punctuation such as "-", "(", "/" and "'"
when tokenizing words.
library(dplyr)
library(tidytext)
data <- tibble(title = "Computer-aided detection (1 / 2)")
# Bigram tokenization; the default tokenizer strips punctuation
data %>%
  unnest_tokens(input = title,
                output = słowo,
                token = "ngrams",
                n = 2)
I want the output to look like this:
computer-aided
aided detection
detection (1
(1 / 2)
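One direction that might get close, as a rough sketch rather than a confirmed solution: pass a custom function as `token` that splits on whitespace only, so punctuation stays attached to its word. `whitespace_bigrams` is a hypothetical helper, not part of tidytext, and it assumes whitespace is the only separator that matters. It yields strict two-word pairs, so it would not reproduce the single-token "computer-aided" or the three-token "(1 / 2)" lines exactly.

```r
library(dplyr)
library(tidytext)

# Hypothetical helper (not part of tidytext): lowercases, splits on
# whitespace only, then pastes each token to its right-hand neighbour.
# Returns a list with one character vector per input string, which is
# the shape unnest_tokens() expects from a custom tokenizer.
whitespace_bigrams <- function(x) {
  lapply(strsplit(trimws(tolower(x)), "\\s+"), function(words) {
    if (length(words) < 2) return(character(0))
    paste(head(words, -1), tail(words, -1))
  })
}

data <- tibble(title = "Computer-aided detection (1 / 2)")

data %>%
  unnest_tokens(input = title,
                output = słowo,
                token = whitespace_bigrams)
```

Because the split happens on whitespace alone, "-", "(", "/" and ")" are never treated as separators and survive inside the tokens.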
Any suggestions?