I am trying to extract characters before and after the "/" character using R.
For example, I can get the tags with the following:
s <- "hello/JJ world/NN"
# get the tags
sapply(s, function(x){gsub("([a-z].*?)/([A-z].*?)", "\\2", x)})
which returns
"JJ NN"
However, when I try to extract the characters before the "/" (the "tokens") using the following:
sapply(s, function(x){gsub("([a-z].*?)/([A-z].*?)", "\\1", x)})
I get
"helloJ worldN"
How can I get "hello world", and why is the first letter of the tag slipping in there?
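A sketch of what seems to be going on, and one possible fix. The trailing lazy `.*?` has nothing after it in the pattern, so it can match zero characters. The second group then captures only the first letter of the tag, and each match covers just "hello/J" and "world/N". The leftover tag letter stays in the string, which is also why the tag extraction only appears to work. The pattern below is an assumption about the data: tokens and tags are taken to be runs of characters that contain neither whitespace nor "/".

```r
s <- "hello/JJ world/NN"

# Assumed format: token/tag pairs separated by whitespace.
# Each group greedily matches one or more characters that are
# neither "/" nor whitespace, so the whole tag is consumed.
pattern <- "([^/[:space:]]+)/([^/[:space:]]+)"

tokens <- gsub(pattern, "\\1", s)  # "hello world"
tags   <- gsub(pattern, "\\2", s)  # "JJ NN"

print(tokens)
print(tags)
```

Since `gsub` is already vectorised over its input, the `sapply` wrapper should not be needed here.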