I'm using Cloud Natural Language (NL) to analyze text from Google Speech, and it seems to have trouble tokenizing contractions. For example,
"I don't like you"
comes back as tokens whose content_text values are:
"I" "do" "n't" "like" "you"
Escaping the quote did not help; in that case the tokens came back as
"I" "don" "\'t" "like" "you"
However, I found that removing the apostrophe did help. The text
I dont like you
came back with "dont" tagged as a verb (correct enough).
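For reference, here is a minimal sketch of the apostrophe-stripping step described above. The helper name `strip_apostrophes` is hypothetical, handling the typographic apostrophe is an assumption about what the speech transcript may contain, and the call to the Natural Language API itself is left out.

```python
def strip_apostrophes(text: str) -> str:
    """Remove straight (') and typographic (U+2019) apostrophes from text.

    Hypothetical preprocessing helper; it is not part of any Google client library.
    """
    return text.replace("'", "").replace("\u2019", "")


# The cleaned string is what would be sent for analysis instead of the raw transcript.
print(strip_apostrophes("I don't like you"))  # prints: I dont like you
```

One caveat with this approach: stripping apostrophes can turn a contraction into a different real word (for example "we're" becomes "were"), so it may change how some sentences are tagged.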
Is this the correct workaround for now?