
Is it possible to make NLTK's PunktSentenceTokenizer split sentences that have no whitespace between them?

from nltk.tokenize.punkt import PunktSentenceTokenizer

sent_tokenizer = PunktSentenceTokenizer()
print(sent_tokenizer.tokenize('Sky is blue.Metal is black.'))

'''
Output:

['Sky is blue.Metal is black.']
'''
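The only workaround I can think of is a regex pre-processing step (my own sketch, not a Punkt feature): insert a space after any period that is directly followed by an uppercase letter before tokenizing. That heuristic breaks on abbreviations, so I'd prefer a Punkt-native option if one exists.

import re
from nltk.tokenize.punkt import PunktSentenceTokenizer

def tokenize_glued(text):
    # Assumption: a period immediately followed by an uppercase letter ends a
    # sentence, so insert a space there (this would wrongly split strings like
    # "U.S.Government" into pieces).
    spaced = re.sub(r'\.(?=[A-Z])', '. ', text)
    return PunktSentenceTokenizer().tokenize(spaced)

print(tokenize_glued('Sky is blue.Metal is black.'))
# ['Sky is blue.', 'Metal is black.']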
revy

0 Answers