I want to build n-grams from the NLTK Reuters corpus. I first tested my n-gram code on a small corpus saved on my local disk as a text file, which I loaded with:
import nltk
with open('dummytext.txt', encoding='utf8') as f:
    file = f.read()
Now that my n-gram probability code makes sense to me, I want to run it on the NLTK Reuters corpus, which is much larger. When I do the following:
import nltk
from nltk.corpus import reuters
file = reuters.words()
the processing to form unigrams runs for so long that it never seems to finish.
How can I load the NLTK corpus into a variable as a string so that I can form n-grams from it using NLTK?
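
For context, here is a minimal sketch of what I am trying to end up with. As far as I understand, `reuters.words()` returns a lazy, list-like view of tokens rather than a string, while `reuters.raw()` returns the text as one string. The helper name `count_ngrams` and the sample sentence are hypothetical, made up for illustration, and the guess that my slowdown comes from scanning the corpus more than once is only an assumption. The sketch counts n-grams in a single pass over any iterable of tokens:

```python
from collections import Counter, deque


def count_ngrams(tokens, n):
    """Count n-grams in one pass over any iterable of tokens (hypothetical helper)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    counts = Counter()
    window = deque(maxlen=n)  # holds the most recent n tokens
    for token in tokens:
        window.append(token)
        if len(window) == n:
            counts[tuple(window)] += 1
    return counts


# Small stand-in for a corpus. With NLTK installed and the corpus downloaded,
# the same call should accept reuters.words() (assumption: not run here).
sample = "the cat sat on the mat the cat ran".split()
unigrams = count_ngrams(sample, 1)
bigrams = count_ngrams(sample, 2)
```

Because the helper only iterates over its input once, it should not matter whether the tokens come from a Python list or from a lazily loaded corpus view.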