
I made 2-gram and 3-gram models from my text file.

import string
from nltk import word_tokenize, bigrams, trigrams

text = open('Alice in Wonderland.txt', 'r').read()
# Python 3: str.translate takes a single mapping; the third argument
# of str.maketrans lists characters to delete (here, all punctuation).
text = text.translate(str.maketrans('', '', string.punctuation))
tokens = word_tokenize(text.lower())
bigram = list(bigrams(tokens))
trigram = list(trigrams(tokens))

But how can I generate new sentences using these models?

alvas

1 Answer


Currently, NLTK's generate() function is deprecated because it is broken; see https://github.com/nltk/nltk/issues/1180

A state-of-the-art alternative is text generation using recurrent neural networks, e.g. https://github.com/karpathy/char-rnn (Note: unlike a traditional ngram-based hidden Markov model, the char-RNN doesn't use ngram information.)

Alternatively, you can implement your own hidden Markov model; see http://fulmicoton.com/posts/shannon-markov/
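As a sketch of that route, here is a minimal 2nd-order Markov chain generator built from trigram counts, using only the standard library. The tiny inline corpus and the generate() helper are illustrative stand-ins; in practice you would feed in the tokens from your book and tokenize with NLTK.

```python
import random
from collections import defaultdict

# Tiny stand-in corpus; replace with the tokens from your text file.
tokens = "the cat sat on the mat the cat ate the fish the dog sat on the rug".split()

# Transition table for a 2nd-order Markov chain:
# (w1, w2) -> list of words that followed that bigram in training data.
transitions = defaultdict(list)
for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
    transitions[(w1, w2)].append(w3)

def generate(w1, w2, length=10):
    """Start from a seed bigram and repeatedly sample a next word."""
    out = [w1, w2]
    for _ in range(length):
        candidates = transitions.get((w1, w2))
        if not candidates:  # dead end: bigram never seen in training data
            break
        w1, w2 = w2, random.choice(candidates)
        out.append(w2)
    return ' '.join(out)

print(generate('the', 'cat'))
```

Sampling from the raw count lists like this automatically reproduces the trigram frequencies, since words that occurred more often after a bigram appear more often in its candidate list.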
