I have just started learning Python. I want to write a program with NLTK that breaks a text into unigrams and bigrams. For example, if the input text is...
"I am feeling sad and disappointed due to errors"
... my function should generate text like:
I am-->am feeling-->feeling sad-->sad and-->and disappointed-->disappointed due-->due to-->to errors
I have written code to input text into the program. Here's the function I'm trying:
def gen_bigrams(text):
    token = nltk.word_tokenize(text)
    bigrams = ngrams(token, 2)
    #print Counter(bigrams)
    bigram_list = ""
    for x in range(0, len(bigrams)):
        words = bigrams[x]
        bigram_list = bigram_list + words[0]+ " " + words[1]+"-->"
    return bigram_list
The error I'm getting is...
    for x in range(0, len(bigrams)):
    TypeError: object of type 'generator' has no len()
As the ngrams function returns a generator, I tried using len(list(bigrams)), but that returns 0, so the problem remains. I have looked at other questions on Stack Exchange but still cannot work out how to resolve this. Any workaround or suggestion?
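One possible explanation for the 0, sketched here under stated assumptions: a generator can be consumed only once, so if anything (for example a call like the commented-out Counter(bigrams)) has already iterated over it, a later list(bigrams) comes back empty. To keep the sketch runnable without NLTK, the hypothetical helper pairs() stands in for ngrams(token, 2) and str.split() stands in for nltk.word_tokenize; neither is the NLTK API itself. The idea is to turn the generator into a list exactly once and reuse that list.

```python
def pairs(tokens):
    """Hypothetical stand-in for ngrams(tokens, 2): lazily yields adjacent pairs."""
    return (pair for pair in zip(tokens, tokens[1:]))


def gen_bigrams(text):
    # Assumption: str.split() stands in for nltk.word_tokenize(text)
    tokens = text.split()
    # Materialize the generator exactly once; a list supports len() and indexing
    bigrams = list(pairs(tokens))
    # Join with "-->" so there is no trailing separator
    return "-->".join(first + " " + second for first, second in bigrams)


# A generator is exhausted after one pass, which may explain the 0:
gen = pairs("I am feeling sad".split())
first_pass = len(list(gen))   # consumes the generator
second_pass = len(list(gen))  # nothing is left to consume

result = gen_bigrams("I am feeling sad and disappointed due to errors")
print(first_pass, second_pass)
print(result)
```

With real NLTK the same pattern would presumably be bigrams = list(ngrams(token, 2)), done before anything else iterates over the generator.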