'NoneType' object is not iterable for collocation function

Question

I am new to NLTK and am trying to return the collocation output from a function. The collocations are printed, but the value I get back for them is None, which leads to the error shown below. My code, input, and output follow.

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    stop_words = set(stopwords.words('english'))
    pattern = r'\w+'
    tokenizedwords = nltk.regexp_tokenize(textcontent, pattern)
    for i in range(len(tokenizedwords)):
        tokenizedwords[i] = tokenizedwords[i].lower()
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    tokenizednonstopwordsbigrams = [ (w1, w2) for w1, w2 in tokenizedwordsbigrams if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)
    collocationwords = tokenizedwords.collocations()
    return mostfrequentwordafter, collocationwords


if __name__ == '__main__':
    textcontent = input()

    word = input()


    mostfrequentwordafter, collocationwords = performBigramsAndCollocations(textcontent, word)
    print(sorted(mostfrequentwordafter, key=lambda element: (element[1], element[0]), reverse=True))
    print(sorted(collocationwords))

input: Thirty-five sports disciplines and four cultural activities will be offered during seven days of competitions. He skated with charisma, changing from one gear to another, from one direction to another, faster than a sports car. Armchair sports fans settling down to watch the Olympic Games could be for the high jump if they do not pay their TV licence fee. Such invitationals will attract more viewership for sports fans by sparking interest among sports fans. She barely noticed a flashy sports car almost run them over, until Eddie lunged forward and grabbed her body away. And he flatters the mother and she kind of gets prissy and he talks her into going for a ride in the sports car.

sports

output:
sports car; sports fans.

[('fans', 3), ('car', 3), ('disciplines', 1)]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-191-40624b3de987> in <module>
     43     mostfrequentwordafter, collocationwords = performBigramsAndCollocations(textcontent, word)
     44     print(sorted(mostfrequentwordafter, key=lambda element: (element[1], element[0]), reverse=True))
---> 45     print(sorted(collocationwords))

TypeError: 'NoneType' object is not iterable

Can you please help me resolve the issue?

I think the error is on line 44 when you use lambda. can you try running this instead and tell me the output. `print(sorted(mostfrequentwordafter, key=lambda element: ((element[1], element[0]), reverse=True)))` — Seaver Olson, Aug 09 '20 at 03:08

score 1 · Answer 1 · answered Aug 22 '20 at 12:31

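collocations() only prints the collocations and returns None, so there is nothing for sorted() to iterate over, hence the TypeError. I hit the same issue recently and resolved it by using collocation_list(), which returns the collocations instead. Try this approach: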

collocationwords = tokenizedwords.collocation_list()
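
A minimal sketch of the difference between a function that prints its result and one that returns it, which is the distinction the traceback in the question points to. The names print_pairs and list_pairs are hypothetical stand-ins written for this illustration; they are not NLTK functions, and the pairs are hard-coded.

```python
# Hypothetical stand-ins that mimic the two behaviours; this is not NLTK code.
PAIRS = [("sports", "fans"), ("sports", "car")]


def print_pairs():
    """Print the pairs, like collocations(); implicitly returns None."""
    print("; ".join(w1 + " " + w2 for w1, w2 in PAIRS))


def list_pairs():
    """Return the pairs, like collocation_list(), so the caller can use them."""
    return list(PAIRS)


result = print_pairs()  # the text appears on screen, but result is None

error_message = ""
try:
    sorted(result)  # same failure as in the question
except TypeError as exc:
    error_message = str(exc)

sorted_pairs = sorted(list_pairs())  # works, because a list was returned
print(error_message)
print(sorted_pairs)
```

The printed line from print_pairs() explains why the question's output shows the collocations even though the returned value is None.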

Dharmendra Singh

Answer 2 · harshad_ · answered Sep 11 '20 at 03:18

Use the code below; it should work.

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    # Tokenize on word characters, drop empty matches, and lowercase.
    tokenizedword = nltk.regexp_tokenize(textcontent, pattern=r'\w*', gaps=False)
    tokenizedwords = [x.lower() for x in tokenizedword if x != '']
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    stop_words = stopwords.words('english')
    # Keep only bigrams in which neither word is a stop word.
    tokenizednonstopwordsbigrams = [(w1, w2) for w1, w2 in tokenizedwordsbigrams
                                    if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)
    # collocation_list() returns the collocations; collocations() only prints them.
    collocationwords = tokenizedwords.collocation_list()

    return mostfrequentwordafter, collocationwords

Hello! Please add an explanation to show what you fixed. Thank you! — Sheikh, Sep 09 '20 at 17:49
Use .collocation_list() instead of using .collocations() then it would work. — harshad_, Sep 11 '20 at 03:15

score 1 · Answer 3 · edited Jan 17 '22 at 19:27

collocation_list() alone did not help: it returns each collocation as a (w1, w2) pair, not as a single string. Joining each pair with a space worked for me.

collocationwords1 = tokenizedwords.collocation_list()

# Join each (w1, w2) pair into a single "w1 w2" string.
collocationwords = []
for item in collocationwords1:
    collocationwords.append(item[0] + " " + item[1])
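
A hedged sketch of the same joining step written so that it tolerates either element type. It assumes, without confirming it for any particular release, that the items from collocation_list() may be (w1, w2) tuples or already-joined strings depending on the installed NLTK version. normalize_collocations is a hypothetical helper name, and the sample lists are hard-coded so the snippet runs without NLTK.

```python
def normalize_collocations(items):
    """Return each collocation as a single 'w1 w2' string.

    Accepts (w1, w2) tuples or ready-made strings. Which of the two you
    get is an assumption to check against your own NLTK install.
    """
    normalized = []
    for item in items:
        if isinstance(item, str):
            normalized.append(item)            # already "w1 w2"
        else:
            normalized.append(" ".join(item))  # ("w1", "w2") -> "w1 w2"
    return normalized


as_tuples = [("sports", "fans"), ("sports", "car")]
as_strings = ["sports fans", "sports car"]

from_tuples = sorted(normalize_collocations(as_tuples))
from_strings = sorted(normalize_collocations(as_strings))
print(from_tuples)
print(from_strings)
```

Checking for str first matters: indexing a string with item[0] and item[1] would silently produce two single characters instead of two words.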

edited Jan 17 '22 at 19:27

Peter Csala

answered Jan 17 '22 at 17:44

Suchitra U

score 0 · Answer 4 · answered Sep 14 '22 at 06:21

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    tokenizedword = nltk.regexp_tokenize(textcontent, pattern=r'\w*', gaps=False)
    tokenizedwords = [x.lower() for x in tokenizedword if x != '']
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    stop_words = stopwords.words('english')
    tokenizednonstopwordsbigrams = [(w1, w2) for w1, w2 in tokenizedwordsbigrams
                                    if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)

    # collocation_list() returns (w1, w2) pairs; join each pair into one string.
    collocationwords1 = tokenizedwords.collocation_list()
    collocationwords = []
    for item in collocationwords1:
        collocationwords.append(item[0] + " " + item[1])

    return mostfrequentwordafter, collocationwords

This code worked for me.

score -1 · Answer 5 · answered Aug 09 '20 at 19:23

key is a function that sorted calls once on each item; its return value is what gets compared, and the list is not sorted twice. With key=lambda element: (element[1], element[0]), each item is ordered by its count first and then by the word, so that line is valid and is not what raises the TypeError. If you would rather avoid the lambda, operator.itemgetter builds the same tuple key:

from operator import itemgetter
print(sorted(mostfrequentwordafter, key=itemgetter(1, 0), reverse=True))
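
For reference, a small sketch of how sorted handles a key that returns a tuple, using the three (word, count) pairs from the question's output as hard-coded sample data. count_then_word is a hypothetical helper added here only to record how often the key function is called.

```python
from operator import itemgetter

# Sample data taken from the question's printed output, in arbitrary order.
mostfrequentwordafter = [("car", 3), ("disciplines", 1), ("fans", 3)]

calls = []


def count_then_word(element):
    """Key function: compare by count first, then by word; logs each call."""
    calls.append(element)
    return (element[1], element[0])


by_function = sorted(mostfrequentwordafter, key=count_then_word, reverse=True)
by_lambda = sorted(mostfrequentwordafter,
                   key=lambda element: (element[1], element[0]), reverse=True)
by_itemgetter = sorted(mostfrequentwordafter, key=itemgetter(1, 0), reverse=True)

print(by_function)
print(len(calls))  # one key call per element
```

All three spellings produce the same order, and the tie between the two words with count 3 is broken by the word itself.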

Seaver Olson

if this does not work please take a look at `https://stackoverflow.com/questions/8966538/syntax-behind-sortedkey-lambda` – Seaver Olson Aug 09 '20 at 19:24
It doesn’t make sense to offer a solution and then say that “well, if it doesn’t work, try this one instead”. If you’re not sure, don’t throw pasta at the wall. – Abhijit Sarkar Aug 09 '20 at 19:32
