I am writing code that fetches a page, collects the words surrounding a substring j into Res, and then collects all the nouns in Res, as follows:

import re

import nltk
import requests

j = "Green Index"  # phrase to look for
# pattern capturing up to two words on either side of the phrase
sub = r'(\w*)\W*(\w*)\W*(%s)\W*(\w*)\W*(\w*)' % j

allnouns = []
Results = []
link = "http://greenindex.timberland.com/"  # page to search for the phrase
f = requests.get(link)
str1 = f.text

for i in re.findall(sub, str1, re.I):  # each match is a tuple of captured words
    Res = " ".join([x for x in i if x != ""])  # join the non-empty words into one string, Res
    print(Res)
    Results.append(Res)  # collect every Res in one list, Results

    sentences = nltk.sent_tokenize(Res)  # here is where I hit an error
    nouns = []

    for sentence in sentences:
        for word, pos in nltk.pos_tag(nltk.word_tokenize(str(sentence))):
            if pos in ('NN', 'NNP', 'NNS', 'NNPS'):  # keep any noun tag
                nouns.append(word)
    allnouns.extend(nouns)  # add this match's nouns to the overall list
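
To separate the regex step from the network request and from NLTK, the same pattern can be tried on a fixed string. This is only a minimal sketch: `context_windows` and `sample_text` are hypothetical names made up for illustration, and `re.escape` is added on the assumption that the phrase should be matched literally.

```python
import re


def context_windows(text, phrase):
    """Return each match of `phrase` joined with up to two words on either side."""
    # Same pattern as in the question; re.escape guards against regex
    # metacharacters in the phrase (an addition, not in the original code).
    pattern = r'(\w*)\W*(\w*)\W*(%s)\W*(\w*)\W*(\w*)' % re.escape(phrase)
    return [
        " ".join(g for g in groups if g)  # drop empty captures
        for groups in re.findall(pattern, text, re.I)
    ]


# Hypothetical sample text standing in for the downloaded page.
sample_text = (
    "Our Green Index rating measures climate impact. "
    "The green index is printed on every box."
)
windows = context_windows(sample_text, "Green Index")
for window in windows:
    print(type(window).__name__, repr(window))
```

Every element produced this way is a plain `str`, which is the type `Res` has inside the loop.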

I hit an error right before my second loop:

TypeError: Can't convert 'list' object to str implicitly

I checked that type(Res) is <class 'str'>. I also tried splitting Res, thinking it might help, with sentences = nltk.sent_tokenize(Res.split), but I get the same error. How can I get around it?
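
A side note on the attempted workaround: `Res.split` without parentheses refers to the method itself, and `Res.split()` returns a list, so neither form passes a string, which is what `nltk.sent_tokenize` takes as its text argument. A minimal sketch of the difference, using a hypothetical stand-in value named `res`:

```python
# Hypothetical stand-in for one value of Res from the loop.
res = "Our Green Index rating measures"

as_method = res.split    # no parentheses: the bound method, not its result
as_list = res.split()    # called: a list of words

print(type(res).__name__)
print(type(as_method).__name__)
print(type(as_list).__name__)
```

This only shows what each expression evaluates to; it does not by itself explain where the original TypeError comes from.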

El_1988