
I am learning NLP with NLTK 3, and while working through one of the examples I got stuck counting the named entities in a sentence. Apparently NLTK has been updated and .node has been removed from the tree structure. Here is my code:

import sys
f=open('nyt.txt','r')
news_content=f.read()
import nltk
results=[]
for sent_no,sent in enumerate(nltk.sent_tokenize(news_content)):
    tokens=nltk.word_tokenize(sent)
    no_of_tokens=len(tokens)
    tagged=nltk.pos_tag(tokens)
    nouns=len([word for word,pos in tagged if pos in ["NN","NNP"]])
    ners=nltk.ne_chunk(tagged,binary=True)
    no_of_ners=len([chunk for chunk in ners if hasattr(chunk,'node')])
    score=(nouns+no_of_ners)/float(no_of_tokens)
    results.append((sent_no,no_of_tokens,no_of_ners,nouns,score,sent))
results.sort(key=lambda x:x[4])
print(results[5]) 

Running it raises this error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\ssisharm\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
    execfile(filename, namespace)
  File "C:\Users\ssisharm\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)
  File "C:/Users/ssisharm/Documents/Python Scripts/news_summary.py", line 19, in <module>
    no_of_ners=len([chunk for chunk in ners if hasattr(chunk,'node')])
  File "C:/Users/ssisharm/Documents/Python Scripts/news_summary.py", line 19, in <listcomp>
    no_of_ners=len([chunk for chunk in ners if hasattr(chunk,'node')])
  File "C:\Users\ssisharm\Anaconda3\lib\site-packages\nltk\tree.py", line 202, in _get_node
    raise NotImplementedError("Use label() to access a node label.")
NotImplementedError: Use label() to access a node label.

I need to access the named entities and count them. Could someone please help?

  • You could try looking at the error message you got, which reads: `Use label() to access a node label.` The attribute `node` was replaced by the method `label()` in `nltk` version 3. – alexis Jun 03 '17 at 15:21
  • Thanks alexis! That's what I was searching for. – Siddharth Jun 03 '17 at 15:57
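
For anyone landing here later, the fix from the comments can be sketched as follows. The chunked sentence contains plain (word, tag) tuples and labelled subtrees, so counting the children that expose a `label()` method counts the named entities. The helper name `count_named_entities` and the `StandInTree` class are hypothetical: the stand-in only mimics the shape of an NLTK tree so that the sketch runs without downloading the NLTK models, and it is not part of NLTK.

```python
def count_named_entities(chunked_sentence):
    """Count children that are labelled subtrees rather than (word, tag) tuples."""
    return sum(
        1
        for chunk in chunked_sentence
        if callable(getattr(chunk, "label", None))
    )


class StandInTree(list):
    """Hypothetical stand-in for an NLTK tree: a list of children plus label()."""

    def __init__(self, label, children):
        super().__init__(children)
        self._label = label

    def label(self):
        return self._label


# Assumed shape of the result of nltk.ne_chunk(tagged, binary=True):
# a tree whose children are (word, tag) tuples or subtrees labelled "NE".
sentence = StandInTree("S", [
    StandInTree("NE", [("Barack", "NNP"), ("Obama", "NNP")]),
    ("visited", "VBD"),
    StandInTree("NE", [("Paris", "NNP")]),
    (".", "."),
])

print(count_named_entities(sentence))  # 2
```

In the script from the question, the equivalent one-line change would be to test `hasattr(chunk,'label')` instead of `hasattr(chunk,'node')`. The original check crashes instead of returning False because, in Python 3, `hasattr` only suppresses `AttributeError`, while the old `node` property raises `NotImplementedError`.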
