Questions tagged [named-entity-recognition]

Named-entity recognition (NER) (also known as entity identification and entity extraction) is a subtask of information extraction that seeks to locate and classify atomic elements in text into predefined categories such as the names of persons, organizations, locations, expressions of times, quantities, monetary values, percentages, etc.

Most research on NER systems has been structured as taking an unannotated block of text, such as this one:

Jim bought 300 shares of Acme Corp. in 2006.

And producing an annotated block of text that highlights where the named entities are, such as this one:

<ENAMEX TYPE="PERSON">Jim</ENAMEX> bought <NUMEX TYPE="QUANTITY">300</NUMEX> shares of <ENAMEX TYPE="ORGANIZATION">Acme Corp.</ENAMEX> in <TIMEX TYPE="DATE">2006</TIMEX>.

In this example, the annotations are marked using inline ENAMEX, NUMEX, and TIMEX elements, following the format developed for the Message Understanding Conference in the 1990s.
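The annotation step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular NER system works: the `annotate` function and the `Span` tuple layout are hypothetical names introduced here, the character offsets are supplied by hand rather than predicted by a model, and the text is not XML-escaped.

```python
from typing import List, Tuple

# Hypothetical span layout: (start, end, element, type).
# Offsets are character positions in the text; end is exclusive.
Span = Tuple[int, int, str, str]


def annotate(text: str, spans: List[Span]) -> str:
    """Wrap each span of `text` in a MUC-style inline tag."""
    out = []
    pos = 0
    for start, end, element, etype in sorted(spans):
        if start < pos:
            raise ValueError("overlapping spans are not supported")
        out.append(text[pos:start])  # untouched text before the entity
        out.append(f'<{element} TYPE="{etype}">{text[start:end]}</{element}>')
        pos = end
    out.append(text[pos:])  # trailing text after the last entity
    return "".join(out)


text = "Jim bought 300 shares of Acme Corp. in 2006."
# Hand-written spans standing in for the output of an NER model.
spans = [
    (0, 3, "ENAMEX", "PERSON"),
    (11, 14, "NUMEX", "QUANTITY"),
    (25, 35, "ENAMEX", "ORGANIZATION"),
    (39, 43, "TIMEX", "DATE"),
]
annotated = annotate(text, spans)
print(annotated)
```

In practice the spans would come from a trained tagger; keeping them as plain offsets makes it easy to switch between inline tags and other output formats.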

State-of-the-art NER systems for English achieve near-human performance. For example, the best system entered in MUC-7 achieved an F-measure of 93.39%, while human annotators scored 97.60% and 96.95%.
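For readers unfamiliar with the metric, the F-measure is commonly computed as the harmonic mean of precision and recall. The sketch below assumes that balanced (F1) definition; the `f_measure` name and the precision and recall values are made up for illustration and are not the MUC-7 figures.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing was found correctly
    return 2 * precision * recall / (precision + recall)


# Hypothetical values, chosen only to show the calculation.
score = f_measure(precision=0.9, recall=0.8)
print(f"F-measure: {score:.4f}")
```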

Source: http://en.wikipedia.org/wiki/Named-entity_recognition

1456 questions
8
votes
1 answer

Named entities as a feature in text categorization?

With existing (supervised) text categorization techniques, why don't we consider Named Entities (NE) in the text as a feature in training and testing? Do you think we can improve precision by using NEs as a feature?
7
votes
2 answers

Spacy 3 Confidence Score on Named-Entity recognition

I need to get a confidence score for the tags predicted by the NER model 'de_core_news_lg'. There was a well-known solution to the problem in spaCy 2: nlp = spacy.load('de_core_news_lg') doc = nlp('ich möchte mit frau Mustermann in der Musterbank…
Keyvan Sadri
  • 71
  • 1
  • 2
7
votes
1 answer

Named Entity Recognition in aspect-opinion extraction using dependency rule matching

Using spaCy, I extract aspect-opinion pairs from a text, based on grammar rules that I defined. The rules are based on POS tags and dependency tags, which are obtained from token.pos_ and token.dep_. Below is an example of one of the grammar rules. If…
7
votes
4 answers

TypeError: Tensors in list passed to 'values' of 'ConcatV2' Op have types [bool, float32] that don't all match

I'm trying to reproduce the notebook for entity recognition using LSTM that I found at this link: https://medium.com/@rohit.sharma_7010/a-complete-tutorial-for-named-entity-recognition-and-extraction-in-natural-language-processing-71322b6fb090 When…
Paolopast
  • 197
  • 1
  • 11
7
votes
3 answers

Extracting names from a text file using Spacy

I have a text file which contains lines as shown below: Electronically signed : Wes Scott, M.D.; Jun 26 2010 11:10AM CST The patient was referred by Dr. Jacob Austin. Electronically signed by Robert Clowson, M.D.; Janury 15 2015 11:13AM…
Slickmind
  • 442
  • 1
  • 7
  • 15
7
votes
1 answer

Creating relations in sentence using chunk tags (not NER) with NLTK | NLP

I am trying to create custom chunk tags and to extract relations from them. Following is the code that takes me to the cascaded chunk tree. grammar = r""" NPH: {+} # Chunk sequences of DT, JJ, NN PPH: {} …
Rohan
  • 3,296
  • 2
  • 32
  • 35
7
votes
1 answer

Dealing with the "StanfordTokenizer will be deprecated in version 3.2.5" Warning

I was testing the StanfordNERTagger using the NLTK wrapper and this warning appeared: DeprecationWarning: The StanfordTokenizer will be deprecated in version 3.2.5. Please use nltk.tag.corenlp.CoreNLPPOSTagger or nltk.tag.corenlp.CoreNLPNERTagger…
Anoroah
  • 1,987
  • 2
  • 20
  • 31
7
votes
1 answer

How good is GATE for NLP?

I am trying to build an NLP app which essentially has to do Named Entity Recognition (NER). I came across GATE. From what I understand, it is a framework for building NLP apps. I tested ANNIE, the IE system distributed with GATE, but the NER results for my…
uzair_syed
  • 313
  • 3
  • 16
7
votes
1 answer

Named Entity Extraction - for Currency

I have a pretty simple problem - recognize money/currency in text. Sample test case: "Pocket money should NOT exceed INR 4000 (USD 100) per annum." Fails on the default Stanford parser - online - (with the 7 class model, including Currency)…
user2849678
  • 613
  • 7
  • 15
7
votes
9 answers

Algorithms recognizing physical address on a webpage

What are the best algorithms for recognizing structured data on an HTML page? For example Google will recognize the address of home/company in an email, and offers a map to this address.
7
votes
2 answers

How can Stanford CoreNLP Named Entity Recognition capture measurements like 5 inches, 5", 5 in., 5 in

I'm looking to capture measurements using Stanford CoreNLP. (If you can suggest a different extractor, that is fine too.) For example, I want to find 15kg, 15 kg, 15.0 kg, 15 kilogram, 15 lbs, 15 pounds, etc. But among CoreNLP's extraction rules, I…
7
votes
3 answers

NLTK: why does nltk not recognize the CLASSPATH variable for stanford-ner?

This is my code: from nltk.tag import StanfordNERTagger st = StanfordNERTagger('english.all.3class.distsim.crf.ser.gz') And I get: NLTK was unable to find stanford-ner.jar! Set the CLASSPATH environment variable. This is what my .bashrc looks…
chapman
  • 71
  • 1
  • 1
  • 2
7
votes
2 answers

Extract Person Name from unstructured text

I have a collection of bills and invoices, so there is no context in the text (I mean they don't tell a story). I want to extract people's names from those bills. I tried OpenNLP, but the quality of the trained model is not good because I don't have…
anas.khayata
  • 145
  • 1
  • 6
7
votes
1 answer

How do I use IOB tags with Stanford NER?

There seem to be a few different settings: iobtags iobTags entitySubclassification (IOB1 or IOB2?) evaluateIOB Which setting do I use, and how do I use it correctly? I tried labelling like this: 1997 B-DATE volvo B-BRAND wia64t …
Neil McGuigan
  • 46,580
  • 12
  • 123
  • 152
6
votes
2 answers

unsupervised Named entity recognition (NER) with custom controlled vocabulary for crosslink-suggestions in Java

I'm looking for a Java library that can do Named entity recognition (NER) with a custom controlled vocabulary, without needing labeled training data first. I searched some on SE, but most questions are rather unspecific. Consider the following…
Geert-Jan
  • 18,623
  • 16
  • 75
  • 137