Questions tagged [opennlp]

Apache's libraries for natural language processing (NLP).

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron-based machine learning.
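To make the first two tasks in that list concrete, here is a minimal sketch of sentence segmentation followed by tokenization. It deliberately uses only the JDK's rule-based `java.text.BreakIterator` as a stand-in, not OpenNLP's own API or trained models, so that it runs without extra dependencies. The class name `SegmentationSketch` and its two helper methods are hypothetical names chosen for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative stand-in only: JDK rule-based segmentation, not OpenNLP's statistical models.
final class SegmentationSketch {

    // Split text into sentences at the boundaries reported by BreakIterator.
    public static List<String> sentences(String text) {
        return split(BreakIterator.getSentenceInstance(Locale.ENGLISH), text);
    }

    // Split one sentence into word and punctuation tokens, dropping whitespace.
    public static List<String> tokens(String sentence) {
        return split(BreakIterator.getWordInstance(Locale.ENGLISH), sentence);
    }

    // Walk consecutive boundaries and collect the non-blank spans between them.
    private static List<String> split(BreakIterator it, String text) {
        List<String> out = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String span = text.substring(start, end).trim();
            if (!span.isEmpty()) {
                out.add(span);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "The boy opened the box. He took the chocolates. "
                + "He ate the chocolates. He went to school";
        for (String sentence : sentences(text)) {
            System.out.println(tokens(sentence));
        }
    }
}
```

A statistical toolkit such as OpenNLP typically replaces each of these rule-based steps with a component backed by a trained model, which is what lets it handle abbreviations and other cases where fixed rules tend to fail.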

More about natural language processing:

Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken and written.

Apache OpenNLP is often used with Apache Lucene (a document search library).

Relevant links:

http://searchcontentmanagement.techtarget.com/definition/natural-language-processing-NLP
https://opennlp.apache.org/docs/

Cornerstone books: https://www.manning.com/books/taming-text

684 questions
7
votes
1 answer

Extract actions on objects from a sentence in R

I want to extract the actions performed on objects from a list of sentences in R. To give a small overview: S = “The boy opened the box. He took the chocolates. He ate the chocolates. He went to school” I am looking for combinations as follows: Opened…
Priyanka Basu
  • 397
  • 1
  • 9
7
votes
2 answers

OpenNLP lemmatization example

Does anyone know where I can find an example of how to use the SimpleLemmatizer() class in the OpenNLP library, and where I can find a sample English dictionary? It appears to be missing from the documentation.
pYr0
  • 159
  • 2
  • 9
7
votes
3 answers

How to predict correct country name for user provided country name?

I am planning to do some data tuning on my data. Situation: I have data with a country field. It contains user-entered country names (it might contain spelling mistakes or different country names for the same country, like US/U.S.A/United States for…
AngryLeo
  • 390
  • 4
  • 23
7
votes
2 answers

How to extract sentences containing specific person names using R

I am using R to extract sentences containing specific person names from texts and here is a sample paragraph: Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle…
Frown
  • 259
  • 1
  • 12
7
votes
3 answers

How to use OpenNLP to get POS tags in R?

Here is the R Code: library(NLP) library(openNLP) tagPOS <- function(x, ...) { s <- as.String(x) word_token_annotator <- Maxent_Word_Token_Annotator() a2 <- Annotation(1L, "sentence", 1L, nchar(s)) a2 <- annotate(s, word_token_annotator, a2) a3 <-…
user4599
  • 95
  • 1
  • 1
  • 9
7
votes
2 answers

Extract Person Name from unstructured text

I have a collection of bills and invoices, so there is no context in the text (I mean they don't tell a story). I want to extract people's names from those bills. I tried OpenNLP, but the quality of the trained model is not good because I don't have…
anas.khayata
  • 145
  • 1
  • 6
6
votes
1 answer

How to recognize Indian names via NER in OpenNLP?

I am using OpenNLP models for named-entity recognition, but the problem is that they only recognize US- and UK-based names (foreign names), so I need to recognize Indian names. How is this possible?
Sagar Patel
  • 4,993
  • 1
  • 8
  • 19
6
votes
2 answers

How to "update" an existing Named Entity Recognition model - rather than creating from scratch?

Please see the tutorial steps for OpenNLP Named Entity Recognition: Link to tutorial. I am using the "en-ner-person.bin" model found here. In the tutorial, there are instructions on training and creating a new model. Is there any way to "Update"…
sky
  • 2,531
  • 4
  • 17
  • 15
6
votes
1 answer

OpenNLP: foreign names do not get recognized

I just started using OpenNLP to recognize names. I am using the model (en-ner-person.bin) that comes with OpenNLP. I noticed that while it recognizes US, UK, and European names, it fails to recognize Indian or Japanese names. My questions are (1)…
Shirish Kumar
  • 1,532
  • 17
  • 23
6
votes
2 answers

How to perform Paragraph boundary detection in NLP frameworks?

I am working on extracting names of people from various ads appearing in English newspapers. However, I have noticed that I need to identify the boundary of an ad before extracting the names present in it, since I need only the first occurring…
kiran
  • 339
  • 4
  • 18
6
votes
2 answers

Analyse the sentences and extract person name, organization and location with the help of NLP

I need to solve the following using NLP. Can you give me pointers on how to achieve this using the OpenNLP API? a. How to find out if a sentence implies a certain action in the past, present or future. (e.g.) I was very sad last week - past I feel…
SST
  • 2,054
  • 5
  • 35
  • 65
6
votes
1 answer

Named Entity recognition with openNLP (default model)

Can anyone point out the algorithm(s) used by the OpenNLP NameFinder module? The code is complex and only sparsely documented, and playing with it as a black box (with the default model provided) gives me the impression that it is mostly heuristic. Here…
ScienceFriction
  • 1,538
  • 2
  • 18
  • 29
6
votes
1 answer

Custom Feature Generation in OpenNLP Namefinder API

I am trying to use the custom feature generation of OpenNLP for the Name Finder API: http://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html I went through the documentation but I was not able to understand how to specify the different…
5
votes
2 answers

Text processing in Java

This is a tricky problem for which I'm not able to figure out a good solution. Suppose we have a String in Java: "He ate 3 apples today." Now the digit 3 can be easily identified in Java using an isNumeric function or using regular expressions.…
Manan Pancholi
  • 103
  • 1
  • 8
5
votes
1 answer

Nullpointer Exception with OpenNLP in NameFinderME class

I am using OpenNLP to extract named entities from a given text. It gives me the following error while running the code on large data. When I run it on small data it works fine. java.lang.NullPointerException at…