Questions tagged [opennlp]

Apache's libraries for natural language processing (NLP).

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron-based machine learning.

More about natural language processing:

Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken and written.

Apache OpenNLP is often used with Apache Lucene (a document search library).

Relevant links:

http://searchcontentmanagement.techtarget.com/definition/natural-language-processing-NLP
https://opennlp.apache.org/docs/

Cornerstone books: https://www.manning.com/books/taming-text

684 questions
3
votes
1 answer

opennlp vs corenlp : Market reach - popularity

I am doing a comparative study of open source NLP tools and have an idea of the features/services of the OpenNLP and CoreNLP engines. Recently, I see that no contributions have been made to the OpenNLP forum, whereas the CoreNLP forum is still going…
ShreeVidhya
  • 123
  • 9
3
votes
1 answer

openNLP categorize content return always first category

I'm testing the OpenNLP library to automate content categorization, but I'm having trouble. I'm using this code, and it always returns the first category in my training data when I pass in a full article from any news site. …
Emrah Mehmedov
  • 1,492
  • 13
  • 28
3
votes
2 answers

How to get training dataset of OpenNLP models?

I am using the following models of OpenNLP: en-parser-chunking.bin en-ner-person.bin en-ner-location.bin en-ner-organization.bin I want to append my data in the training dataset on which these models are trained. So please tell me from where I can…
3
votes
1 answer

Find space separated names using Apache OpenNLP

I am using the NER of Apache OpenNLP. I have successfully trained it on my custom data. While using the name finder, I split the given string on white space and pass the string array as given below. NameFinderME nameFinder = new…
Hari Ram
  • 317
  • 4
  • 21
3
votes
0 answers

Build custom named entity recognition (NLP) models

I am trying to extract names of people from text using OpenNLP in R. However, whenever I use Indian names, the model fails to detect them. Hence I understood that I need to build a custom model. I have built my own en-ner-customperson.bin…
Hardik Gupta
  • 4,700
  • 9
  • 41
  • 83
3
votes
0 answers

Determine what tree bank type can come next

I am using Apache NLP and its POSTaggerME. I have it breaking down words into their Penn Treebank tag set values. Is there any functionality out there (doesn't have to be in Apache NLP) that lets you know what kind of word can come next using the…
user489041
  • 27,916
  • 55
  • 135
  • 204
3
votes
1 answer

getType() in opennlp.tools.util.Span class?

I am using the OpenNLP opennlp.tools.chunker.ChunkerME implementation for finding chunks. In this class I am calling the chunkAsSpans(..) method, which returns Span[]. This Span instance has a getType() getter method which returns types like NP, VP…
Sarang
  • 547
  • 8
  • 20
3
votes
1 answer

Creating and training a model for OpenNlp using BRAT?

I may need to create a custom training set for OpenNLP, and this will require me to manually annotate a lot of entries. To make things easier, a GUI solution may be the best idea (manually writing annotation tags is not cool), and I've just…
StepTNT
  • 3,867
  • 7
  • 41
  • 82
3
votes
7 answers

How to implement a bot engine like Wit.ai for an on-premise solution?

I want to build a chatbot for a customer service application. I tried SaaS services like Wit.Ai, Motion.Ai, Api.Ai, LUIS.ai etc. These cognitive services find the "intent" and "entities" when trained with the typical interactions model. I need to…
Mac
  • 497
  • 5
  • 22
3
votes
1 answer

How to train Chunker in Opennlp?

I need to train the Chunker in OpenNLP to classify the training data as a noun phrase. How do I proceed? The online documentation does not explain how to do it from within a program, without the command line. It says to use…
zoozoofreak
  • 65
  • 1
  • 11
3
votes
2 answers

Error in OpenNLP package - dataframe coercing

I am trying to run a basic sentence annotation function and I keep running into the same error. The code I tried to use is: s <- as.String(cleandata) #cleandata is my data.It is a character class. sent_ann <- Maxent_Sent_Token_Annotator() a2 <-…
3
votes
1 answer

extract NP-VP-NP from Stanford dependency parse tree

I need to extract triplets of the form NP-VP-NP from the dependency parse tree produced as the output of lexicalized parsing in Stanford Parser. What's the best way to do this? E.g., if the parse tree is as follows: (ROOT (S (S (NP (NNP…
Sonika
  • 105
  • 7
3
votes
1 answer

Creating a simple concept graph from unstructured text using NLP techniques

I need to parse unstructured text and convert relevant concepts into a format such that all the triplets can be merged to form a graph. E.g., if I have 2 sentences like A improves B and B improves C, I should be able to create a graph like A ---> B…
Sonika
  • 105
  • 7
3
votes
1 answer

Named Entity Extraction in Elasticsearch

I am new to Elasticsearch. I am exploring the possibility of extracting entities from the content and indexing them in Elasticsearch. I tried to install and map the OpenNLP plugin in Elasticsearch but ran into issues like no handler class found etc. I…
pmb.hyd
  • 41
  • 1
  • 4
3
votes
1 answer

How to determine if a sentence is a statement using OpenNLP or any other library?

I am wondering if it's possible to determine if a sentence is a Question or a Statement using Apache OpenNLP or any other library? If so, I'm looking for some pointers on how to achieve this.