Questions tagged [opennlp]

Apache's libraries for natural language processing (NLP).

The Apache OpenNLP library is a machine-learning-based toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, parsing, and coreference resolution. These tasks are usually required to build more advanced text processing services. OpenNLP also includes maximum entropy and perceptron-based machine learning.

More about natural language processing:

Natural language processing (NLP) is the ability of a computer program to understand human language as it is spoken and written.

Apache OpenNLP is often used with Apache Flink (a distributed stream-processing framework).

Relevant links:

http://searchcontentmanagement.techtarget.com/definition/natural-language-processing-NLP
https://opennlp.apache.org/docs/

Cornerstone books: https://www.manning.com/books/taming-text

684 questions
5
votes
2 answers

Identify a list of items using Natural Language Processing

Is there a way for NLP parsers to identify a list? For example, "a tiger, a lion and a gorilla" should be identified as a list (I don't need it to be identified as a list of animals; just a list would be sufficient). My ultimate aim is to link a…
user6732360
5
votes
1 answer

installing R package openNLP in R

I am trying to install the openNLP package in R (Mac) and keep getting the following error message: > install.packages("openNLP") trying URL 'https://rweb.crmda.ku.edu/cran/bin/macosx/mavericks/contrib/3.2/openNLP_0.2-6.tgz' Content type…
Ashin Mukherjee
  • 267
  • 1
  • 4
  • 12
5
votes
1 answer

could not find function tagPOS

Trying to tag a sentence using openNLP. library(openNLP) str <- "this is a the first sentence." tagged_str <- tagPOS(str) Getting the following error: Error: could not find function "tagPOS" Any suggestions? Thanks.
user4359255
5
votes
1 answer

Does UIMA provide only a wrapper, or is it like Stanford CoreNLP and GATE?

Stanford CoreNLP and GATE provide various NLP operations such as NER and POS tagging. Some NLP operations, such as a tokenizer and the Snowball stemmer, are available as UIMA components. So, is UIMA comparable with Stanford CoreNLP/GATE…
Gaurav
  • 531
  • 1
  • 4
  • 15
5
votes
0 answers

Get next word (or POS) suggestion for a given sentence. Autocomplete a sentence

I have to implement an auto-suggestion feature in my desktop-based Java application. The requirement is as follows: a user will give a sentence as input and I have to return the next possible part of speech as a suggestion. E.g.: 1. UserInput: Mike wants…
thekosmix
  • 1,705
  • 21
  • 35
5
votes
2 answers

Training OpenNLP error

I am trying to train a named entity model using OpenNLP, but I am getting this error and don't know what is missing. I am new to OpenNLP; can anyone please help? I can provide the Train.txt file if needed. lineStream =…
Ashfaq
  • 197
  • 1
  • 2
  • 16
5
votes
0 answers

Error in Parts of Speech Tagging using openNLP

I have an Ubuntu Quantal 12.10 Server 64-bit instance. I am using openNLP for POS tagging of sentences, with a “Parallel Lapply setup”. It runs fine in the RStudio environment. But in the Ubuntu environment it is…
Siddharth
  • 51
  • 2
5
votes
2 answers

Remove stop words from the parsed content using OpenNLP

I have parsed the document using OpenNLP parser code provided in this link and I got the following output: (TOP (S (NP (NN Programcreek)) (VP (VBZ is) (NP (DT a) (ADJP (RB very) (JJ huge) (CC and) (JJ useful)) (NN website))))) From this I want to…
user2598214
  • 51
  • 1
  • 2
5
votes
1 answer

How to calculate probabilities from confusion matrices? need denominator, chars matrices

This paper contains confusion matrices for spelling errors in a noisy channel. It describes how to correct the errors based on conditional properties. The conditional probability computation is on page 2, left column. In footnote 4, page 2, left…
necromancer
  • 23,916
  • 22
  • 68
  • 115
4
votes
1 answer

How to implement incremental learning in NLP

We are building a system in which we would start with a very small initial amount of trained data. The job is to classify the incoming data (documents, in our case) into two categories: A and B. The data is a document, so the user needs to…
4
votes
2 answers

OpenNLP: Unable to locate the model file for Lemmatizer

Summary: Unable to find the model file used for Lemmatizer (english-lemmatizer.bin) Details: OpenNLP Tools Models appears to be a comprehensive repository for the various models used by the different components of the Apache OpenNLP library. …
Sandeep
  • 1,245
  • 1
  • 13
  • 33
4
votes
1 answer

OpenNLP classifier output

At the moment I'm using the following code to train a classifier model: final String iterations = "1000"; final String cutoff = "0"; InputStreamFactory dataIn = new MarkableFileInputStreamFactory(new…
Patrick
  • 331
  • 3
  • 18
4
votes
1 answer

Apache Open NLP vs NLTK

We have a Spring Boot application integrated with a Node.js and socket.io chat application, into which we want to integrate natural language processing. We are not getting any direction on which of these two, Apache OpenNLP or NLTK, would be a better choice for…
Sharanya K M
  • 1,805
  • 4
  • 23
  • 44
4
votes
1 answer

Apache OpenNLP: java.io.FileInputStream cannot be cast to opennlp.tools.util.InputStreamFactory

I am trying to build a custom NER using Apache OpenNLP 1.7. From the documentation available here, I have developed the following code: import java.io.BufferedOutputStream; import java.io.FileInputStream; import java.io.FileOutputStream; import…
Hardik Gupta
  • 4,700
  • 9
  • 41
  • 83
4
votes
3 answers

opennlp vs stanford nlptools vs berkeley

Hi, the aim is to parse a sizeable corpus like Wikipedia to generate the most probable parse tree and to perform named entity recognition. Which is the best library to achieve this in terms of performance and accuracy? Has anyone used more than one of the…
Sharmila
  • 1,637
  • 2
  • 23
  • 30