Questions tagged [nlp]

Natural language processing (NLP) is a subfield of artificial intelligence that involves transforming or extracting useful information from natural language data. Methods include machine-learning and rule-based approaches. It is often regarded as the engineering arm of Computational Linguistics.

NOTE: If you want to use this tag for a question that does not directly concern implementation, consider posting on Data Science or Artificial Intelligence instead; otherwise you're probably off-topic. Please choose one site only and do not cross-post to more than one. See "Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site?" (tl;dr: no).

NLP tasks

Beginner books on Natural Language Processing

Popular software packages

20185 questions
6
votes
0 answers

How do I add a new dictionary database to cTAKES

How do I add a new database to the cTAKES pipeline to perform lookup from? How do I specify what columns to look up and how to annotate the text with the returned hits? I have gone through the DictionaryLookupAnnotatorDB.xml and LookupDesc_Db.xml…
supriyo_basak
  • 505
  • 1
  • 7
  • 24
6
votes
3 answers

English verb inflector

Does anybody know of an English verb inflector that I can use on a lexicon of verbs (in present-participle) that can give me other inflected forms of the verbs? For example: I give it I get ========= …
Ken Bloom
  • 57,498
  • 14
  • 111
  • 168
6
votes
2 answers

How to "update" an existing Named Entity Recognition model - rather than creating from scratch?

Please see the tutorial steps for OpenNLP Named Entity Recognition (link to tutorial). I am using the "en-ner-person.bin" model found here. In the tutorial, there are instructions on training and creating a new model. Is there any way to "update"…
sky
  • 2,531
  • 4
  • 17
  • 15
6
votes
1 answer

How to implement category-based text tagging using WordNet or something related to WordNet?

How can I tag text by word category using WordNet (with Java as the interface)? Example: consider the sentences: 1) Computers need a keyboard, monitor, and CPU to work. 2) An automobile uses gears and a clutch. Now my objective is, the example sentences have…
Ever Think
  • 683
  • 8
  • 22
6
votes
3 answers

Extracting Important words from a sentence using Node

I admit that I haven't searched extensively in the SO database. I tried reading about the natural npm package, but it doesn't seem to provide the feature. I would like to know whether the requirement below is possible. I have a database that has a list of…
Vaya
  • 560
  • 6
  • 20
6
votes
1 answer

How to rank features by their importance in a Weka classifier?

I used Weka to successfully build a classifier. I would now like to evaluate how effective or important my features are. For this I use AttributeSelection, but I don't know how to output the different features with their corresponding importance. I…
6
votes
2 answers

Python and NLTK: How to analyze sentence grammar?

I have this code which should show the syntactic structure of the sentence according to defined grammar. However it is returning an empty []. What am I missing or doing wrong? import nltk grammar = nltk.parse_cfg(""" S -> NP VP PP -> P NP NP ->…
Helena
  • 921
  • 1
  • 15
  • 24
6
votes
1 answer

How to download datasets for sklearn? - python

In NLTK there is an nltk.download() function to download the datasets that come with the NLP suite. In sklearn, the documentation talks about loading datasets (http://scikit-learn.org/stable/datasets/) and fetching data from http://mldata.org/ but for the…
alvas
  • 115,346
  • 109
  • 446
  • 738
6
votes
1 answer

Stanford CoreNLP remove/stop red information print outs

I'm using Stanford's CoreNLP Java API, and while running it prints out information in red. It just fills up the command line when I don't want to see it. Is there any way of disabling this feature? Example of the red info lines: Searching for…
Greg
  • 754
  • 9
  • 18
6
votes
1 answer

OpenNLP: foreign names do not get recognized

I just started using OpenNLP to recognize names. I am using the model (en-ner-person.bin) that comes with OpenNLP. I noticed that while it recognizes US, UK, and European names, it fails to recognize Indian or Japanese names. My questions are (1)…
Shirish Kumar
  • 1,532
  • 17
  • 23
6
votes
2 answers

Multi-label classification for large dataset

I am solving a multilabel classification problem. I have about 6 million rows to process, each of which is a huge chunk of text. They are tagged with multiple tags in a separate column. Any advice on what scikit libraries can help me scale up my…
6
votes
2 answers

How to perform Paragraph boundary detection in NLP frameworks?

I am working on extracting names of people from various ads appearing in English newspapers. However, I have noticed that I need to identify the boundary of an ad before extracting the names present in it, since I need only the first occurring…
kiran
  • 339
  • 4
  • 18
6
votes
1 answer

Getting additional information (Active/Passive, Tenses ...) from a Tagger

I'm using the Stanford Tagger for determining the Parts of Speech. However, I want to get more information out of the text. Is there a possibility to get further information like the tense of the sentence or if it is in active/passive? So far, I'm…
David Müller
  • 5,291
  • 2
  • 29
  • 33
6
votes
3 answers

How to Normalize Names

I am using pandas dataframes and I have data where I have customers per company. However, the company titles vary slightly but ultimately affect the data. Example: Company Customers AAAB 1,000 AAAB Inc. 900 The AAAB Inc. 20 AAAB the INC …
Alexis_Kiwis
  • 101
  • 2
  • 6
6
votes
3 answers

Converting an English Statement into a Questi0n

(Apologies for the title. Stack Overflow doesn't allow the word "Question" in titles.) How would one go about writing an algorithm to convert an English statement into a question? Where would one even begin? For example: "The ingredients for an…
nobillygreen
  • 1,548
  • 5
  • 19
  • 27