Questions tagged [nlp]

Natural language processing (NLP) is a subfield of artificial intelligence that involves transforming or extracting useful information from natural language data. Methods include machine-learning and rule-based approaches. It is often regarded as the engineering arm of Computational Linguistics.

NOTE: If you want to use this tag for a question that does not directly concern implementation, consider posting on Data Science or Artificial Intelligence instead; otherwise you're probably off-topic. Please choose one site only and do not cross-post to more than one; see "Is cross-posting a question on multiple Stack Exchange sites permitted if the question is on-topic for each site?" (tl;dr: no).

NLP tasks

Beginner books on Natural Language Processing

Popular software packages

20185 questions
6 votes, 1 answer

How to ignore certain characters while doing diff in google-diff-match-patch?

I'm using google-diff-match-patch to compare plain text in natural languages. How can I make google-diff-match-patch ignore certain characters? (Some tiny differences which I don't care about.) For example, given text1: give me a cup of bean-milk.…
weakish
  • 28,682
  • 5
  • 48
  • 60
6 votes, 1 answer

How to use Wordnet in SQL

How can I use Wordnet in an SQL database? Does it exist anywhere? Can someone give me a step-by-step procedure?
harsha
  • 131
  • 2
  • 7
6 votes, 2 answers

most efficient edit distance to identify misspellings in names?

Algorithms for edit distance give a measure of the distance between two strings. Question: which of these measures would be most relevant to detect two different persons' names which are actually the same? (different because of a misspelling). The…
seinecle
  • 10,118
  • 14
  • 61
  • 120
6 votes, 2 answers

How does Google recognise 2 words without spaces?

I want to understand how Google handles a missing space between 2 words. For example, there are 2 words, word1 and word2. If I write 'word1word2' in the search box, it either asks whether I mean 'word1 word2' or just understands to look for 'word1 word2'. Any information…
John
  • 3,821
  • 2
  • 18
  • 25
6 votes, 3 answers

python nltk keyword extraction from sentence

"First thing we do, let's kill all the lawyers." - William Shakespeare Given the quote above, I would like to pull out "kill" and "lawyers" as the two prominent keywords to describe the overall meaning of the sentence. I have extracted the…
waigani
  • 3,570
  • 5
  • 46
  • 71
6 votes, 3 answers

How to implement a knowledge graph

I'm looking to implement something like Google's direct answers, which use the Knowledge Graph. Are there any useful resources I can read? Also, where can I find data for that?
Lisa
  • 3,121
  • 15
  • 53
  • 85
6 votes, 2 answers

Tokenization, and indexing with Lucene, how to handle external tokenize and part-of-speech?

I would like to build my own (here I am not sure which one) tokenizer (from Lucene's point of view) or my own analyzer. I have already written code that tokenizes my documents into words (as a List<String> or a List<Word>, where Word is a class with only…
user1340802
  • 1,157
  • 4
  • 17
  • 36
6 votes, 1 answer

Negating sentences using POS-tagging

I'm trying to find a way to negate sentences based on POS-tagging. Please consider: include_once 'class.postagger.php'; function negate($sentence) { $tagger = new PosTagger('includes/lexicon.txt'); $tags = $tagger->tag($sentence); foreach…
Pr0no
  • 3,910
  • 21
  • 74
  • 121
6 votes, 1 answer

Latent Semantic Analysis in Python discrepancy

I'm trying to follow the Wikipedia Article on latent semantic indexing in Python using the following code: documentTermMatrix = array([[ 0., 1., 0., 1., 1., 0., 1.], [ 0., 1., 1., 0., 0., 0., 0.], …
Jmjmh
  • 2,016
  • 1
  • 13
  • 11
6 votes, 3 answers

How to detect if an event/action occurred from a text?

I was wondering if there's an NLP/ML technique for this. Suppose I am given a set of sentences: I watched the movie. Heard the movie is great, have to watch it. Got the tickets for the movie. I am at the movie. If I have to assign a probability to…
excray
  • 2,738
  • 4
  • 33
  • 49
6 votes, 1 answer

Wordnet edit tree structure

I'm developing an application that uses the Wordnet conceptual hierarchy for its operation. I found that some words I need are missing in the database. Is there an API or tool, or any other way I can insert new words, edit the structure etc.? (I'm…
lahiru madhumal
  • 1,185
  • 2
  • 12
  • 30
6 votes, 1 answer

Using Sentiwordnet 3.0

I plan on using Sentiwordnet 3.0 for sentiment classification. Could someone clarify what the numbers associated with words in Sentiwordnet represent? For example, what does 5 in rank#5 mean? Also for POS what is the letter used to represent…
Amal Antony
  • 6,477
  • 14
  • 53
  • 76
6 votes, 2 answers

Splitting string containing letters and numbers not separated by any particular delimiter in PHP

Currently I am developing a web application to fetch the Twitter stream and trying to build natural language processing on my own. Since my data is from Twitter (limited to 140 characters), many words are shortened or, in this case, omitted…
akhy
  • 5,760
  • 6
  • 39
  • 60
5 votes, 3 answers

Algorithm to compare similarity of ideas (as strings)

Consider an arbitrary text box that records the answer to the question, what do you want to do before you die? Using a collection of response strings (max length 240), I'd like to somehow sort and group them and count them by idea (which may be just…
Kristian
  • 21,204
  • 19
  • 101
  • 176
5 votes, 1 answer

Find Synonyms for multi-word phrases

Is it possible for the Python library NLTK to suggest/create synonyms for groups of words? For example, for the word/group "main course" can I use NLTK to get the synonyms "main dish", "main meal", "dinner" etc.? Here's my code that works for single…
sazr
  • 24,984
  • 66
  • 194
  • 362