Questions tagged [stanford-nlp]

Questions on the open source natural language processing software from the Stanford University NLP Group, in Java, Python, and C, including Stanford CoreNLP, Stanza, and GloVe.

The Stanford NLP Group at Stanford University makes several pieces of open source software available. Stanford CoreNLP provides a suite of Java tools for statistical natural language processing. It includes the Stanford Parser, with phrase structure and dependency parsers for English, Chinese, German, Arabic, French, and Spanish; the Stanford part-of-speech (POS) tagger for these languages; Stanford NER for named-entity recognition; and several other libraries for NLP tasks. Stanza is a new Python library that covers tokenization, POS tagging, and dependency parsing for over 60 human languages using Universal Dependencies representations, as well as NER and sentiment analysis. GloVe is a widely used tool written in C for neural word vectors, accompanied by several sets of word vectors for English.
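As a rough illustration of working with the GloVe word vectors mentioned above, the sketch below reads vectors in the plain-text layout the pretrained files use (one token per line followed by space-separated floats) and compares two words by cosine similarity. The function names `load_vectors` and `cosine` and the three toy vectors are hypothetical and invented for this example; it also assumes that tokens contain no spaces.

```python
import io
import math


def load_vectors(handle):
    """Read GloVe-style text vectors: a token, then its float components."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy three-dimensional vectors, invented for illustration only.
# With a real file, pass open(path, encoding="utf-8") instead.
sample = io.StringIO(
    "king 0.5 0.1 0.3\n"
    "queen 0.4 0.2 0.3\n"
    "apple -0.2 0.9 0.0\n"
)
vectors = load_vectors(sample)
similarity = cosine(vectors["king"], vectors["queen"])
```

Real pretrained files are large, so in practice it may be worth loading only the vocabulary you need rather than the whole file.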

Licences

All of the tools are open source, but licensing varies. The Java tools are licensed under the GNU General Public License (v2 or later); commercial licences are also available from the Stanford NLP Group. Stanza and GloVe are licensed under the Apache License (v2).

3304 questions
184
votes
10 answers

Java Stanford NLP: Part of Speech labels?

The Stanford NLP, demo'd here, gives an output like this: Colorless/JJ green/JJ ideas/NNS sleep/VBP furiously/RB ./. What do the Part of Speech tags mean? I am unable to find an official list. Is it Stanford's own system, or are they using…
Nick Heiner
  • 119,074
  • 188
  • 476
  • 699
99
votes
18 answers

How to use Stanford Parser in NLTK using Python

Is it possible to use Stanford Parser in NLTK? (I am not talking about Stanford POS.)
ThanaDaray
  • 1,693
  • 5
  • 22
  • 28
93
votes
3 answers

How to train the Stanford Parser with Genia Corpus?

I am having some problems creating a new model for the Stanford Parser. I have also downloaded the latest version from Stanford: http://nlp.stanford.edu/software/lex-parser.shtml And here, Genia Corpus in 2 formats, xml and ptb (Penn Treebank). Stanford…
nathan
  • 987
  • 6
  • 7
33
votes
7 answers

NLTK vs Stanford NLP

I have recently started to use the NLTK toolkit for creating a few solutions using Python. I hear a lot of community activity regarding the use of Stanford NLP. Can anyone tell me the difference between NLTK and Stanford NLP? Are they two different libraries?…
RData
  • 959
  • 1
  • 13
  • 33
30
votes
10 answers

Stanford nlp for python

All I want to do is find the sentiment (positive/negative/neutral) of any given string. On researching I came across Stanford NLP. But sadly it's in Java. Any ideas on how I can make it work for Python?
90abyss
  • 7,037
  • 19
  • 63
  • 94
29
votes
3 answers

How can a tree be encoded as input to a neural network?

I have a tree, specifically a parse tree with tags at the nodes and strings/words at the leaves. I want to pass this tree as input into a neural network all the while preserving its structure. Current approach Assume we have some dictionary of words…
28
votes
12 answers

How can I split a text into sentences using the Stanford parser?

How can I split a text or paragraph into sentences using Stanford parser? Is there any method that can extract sentences, such as getSentencesFromString() as it's provided for Ruby?
S Gaber
  • 1,536
  • 7
  • 24
  • 43
28
votes
3 answers

Ease of use: Stanford CoreNLP vs. OpenNLP

I am looking to use a suite of NLP tools for a personal project, and I was wondering whether Stanford's CoreNLP is easier to use or OpenNLP. Or is there another free package you would recommend? I haven't really done any NLP before, so I am looking for…
Pratik Thaker
  • 637
  • 2
  • 10
  • 18
28
votes
4 answers

How to Train GloVe algorithm on my own corpus

I tried to follow this. But somehow I wasted a lot of time and ended up with nothing useful. I just want to train a GloVe model on my own corpus (~900Mb corpus.txt file). I downloaded the files provided in the link above and compiled it using cygwin…
Codir
  • 311
  • 1
  • 3
  • 7
28
votes
3 answers

Is it possible to train Stanford NER system to recognize more named entities types?

I'm using some NLP libraries now (Stanford and NLTK). For Stanford, I saw the demo part but just want to ask if it is possible to use it to identify more entity types. So currently the Stanford NER system (as the demo shows) can recognize entities as…
JudyJiang
  • 2,207
  • 6
  • 27
  • 47
27
votes
6 answers

Extract list of Persons and Organizations using Stanford NER Tagger in NLTK

I am trying to extract a list of persons and organizations using the Stanford Named Entity Recognizer (NER) in Python NLTK. When I run: from nltk.tag.stanford import NERTagger st =…
user1680859
  • 1,160
  • 2
  • 24
  • 40
26
votes
3 answers

Maven fails to download CoreNLP models

When building the sample application from the Stanford CoreNLP website, I ran into a curious exception: Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger…
Jonny
  • 1,453
  • 16
  • 25
26
votes
3 answers

How to detect that two sentences are similar?

I want to compute how similar two arbitrary sentences are to each other. For example: A mathematician found a solution to the problem. The problem was solved by a young mathematician. I can use a tagger, a stemmer, and a parser, but I don’t…
SahelSoft
  • 615
  • 2
  • 9
  • 22
24
votes
1 answer

pronoun resolution backwards

The usual coreference resolution works in the following way: Provided "The man likes math. He really does." it figures out that "he" refers to "the man". There are plenty of tools to do this. However, is there a way to do it backwards? For…
ytrewq
  • 3,670
  • 9
  • 42
  • 71
23
votes
3 answers

Training n-gram NER with Stanford NLP

Recently I have been trying to train n-gram entities with Stanford CoreNLP. I have followed this tutorial: http://nlp.stanford.edu/software/crf-faq.shtml#b With this, I am able to specify only unigram tokens and the class each belongs to.…