Questions tagged [nltk]

The Natural Language Toolkit (NLTK) is a Python library for computational linguistics. It is currently available for Python versions 2.7 and 3.2+.

NLTK includes many common natural language processing tools, including a tokenizer, a chunker, a part-of-speech (POS) tagger, a stemmer, a lemmatizer, and various classifiers such as Naive Bayes and decision trees. In addition to these tools, NLTK provides many common corpora, including the Brown Corpus, Reuters, and WordNet. The NLTK corpora collection also includes a few non-English corpora in Portuguese, Polish, and Spanish.
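As a rough illustration of two of the tools listed above, the following sketch tokenizes a sentence and stems each token. It assumes NLTK is installed, and it uses `RegexpTokenizer` and `PorterStemmer` because both are rule-based and should not need any corpus download. The sample sentence and variable names are made up for this example.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import RegexpTokenizer

# Both classes are rule-based, so no NLTK data package is needed here.
tokenizer = RegexpTokenizer(r"\w+")  # keep runs of word characters, drop punctuation
stemmer = PorterStemmer()

text = "The poodle read an article."  # hypothetical sample sentence

tokens = tokenizer.tokenize(text)
stems = [stemmer.stem(token) for token in tokens]

print(tokens)
print(stems)
```

A stemmer strips suffixes by rule and does not guarantee dictionary words, which is why "poodle" comes back as "poodl". When real word forms matter, a lemmatizer is usually the better fit.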

The book "Natural Language Processing with Python - Analyzing Text with the Natural Language Toolkit" by Steven Bird, Ewan Klein, and Edward Loper is freely available online under the Creative Commons Attribution Noncommercial No Derivative Works 3.0 US Licence. A citable paper, "NLTK: the natural language ToolKit", was first published in 2003 and again in 2006 so that researchers can acknowledge NLTK's contribution to ongoing research in computational linguistics.

NLTK is currently distributed under the Apache 2.0 licence.

7139 questions

7 answers

What is the best stemming method in Python?

I tried all the nltk methods for stemming, but they give me weird results with some words. Examples: it often cuts the ends of words when it shouldn't: poodle => poodl, article => articl; or it doesn't stem very well: easily and easy are not stemmed in…

python nltk stemming

asked Jul 09 '14 at 07:12

PeYoTlL

4 answers

Use of PunktSentenceTokenizer in NLTK

I am learning Natural Language Processing using NLTK. I came across code using PunktSentenceTokenizer, whose actual use in that code I cannot understand. The code is: import nltk from nltk.corpus import state_union from nltk.tokenize…

python nlp nltk

asked Feb 08 '16 at 16:55

arqam

10 answers

Python Untokenize a sentence

There are so many guides on how to tokenize a sentence, but I didn't find any on how to do the opposite. import nltk words = nltk.word_tokenize("I've found a medicine for my disease.") The result I get is: ['I', "'ve", 'found', 'a', 'medicine',…

python python-2.7 nltk

asked Feb 22 '14 at 00:42

Brana

5 answers

NLTK and language detection

How do I detect what language a text is written in using NLTK? The examples I've seen use nltk.detect, but after installing NLTK on my Mac, I cannot find this package.

python nlp nltk detection

asked Jul 05 '10 at 21:30

niklassaers

4 answers

NLTK WordNet Lemmatizer: Shouldn't it lemmatize all inflections of a word?

I'm using the NLTK WordNet Lemmatizer for a Part-of-Speech tagging project by first modifying each word in the training corpus to its stem (in place modification), and then training only on the new corpus. However, I found that the lemmatizer is not…

python nlp nltk

asked Aug 27 '14 at 18:10

sanjeev mk

8 answers

How to get synonyms from nltk WordNet Python

WordNet is great, but I'm having a hard time getting synonyms in nltk. If you search similar to for the word 'small' like here, it shows all of the synonyms. Basically I just need to know the following: wn.synsets('word')[i].option() Where option…

python nltk wordnet

asked Oct 08 '13 at 21:20

user2758113

4 answers

How to tweak the NLTK sentence tokenizer

I'm using NLTK to analyze a few classic texts and I'm running into trouble tokenizing the text by sentence. For example, here's what I get for a snippet from Moby Dick: import nltk sent_tokenize =…

python nlp nltk

asked Dec 30 '12 at 23:59

Chris Wilson

7 answers

How do I do dependency parsing in NLTK?

Going through the NLTK book, it's not clear how to generate a dependency tree from a given sentence. The relevant section of the book: sub-chapter on dependency grammar gives an example figure but it doesn't show how to parse a sentence to come up…

python nlp grammar nltk

asked Sep 16 '11 at 10:26

MrD

6 answers

How to use spacy's lemmatizer to get a word into basic form

I am new to spacy and I want to use its lemmatizer function, but I don't know how to use it. For example, I would like to pass in a string of words and get back the string with the basic form of the words. Examples: 'words' => 'word', 'did' => 'do'. Thank you.

python nltk spacy lemmatization

asked Aug 04 '16 at 09:04

yi wang

2 answers

Is there a corpus of English words in nltk?

Is there any way to get a list of English words in the Python nltk library? I tried to find one, but the only thing I have found is wordnet from nltk.corpus. But based on the documentation, it does not have what I need (it finds synonyms for a word). I know…

nltk

asked Feb 05 '15 at 08:48

Salvador Dali

3 answers

How do I create my own NLTK text from a text file?

I'm a Literature grad student, and I've been going through the O'Reilly book in Natural Language Processing (nltk.org/book). It looks incredibly useful. I've played around with all the example texts and example tasks in Chapter 1, like concordances.…

python nltk

asked May 06 '12 at 00:13

Jonathan

2 answers

How to extract numbers (along with comparison adjectives or ranges)

I am working on two NLP projects in Python, and both have a similar task to extract numerical values and comparison operators from sentences, like the following: "... greater than $10 ... ", "... weight not more than 200lbs ...", "... height in 5-7…

python regex nlp nltk spacy

asked Jul 16 '17 at 07:19

svfat

3 answers

Python NLTK pos_tag not returning the correct part-of-speech tag

Having this: text = word_tokenize("The quick brown fox jumps over the lazy dog") And running: nltk.pos_tag(text) I get: [('The', 'DT'), ('quick', 'NN'), ('brown', 'NN'), ('fox', 'NN'), ('jumps', 'NNS'), ('over', 'IN'), ('the', 'DT'), ('lazy',…

python machine-learning nlp nltk pos-tagger

asked Jun 13 '15 at 16:52

faceoff

2 answers

Finding Proper Nouns using NLTK WordNet

Is there any way to find proper nouns using NLTK WordNet? I.e., can I tag possessive nouns using NLTK WordNet?

python nltk wordnet

asked Jul 16 '13 at 06:57

Backue

5 answers

Convert words between verb/noun/adjective forms

I would like a Python library function that translates/converts across different parts of speech. Sometimes it should output multiple words (e.g. "coder" and "code" are both nouns from the verb "to code"; one is the subject, the other is the object) #…

python nlp nltk wordnet

asked Jan 23 '13 at 21:01

sam boosalis
