Questions tagged [nltk]

The Natural Language Toolkit (NLTK) is a Python library for computational linguistics. It is currently available for Python versions 2.7 and 3.2+.

NLTK includes many common natural language processing tools, including a tokenizer, a chunker, a part-of-speech (POS) tagger, a stemmer, a lemmatizer, and various classifiers such as Naive Bayes and Decision Trees. In addition to these tools, NLTK provides built-in access to many common corpora, including the Brown Corpus, Reuters, and WordNet. The NLTK corpora collection also includes a few non-English corpora in Portuguese, Polish and Spanish.
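As a rough illustration of a few of the tools listed above, here is a minimal sketch. It assumes NLTK is installed, and it deliberately uses only components that do not rely on downloaded data (`wordpunct_tokenize`, `PorterStemmer`, `NaiveBayesClassifier`); other tools such as `word_tokenize` and the POS tagger need extra data packages fetched with `nltk.download()`. The sample sentence and the `contains_good` feature name are made up for the example.

```python
from nltk.tokenize import wordpunct_tokenize
from nltk.stem import PorterStemmer
from nltk.classify import NaiveBayesClassifier

# Tokenize with the regex-based tokenizer (no corpus download needed).
text = "The cats are running quickly."
tokens = wordpunct_tokenize(text)
# tokens == ['The', 'cats', 'are', 'running', 'quickly', '.']

# Reduce each token to its stem with the Porter algorithm.
stemmer = PorterStemmer()
stems = [stemmer.stem(token) for token in tokens]

# Train a toy Naive Bayes classifier on hand-made feature dictionaries.
# Each training item is a (features, label) pair; 'contains_good' is a
# hypothetical feature invented for this sketch.
train_data = [
    ({"contains_good": True}, "pos"),
    ({"contains_good": False}, "neg"),
]
classifier = NaiveBayesClassifier.train(train_data)

# classify() takes a single feature dictionary, not a list of them.
label = classifier.classify({"contains_good": True})
print(tokens, stems, label)
```

Note that `classify()` expects one feature dictionary at a time; to label several items, `classify_many()` accepts a list of feature dictionaries.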

The book Natural Language Processing with Python - Analyzing Text with the Natural Language Toolkit, by Steven Bird, Ewan Klein, and Edward Loper, is freely available online under the Creative Commons Attribution Noncommercial No Derivative Works 3.0 US Licence. A citable paper, NLTK: The Natural Language Toolkit, was first published in 2003 and again in 2006 so that researchers can acknowledge the toolkit's contribution to ongoing research in computational linguistics.

NLTK is currently distributed under an Apache version 2.0 licence.

7139 questions

5 answers

What to download in order to make nltk.tokenize.word_tokenize work?

I am going to use nltk.tokenize.word_tokenize on a cluster where my account has a very limited space quota. At home, I downloaded all nltk resources with nltk.download() but, as I found out, it takes ~2.5GB. This seems like a bit of overkill to me. Could you…

python nltk

asked May 08 '16 at 14:49

petrbel

4 answers

AttributeError: 'list' object has no attribute 'copy'

I have the following code snippet classifier = NaiveBayesClassifier.train(train_data) #classifier.show_most_informative_features(n=20) results = classifier.classify(test_data) and the error shows in the following line results =…

python list nltk

asked May 05 '16 at 20:02

Amr Ragab

5 answers

How to use word_tokenize in a data frame

I have recently started using the nltk module for text analysis. I am stuck at a point. I want to use word_tokenize on a dataframe, so as to obtain all the words used in a particular row of the dataframe. data example: text 1. This is a…

python pandas nltk

asked Oct 13 '15 at 08:44

eclairs

6 answers

Text mining with PHP

I'm doing a project for a college class I'm taking. I'm using PHP to build a simple web app that classifies tweets as "positive" (or happy) and "negative" (or sad) based on a set of dictionaries. The algorithm I'm thinking of right now is Naive Bayes…

php nlp data-mining nltk weka

asked May 06 '10 at 17:17

garyc40

3 answers

nltk NaiveBayesClassifier training for sentiment analysis

I am training the NaiveBayesClassifier in Python using sentences, and it gives me the error below. I do not understand what the error might be, and any help would be good. I have tried many other input formats, but the error remains. The code given…

python nlp nltk sentiment-analysis textblob

asked Dec 29 '13 at 17:00

student001

14 answers

NLTK fails to find the Java executable

I am using NLTK's nltk.tag.stanford, which needs to call the java executable. I set JAVAHOME to C:\Program Files\Java\jdk1.6.0_25, where my jdk is installed, but when I run the program I get the error "NLTK was unable to find the java executable! Use…

java python tags config nltk

asked Sep 13 '11 at 15:46

Thomas Chu

8 answers

How do I find the frequency count of a word in English using WordNet?

Is there a way to find the frequency of the usage of a word in the English language using WordNet or NLTK using Python? NOTE: I do not want the frequency count of a word in a given input file. I want the frequency count of a word in general based on…

python nltk wordnet

asked May 08 '11 at 16:26

Apps

1 answer

nltk wordpunct_tokenize vs word_tokenize

Does anyone know the difference between nltk's wordpunct_tokenize and word_tokenize? I'm using nltk=3.2.4 and there's nothing on the doc string of wordpunct_tokenize that explains the difference. I couldn't find this info either in the documentation…

python nltk

asked May 08 '18 at 18:25

tsando

7 answers

NLTK - AttributeError: module 'nltk' has no attribute 'data'

I used nltk in my code for a few days, but now, when I try to import nltk, I get the error: File "C:\Users\Nada\Anaconda\lib\site-packages\nltk\corpus\reader\plaintext.py", line 42, in PlaintextCorpusReader…

python-3.x import nltk

asked Aug 20 '17 at 19:31

user8451312

6 answers

Extracting all Nouns from a text file using nltk

Is there a more efficient way of doing this? My code reads a text file and extracts all Nouns. import nltk File = open(fileName) #open file lines = File.read() #read all lines sentences = nltk.sent_tokenize(lines) #tokenize sentences nouns = []…

python nltk

asked Nov 07 '15 at 20:54

Rakesh Adhikesavan

7 answers

How to identify the subject of a sentence?

Can Python + NLTK be used to identify the subject of a sentence? From what I have learned so far, a sentence can be broken into a head and its dependents. For example, "I shot an elephant". In this sentence, I and elephant are dependents to…

python nlp nltk

asked Feb 19 '15 at 22:38

singhalc

6 answers

downloading error using nltk.download()

I am experimenting with the NLTK package using Python. I tried to download NLTK using nltk.download(). I got this kind of error message. How can I solve this problem? Thanks. The system I used is Ubuntu installed under VMware. The IDE is Spyder. After using…

python python-2.7 ubuntu nltk spyder

asked Dec 26 '14 at 14:35

user288609

2 answers

What would cause WordNetCorpusReader to have no attribute LazyCorpusLoader?

I've got a short function to check whether a word is a real word by comparing it to the WordNet corpus from the Natural Language Toolkit. I'm calling this function from a thread that validates txt files. When I run my code, the first time the…

python multithreading exception attributes nltk

asked Dec 11 '14 at 22:11

Cecilia

4 answers

Semantic Role Labeling using NLTK

I have a list of sentences and I want to analyze every sentence and identify the semantic roles within that sentence. How do I do that? I came across the PropBankCorpusReader within the NLTK module that adds semantic labeling information to the Penn…

python nltk semantic-markup

asked Dec 14 '13 at 19:13

Prahalad Deshpande

1 answer

NLTK for Persian

How can I use NLTK functions for Persian? For example, 'concordance'. When I use 'concordance', the answer is 'not match', even though the concordance parameter is in my text. The input is very simple. It consists of "hello سلام". When parameter…

python nlp nltk

asked Jul 16 '13 at 19:04

ikj
