Questions tagged [nltk]

The Natural Language Toolkit (NLTK) is a Python library for computational linguistics. Earlier releases supported Python 2.7 and 3.2+; current releases require Python 3.

NLTK includes a large number of common natural language processing tools, including a tokenizer, a chunker, a part-of-speech (POS) tagger, a stemmer, a lemmatizer, and various classifiers such as Naive Bayes and decision trees. In addition to these tools, NLTK bundles many common corpora, including the Brown Corpus, Reuters, and WordNet. The corpora collection also includes a few non-English corpora in Portuguese, Polish, and Spanish.
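A minimal sketch of how these pieces fit together (assuming the punkt, averaged_perceptron_tagger, and wordnet resources have already been downloaded):

```python
import nltk
from nltk.stem import PorterStemmer, WordNetLemmatizer

# One-time resource downloads; models and corpora ship separately from the library.
# nltk.download('punkt')
# nltk.download('averaged_perceptron_tagger')
# nltk.download('wordnet')

sentence = "NLTK provides tokenizers, taggers, stemmers and lemmatizers."
tokens = nltk.word_tokenize(sentence)        # ['NLTK', 'provides', 'tokenizers', ...]
tagged = nltk.pos_tag(tokens)                # [('NLTK', 'NNP'), ('provides', 'VBZ'), ...]

stemmer = PorterStemmer()
lemmatizer = WordNetLemmatizer()
stems = [stemmer.stem(t) for t in tokens]           # e.g. 'provides' -> 'provid'
lemmas = [lemmatizer.lemmatize(t) for t in tokens]  # e.g. 'tokenizers' -> 'tokenizer'
```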

The book Natural Language Processing with Python - Analyzing Text with the Natural Language Toolkit by Steven Bird, Ewan Klein, and Edward Loper is freely available online under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 US licence. A citable paper, "NLTK: The Natural Language Toolkit", was first published in 2003 and again in 2006, so that researchers can acknowledge NLTK's contribution to ongoing research in computational linguistics.

NLTK is currently distributed under the Apache 2.0 licence.

7139 questions
2
votes
3 answers

Tokenize sentence into words in Python

I want to extract information from different sentences, so I'm using NLTK to divide each sentence into words. I'm using this code: words=[] for i in range(len(sentences)): words.append(nltk.word_tokenize(sentences[i])) words It works pretty…
Hermoine
  • 63
  • 7
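The pattern in the question is the usual one; a slightly tidier sketch (assuming sentences is a list of strings and the punkt models are installed) uses a list comprehension:

```python
import nltk
# nltk.download('punkt')  # tokenizer models, needed once

sentences = ["I want to extract information.", "Each sentence becomes a list of words."]
words = [nltk.word_tokenize(s) for s in sentences]
print(words[0])   # ['I', 'want', 'to', 'extract', 'information', '.']
```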
2
votes
0 answers

NLTK not getting imported in VS Code

I have just started learning NLP and for that purpose I installed nltk package using pip install nltk in the cmd terminal of VS Code. After I installed it, I tried importing it in the command line itself and I was successful but in the main window…
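This usually means the package was installed into a different interpreter than the one VS Code has selected. A quick check (paths are hypothetical, not from the question):

```python
import sys
print(sys.executable)   # should match the interpreter shown in VS Code's status bar

# If it does not, install into that exact interpreter from the integrated terminal, e.g.:
#   <path-to-that-python> -m pip install nltk
# Once installed into the selected interpreter, the import succeeds:
import nltk
print(nltk.__version__)
```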
2
votes
2 answers

Find all words in a sentence related to a keyword

I have the following text and want to isolate a part of the sentence related to a keyword, in this case keywords = ['pizza', 'chips']. text = "The pizza is great but the chips aren't the best" Expected Output: {'pizza': 'The pizza is…
Ali
  • 328
  • 2
  • 7
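One naive sketch (not the accepted answer): split the sentence on coordinating conjunctions and map each keyword to the clause that mentions it:

```python
import re

text = "The pizza is great but the chips aren't the best"
keywords = ['pizza', 'chips']

# Split on a few common coordinating conjunctions; a real solution would
# use parsing or chunking rather than this heuristic.
clauses = re.split(r'\s+(?:but|and|or)\s+', text)

related = {kw: clause for kw in keywords for clause in clauses if kw in clause}
print(related)
# {'pizza': 'The pizza is great', 'chips': "the chips aren't the best"}
```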
2
votes
1 answer

Why is the total number in the confusion matrix not the same as in the input data?

Why does the confusion matrix not have the same number of samples as the dataset? The dataset contains 7514 samples, but the total in the confusion matrix does not exceed 2000. Here is the code: import re import nltk nltk.download('stopwords') from nltk.corpus…
mino
  • 33
  • 5
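The usual explanation is that the confusion matrix is computed on the held-out test split, not on the whole dataset, so its cells sum to len(y_test). A self-contained sketch with synthetic data (the real question uses text features and stopword removal):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import confusion_matrix

# Synthetic stand-in for the 7514-sample dataset in the question.
X, y = make_classification(n_samples=7514, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

model = GaussianNB().fit(X_train, y_train)
cm = confusion_matrix(y_test, model.predict(X_test))

# The matrix only counts test samples, so both numbers below equal the
# test-set size (about a quarter of 7514), not the full dataset size.
print(cm.sum(), len(y_test))
```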
2
votes
1 answer

Error with count tags (nltk) in column dataframe

# Extracting definitions from different words in each sentence # Extracting from each row the NOUN, VERBS, NOUN plural text = data['Omschrijving_Skill_without_stopwords'].tolist() tagged_texts = pos_tag_sents(map(word_tokenize, text)) data['pos'] =…
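A small sketch of the same idea with a hypothetical column name, tagging each row and then counting tags per row:

```python
import pandas as pd
from nltk import pos_tag_sents, word_tokenize
# nltk.download('punkt'); nltk.download('averaged_perceptron_tagger')

# Hypothetical stand-in for data['Omschrijving_Skill_without_stopwords'].
data = pd.DataFrame({'text': ["python developer writes clean code",
                              "data analysts build daily reports"]})

data['pos'] = pos_tag_sents(map(word_tokenize, data['text'].tolist()))

# Count nouns per row: any tag starting with 'NN' (NN, NNS, NNP, NNPS).
data['noun_count'] = data['pos'].apply(lambda tags: sum(t.startswith('NN') for _, t in tags))
print(data[['text', 'noun_count']])
```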
2
votes
2 answers

How to check if a given English sentence contains only non-meaning words using Python?

I want to check in a Python program whether a given English sentence contains only non-meaning words. Return True if every word in the sentence has no meaning, e.g. sdfsdf sdf ssdf fsdf dsd sd. Return False if the sentence contains at least one word that has…
Rohit
  • 6,941
  • 17
  • 58
  • 102
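One rough sketch (assuming NLTK's English word list is an acceptable definition of "meaningful"): treat a sentence as non-meaning only if none of its tokens appear in the words corpus:

```python
from nltk.corpus import words
# nltk.download('words')  # English word list, needed once

english_vocab = {w.lower() for w in words.words()}

def all_non_meaning(sentence):
    """True only if no token is found in the English word list."""
    return all(tok.lower() not in english_vocab for tok in sentence.split())

print(all_non_meaning("sdfsdf sdf ssdf fsdf dsd sd"))   # True
print(all_non_meaning("sdfsdf hello fsdf"))             # False
```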
2
votes
1 answer

How to do chapter analysis from books imported from nltk.corpus.gutenberg.fileids()

I am a newbie using Python. Now I am doing natural language processing for a novel, and I chose to load the book from nltk.corpus.gutenberg.fileids(). I just use 'Sense and Sensibility'. Then I want to analyze each chapter. How to split the whole…
Freda Yu
  • 21
  • 1
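One way to sketch this (assuming the chapter headings in the plain-text file look like "CHAPTER 1" or "CHAPTER I"; the exact pattern should be checked against the raw text) is to split the raw string with a regular expression:

```python
import re
from nltk.corpus import gutenberg
# nltk.download('gutenberg')

raw = gutenberg.raw('austen-sense.txt')

# Assumed heading format: "CHAPTER" followed by Arabic or Roman numerals.
# Index 0 of the split is the front matter, so it is dropped.
chapters = re.split(r'CHAPTER [\dIVXLC]+', raw)[1:]

print(len(chapters))                 # number of chapters found
if chapters:
    print(len(chapters[0].split()))  # rough word count of chapter 1
```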
2
votes
1 answer

Getting synsets of custom hungarian wordnet dictionary with nltk

I am very new to NLP and I might be doing something wrong. I would like to work with a Hungarian text where I can get the synset/hyponym/hypernym of some selected words. I am working in Python. As Open Multilingual Wordnet does not have Hungarian…
hunsnowboarder
  • 170
  • 2
  • 18
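For languages that are included in the Open Multilingual Wordnet, the lookup is just the lang parameter; the question arises precisely because Hungarian is not among them. A sketch of the standard API for comparison:

```python
from nltk.corpus import wordnet as wn
# nltk.download('wordnet'); nltk.download('omw-1.4')

print(sorted(wn.langs()))        # languages bundled with OMW; 'hun' is not listed

# For an included language, synsets / hypernyms / hyponyms work like this:
for syn in wn.synsets('perro', lang='spa'):          # Spanish 'dog'
    print(syn.name(), [h.name() for h in syn.hypernyms()])
```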
2
votes
0 answers

Final Semester project about semantic analysis/information retrieval

I'm moving into my final year in my college's Engineering Computer Science department, and I wanted to do my graduation project on a topic related to information retrieval & semantic analysis. I've had my internship in those topics and I'm very…
Hady Elsahar
  • 2,121
  • 4
  • 29
  • 47
2
votes
1 answer

Python spell checker using NLTK

So I have this piece of code using the NLTK library: def autospell(text): spells = [spell(w) for w in (nltk.word_tokenize(text))] return " ".join(spells) train_data['Phrase'][:200].apply(autospell) And I got this error telling me that…
Arkan
  • 55
  • 1
  • 5
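The spell function in the snippet is not part of NLTK, so it presumably comes from a separate spell-checking package and has to be imported from there. A pure-NLTK sketch of the same idea, using the words corpus and edit distance (an illustration only, and very slow):

```python
import nltk
from nltk.corpus import words
from nltk.metrics import edit_distance
# nltk.download('punkt'); nltk.download('words')

vocab = set(words.words())

def autospell(text):
    """Replace each out-of-vocabulary token with the nearest dictionary word."""
    fixed = []
    for w in nltk.word_tokenize(text):
        if not w.isalpha() or w.lower() in vocab:
            fixed.append(w)
        else:
            # Brute-force nearest neighbour by edit distance: fine as a sketch,
            # far too slow to apply to thousands of phrases.
            fixed.append(min(vocab, key=lambda v: edit_distance(w.lower(), v)))
    return " ".join(fixed)

print(autospell("I realy like this moovie"))
```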
2
votes
2 answers

Deleting and updating a string and entity index in a text document for NER training data

I am trying to create a training dataset for NER recognition. For that, I have huge amounts of data that need to be tagged, with the unnecessary sentences removed. On removing an unnecessary sentence, the index positions must be updated. The other day I saw…
imhans33
  • 133
  • 11
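The core of the problem is bookkeeping: NER training data is commonly stored as (text, {"entities": [(start, end, label)]}), so deleting a span of text means shifting every entity offset that comes after it. A hedged sketch with a hypothetical helper (not the asker's code):

```python
def remove_span(text, entities, start, end):
    """Delete text[start:end] and shift the remaining entity offsets accordingly."""
    removed = end - start
    new_text = text[:start] + text[end:]
    new_entities = []
    for s, e, label in entities:
        if e <= start:                       # entirely before the cut: unchanged
            new_entities.append((s, e, label))
        elif s >= end:                       # entirely after the cut: shift left
            new_entities.append((s - removed, e - removed, label))
        # entities overlapping the removed span are dropped
    return new_text, new_entities

text = "Remove this filler sentence. John works at Acme."
entities = [(29, 33, "PERSON"), (43, 47, "ORG")]
print(remove_span(text, entities, 0, 29))
# ('John works at Acme.', [(0, 4, 'PERSON'), (14, 18, 'ORG')])
```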
2
votes
2 answers

Jython: ImportError: No module named multiarray

When I try to call a file and its method using Jython, it shows the following error, even though my NumPy, Python, and NLTK are correctly installed, and it works properly if I run it directly from the Python shell. File…
ninja123
  • 21
  • 1
  • 2
2
votes
2 answers

How to tokenize a string into consecutive pairs using Python?

My input is "I like to play basketball", and the output I am looking for is "I like", "like to", "to play", "play basketball". I have used NLTK word tokenize but that gives single tokens only. I have this type of statement in a huge database and…
Saurabh
  • 23
  • 3
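word_tokenize only yields single tokens; pairing them up is what nltk.bigrams (or nltk.ngrams with n=2) is for:

```python
import nltk
# nltk.download('punkt')

text = "I like to play basketball"
tokens = nltk.word_tokenize(text)

# nltk.bigrams yields consecutive token pairs; join each pair back into a string.
pairs = [" ".join(pair) for pair in nltk.bigrams(tokens)]
print(pairs)   # ['I like', 'like to', 'to play', 'play basketball']
```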
2
votes
1 answer

nltk_data installation gives RuntimeWarning

python3 -m nltk.downloader -d /usr/local/share/nltk_data all Upon running the above command in GCP, I face the following RuntimeWarning 'nltk.downloader' found in sys.modules after import of package 'nltk', but prior to execution of…
Tony Stark
  • 511
  • 2
  • 15
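The warning comes from running a submodule with -m after the package itself has been imported; it is generally harmless and the data still downloads. The same download can be done from inside Python, which avoids it (path as in the question):

```python
import nltk

# Equivalent to: python3 -m nltk.downloader -d /usr/local/share/nltk_data all
nltk.download('all', download_dir='/usr/local/share/nltk_data')
```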
2
votes
1 answer

NLTK doesn't lemmatize uppercase words

I'm trying to change plural words to singular in a string with a mix of uppercase and lowercase words, e.g. CARDBOARD BOXES, DIMENSIONS: 19cm H x 10cm W x 30cm D. I used the NLTK package to do so, but it only accepts lowercase strings and I don't want to…
Smiths
  • 23
  • 2
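WordNet lookups only match lowercase entries, so the usual workaround (a sketch, not the accepted answer) is to lemmatize a lowercased copy of each token and restore its casing afterwards:

```python
import nltk
from nltk.stem import WordNetLemmatizer
# nltk.download('punkt'); nltk.download('wordnet')

lemmatizer = WordNetLemmatizer()

def singularize(token):
    # Lemmatize a lowercased copy, then restore all-caps tokens to upper case.
    lemma = lemmatizer.lemmatize(token.lower(), pos='n')
    return lemma.upper() if token.isupper() else lemma

text = "CARDBOARD BOXES, DIMENSIONS: 19cm H x 10cm W x 30cm D"
print([singularize(t) for t in nltk.word_tokenize(text)])
# ['CARDBOARD', 'BOX', ',', 'DIMENSION', ':', '19cm', 'H', 'x', '10cm', 'W', 'x', '30cm', 'D']
```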