Unable to tokenise data in Python

Question

This is my code. I want to import a CSV file and tokenize the text from only one column, which is named 'tweet'. I'm unable to get any output from this code.

import nltk
import pandas as pd
import numpy

from nltk import sent_tokenize
from nltk import word_tokenize
from nltk import pos_tag


data = pd.read_csv('/Users/yoshithKotla/Desktop/dingdang/finaldid.csv')

Texts = list(data['tweet'].values)

tokenData = [nltk.word_tokenize(tweet) for tweet in Texts]

score 0 · Answer 1 · answered Apr 08 '21 at 04:18

The NLTK data package includes a pre-trained Punkt tokenizer for English, which word_tokenize depends on. Download it once before tokenizing:

nltk.download('punkt')
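For context, here is a minimal sketch of where that download could sit in the asker's script. The helper names tokenize_column and tokenize_csv are hypothetical, not part of NLTK or pandas, and dropping missing values before tokenizing is an assumption about what the CSV may contain, since the question does not show the data or an error message.

```python
import pandas as pd


def tokenize_column(frame, column, tokenizer):
    """Tokenize every non-missing entry of one DataFrame column."""
    # pandas reads empty cells as NaN (a float), which a string tokenizer
    # cannot handle, so drop them and coerce the remaining values to str.
    texts = frame[column].dropna().astype(str)
    return [tokenizer(text) for text in texts]


def tokenize_csv(csv_path, column="tweet"):
    """Hypothetical wrapper: fetch Punkt, load the CSV, tokenize one column."""
    import nltk  # imported here so tokenize_column is usable without NLTK

    # Fetch the Punkt models; NLTK skips the download when the package
    # is already up to date.
    nltk.download("punkt")
    data = pd.read_csv(csv_path)
    return tokenize_column(data, column, nltk.word_tokenize)
```

Calling tokenize_csv with the path from the question and printing a few items, for example print(tokens[:5]), would make the result visible. The script in the question builds tokenData but never prints it, so when run as a plain script it shows nothing even if tokenization succeeds. Recent NLTK releases may ask for the punkt_tab resource instead of punkt; the LookupError message names the resource to download.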

answered Apr 08 '21 at 04:18

Samkit_Saraf

doesn't answer my question – Yoshith Kotla Apr 08 '21 at 05:35
I tried the code and it worked fine. Can you specify the problem you are facing or what error you are getting? – Samkit_Saraf Apr 08 '21 at 13:27