This is my code. I want to import a CSV file and tokenize the text from only one column, which is named 'tweet'. I can't get any output from this code:

import nltk
import pandas as pd
import numpy

from nltk import sent_tokenize
from nltk import word_tokenize
from nltk import pos_tag


data = pd.read_csv('/Users/yoshithKotla/Desktop/dingdang/finaldid.csv')

Texts = list(data['tweet'].values)

tokenData = [nltk.word_tokenize(tweet) for tweet in Texts]
Yoshith Kotla

1 Answer

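word_tokenize depends on the pre-trained Punkt tokenizer for English, which ships in the NLTK data package and has to be downloaded once before use. Recent NLTK releases may ask for 'punkt_tab' instead; the LookupError message names the resource to download.

import nltk
nltk.download('punkt')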
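Note also that the code in the question never prints or returns tokenData, so it shows nothing when run as a script. Below is a minimal sketch of tokenizing a single column and printing the result. The helper name tokenize_column and the in-memory sample DataFrame are hypothetical stand-ins for the real CSV, and the default tokenizer is NLTK's regex-based TweetTokenizer (a swapped-in choice that needs no data download), not the word_tokenize call from the question.

```python
import pandas as pd
from nltk.tokenize import TweetTokenizer


def tokenize_column(df, column="tweet", tokenize=None):
    """Return one list of tokens per non-empty row of df[column].

    tokenize_column is a hypothetical helper, not part of NLTK or pandas.
    """
    if tokenize is None:
        # Assumption: TweetTokenizer is used as the default because it is
        # regex-based and works without downloading any NLTK data.
        tokenize = TweetTokenizer().tokenize
    # Empty CSV cells are read as NaN (a float), which a tokenizer cannot
    # handle, so drop them and make sure every remaining value is a string.
    texts = df[column].dropna().astype(str)
    return [tokenize(text) for text in texts]


# Stand-in for data = pd.read_csv(...); the sample rows are made up.
data = pd.DataFrame({"tweet": ["Hello world!", None, "NLTK is fun :)"]})

token_data = tokenize_column(data)

# Print explicitly: a script shows nothing unless it is told to.
for tokens in token_data:
    print(tokens)
```

To keep the original tokenizer, pass it in once the Punkt data is downloaded: tokenize_column(data, tokenize=nltk.word_tokenize).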

Samkit_Saraf