I have recently started an NLP project in Python in which I need to preprocess a CSV file that contains text in many rows and columns. I have managed to stem a single simple sentence, but I have not been able to stem the whole CSV file at once. How can I do that? When I try to stem a simple CSV file with the code below, I get an error:

import csv
from nltk import PorterStemmer
port = PorterStemmer()

with open('status.csv', 'rb') as f:
    reader = csv.reader(f)
    for row in reader:
        print(port.stem(row))

error was

kiran
  • @leavesof3 is right. You can also use the Python pandas package to manipulate CSV files and then apply stemming and other NLP steps. – Gomes Mar 16 '16 at 02:24

1 Answer


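There are a few steps you'll need to take. The stemmer works on one word at a time, so first tokenize the text and stem each word (part 1), then apply that to the text column of each CSV row (part 2).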

#Part 1
>>> import nltk
>>> from nltk import PorterStemmer
>>> test = 'this sentence is just a tester set of words'
>>> test_tokenize = nltk.word_tokenize(test)
>>> test_tokenize
['this', 'sentence', 'is', 'just', 'a', 'tester', 'set', 'of', 'words']
>>> port = PorterStemmer()
>>> for word in test_tokenize:
...     print(port.stem(word))
... 
thi
sentenc
is
just
a
tester
set
of
word

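#Part 2
import csv

text_column = 0  # assumed index of the column that holds the text; adjust for your file

with open('status.csv', 'rb') as f:  # Python 2; on Python 3 use open('status.csv', newline='')
    reader = csv.reader(f)
    for row in reader:
        # row is a list of strings, so pass one cell (not the whole row) on for stemming
        text = row[text_column]
        # then repeat the steps from part 1 to get the stemmed words
        print([port.stem(word) for word in nltk.word_tokenize(text)])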
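
Putting the two parts together, here is a minimal sketch for Python 3. The names stem_rows, stem_csv_file and text_column are my own, not part of NLTK, and I am assuming the text sits in a single column. To keep the sketch free of the tokenizer data that nltk.word_tokenize needs, it splits words with a simple regular expression instead; swap word_tokenize back in if you have that data installed.

```python
import csv
import re


def stem_rows(rows, text_column, stem):
    """Yield one list of stemmed words per CSV row.

    rows        -- an iterable of rows, e.g. a csv.reader
    text_column -- index of the column that holds the text (assumed)
    stem        -- any callable mapping a word to its stem,
                   e.g. PorterStemmer().stem
    """
    for row in rows:
        if len(row) <= text_column:
            continue  # skip blank or short rows
        # Simple regex tokenizer, used here in place of nltk.word_tokenize
        words = re.findall(r"[A-Za-z']+", row[text_column].lower())
        yield [stem(word) for word in words]


def stem_csv_file(path, text_column=0):
    """Read a CSV file and return the stemmed words of every row."""
    # Imported here so that stem_rows can be used without NLTK installed
    from nltk.stem import PorterStemmer

    port = PorterStemmer()
    with open(path, newline='') as f:
        return list(stem_rows(csv.reader(f), text_column, port.stem))
```

You would then call something like stem_csv_file('status.csv', text_column=1). Note that the header row, if your file has one, is stemmed like any other row, so drop the first result if you don't want it.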
leavesof3