I have a number of text file. I would like to use NLTK for preprocessing and printing the vocabulary in a plain text .text format, so that I can distribute those file for the people to use. I did following to do it.I started with taking single file:
file1 = open("path/to/text/file","rU")
raw = file1.read()
tokens = nltk.wordpunct_tokenize(raw)
words = [w.lower for w in tokens]
vocab = sorted(set(tokens))
Now i would like to list of items in vocab into a plain text .txt
human
readable file. How would I do it?