I want to list the collocations that NLTK reports for Dracula.txt. How do I do this? I have added the file to my corpus and can already compute word frequencies. I also have a variable, dracWords = mycorpus.words('Dracula.txt'), which holds the words from the Dracula text. From this I can build frequency distributions, but what I want now is to list the collocations.

Any help is appreciated.

2 Answers

You can try this, which counts adjacent word pairs (bigrams):

from collections import Counter

text = 'List the collocations for a txt file'
words = text.split()
nextword = iter(words)
next(nextword)

print(Counter(zip(words, nextword)))

And you will get (the order of the pairs may vary by Python version):

Counter({('txt', 'file'): 1, ('List', 'the'): 1, ('collocations', 'for'): 1, ('for', 'a'): 1, ('the', 'collocations'): 1, ('a', 'txt'): 1})
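Raw pair counts treat every adjacent pair alike, so on a full novel the top of the list tends to be dominated by pairs of very common words. One common way to surface collocations is to score each pair by pointwise mutual information (PMI) and ignore rare pairs. The following is only a minimal sketch of that idea using the standard library; the function name rank_bigrams_by_pmi and the min_count threshold are made up for illustration, and it expects a plain list of tokens.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(words, min_count=2):
    """Return (bigram, pmi) pairs sorted by PMI, highest first.

    Hypothetical helper for illustration: `words` is a list of tokens,
    and bigrams seen fewer than `min_count` times are skipped.
    """
    unigram_counts = Counter(words)
    bigram_counts = Counter(zip(words, words[1:]))
    n_words = len(words)
    n_bigrams = max(n_words - 1, 1)  # avoid dividing by zero on tiny inputs

    scored = []
    for (w1, w2), count in bigram_counts.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_w1 = unigram_counts[w1] / n_words
        p_w2 = unigram_counts[w2] / n_words
        # PMI compares how often the pair occurs with what chance would predict
        scored.append(((w1, w2), math.log2(p_pair / (p_w1 * p_w2))))

    return sorted(scored, key=lambda item: item[1], reverse=True)


sample = "the count dracula met the count dracula at castle dracula".split()
for bigram, score in rank_bigrams_by_pmi(sample):
    print(bigram, round(score, 2))
```

For the real text you would pass something like list(dracWords) instead of the sample sentence, and probably raise min_count, since PMI tends to overrate pairs that occur only a handful of times.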

Hope this helps.

McGrady

Thanks, everyone. I was able to get it with:

nltk.Text(mycorpus.words('Dracula.txt')).collocations()
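Note that collocations() prints its results rather than returning them. If you want the collocations as a Python list, or want to control the scoring and filtering, NLTK's BigramCollocationFinder is one option. This is a rough sketch, not necessarily what collocations() does internally; the sample word list and the frequency threshold of 2 are placeholders chosen for illustration.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Placeholder tokens; for the real text use mycorpus.words('Dracula.txt')
words = "the count dracula met the count dracula at castle dracula".split()

finder = BigramCollocationFinder.from_words(words)
# Drop bigrams seen fewer than 2 times (the threshold is an arbitrary choice)
finder.apply_freq_filter(2)

# Up to 10 best bigrams, ranked by pointwise mutual information
top = finder.nbest(BigramAssocMeasures.pmi, 10)
print(top)
```

Other measures on BigramAssocMeasures, such as likelihood_ratio, can be passed to nbest in the same way.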