List the collocations for a txt file

Question

I want to list the collocations that NLTK reports for Dracula.txt. How do I do this? I have already added the file to my corpus and can find word frequencies. I also have a variable, dracWords = mycorpus.words('Dracula.txt'), which holds the words from the Dracula text. From this I can build frequency distributions, but what I want now is to list the collocations.

Any help is appreciated.

Possible duplicate of [How to find collocations in text, python](http://stackoverflow.com/questions/4128583/how-to-find-collocations-in-text-python) — DYZ, Jan 24 '17 at 07:03
nah this one is getting it from a txt added to the corpus already. — Kimberly James, Jan 24 '17 at 07:53

score 1 · Answer 1 · answered Jan 24 '17 at 07:17

You can count adjacent word pairs (bigrams) with the standard library:

from collections import Counter

text = 'List the collocations for a txt file'
words = text.split()

# Pair each word with its successor to form bigrams, then count them.
print(Counter(zip(words, words[1:])))

And you will get:

Counter({('txt', 'file'): 1, ('List', 'the'): 1, ('collocations', 'for'): 1, ('for', 'a'): 1, ('the', 'collocations'): 1, ('a', 'txt'): 1})
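Raw pair counts treat every adjacent pair alike, so frequent function-word pairs tend to dominate on a real text. One common way to surface collocations instead is to score each pair by pointwise mutual information (PMI). The following is a minimal sketch using only the standard library; the function name `rank_bigrams_by_pmi`, the `min_count` threshold, and the sample sentence are illustrative assumptions, not part of the original answer.

```python
import math
from collections import Counter


def rank_bigrams_by_pmi(words, min_count=2):
    """Return (bigram, pmi) pairs sorted from highest to lowest PMI.

    Hypothetical helper: bigrams seen fewer than min_count times are skipped,
    because PMI is unreliable for very rare pairs.
    """
    word_counts = Counter(words)
    bigram_counts = Counter(zip(words, words[1:]))
    n_words = len(words)
    n_bigrams = max(n_words - 1, 1)

    scored = []
    for (first, second), count in bigram_counts.items():
        if count < min_count:
            continue
        p_pair = count / n_bigrams
        p_first = word_counts[first] / n_words
        p_second = word_counts[second] / n_words
        # PMI = log2( P(first, second) / (P(first) * P(second)) )
        scored.append(((first, second), math.log2(p_pair / (p_first * p_second))))

    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy input (an assumption); in practice pass the list of words from the text.
text = "the count said the count knew the castle and the count left"
ranked = rank_bigrams_by_pmi(text.split())
for bigram, pmi in ranked:
    print(bigram, round(pmi, 3))
```

On a full text such as a novel, a higher `min_count` is usually needed to keep one-off pairs out of the ranking.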

Hope this helps.

score 1 · Accepted Answer · answered Jan 24 '17 at 07:52

Thanks, everyone. I was able to get it with:

nltk.Text(mycorpus.words('Dracula.txt')).collocations()
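Note that `Text.collocations()` prints its results rather than returning them. If the pairs are needed as data, one option is NLTK's `BigramCollocationFinder`. The following is a minimal sketch: the toy word list stands in for `mycorpus.words('Dracula.txt')`, and the frequency threshold of 2 and the likelihood-ratio measure are illustrative choices, not settings taken from this thread.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Toy stand-in (an assumption) for mycorpus.words('Dracula.txt').
words = (
    "count dracula met jonathan harker and "
    "count dracula greeted jonathan harker"
).split()

finder = BigramCollocationFinder.from_words(words)

# Drop bigrams that occur fewer than 2 times (threshold chosen for illustration).
finder.apply_freq_filter(2)

# Rank the remaining bigrams by likelihood ratio and keep the top 5.
top_bigrams = finder.nbest(BigramAssocMeasures.likelihood_ratio, 5)
print(top_bigrams)
```

With the real corpus, replacing `words` with `mycorpus.words('Dracula.txt')` should be all that is needed; filtering out punctuation and stopwords first generally gives cleaner results.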

Kimberly James
