I have a large corpus of text (about 3 GB of plain text).
I want to build a search function.
When the user enters a keyword, I want to display a list of other keywords that are closely related.
I don't want to use a generic synonym dictionary for this. Instead, I want a function that will:
- see which other words keyword 1 usually "goes with" in my corpus
- find which other words (keyword 2, keyword 3, etc.) those same words are also commonly associated with, apart from keyword 1
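
To make the two steps concrete, here is a rough sketch of what I have in mind, using only the Python standard library. All names (`cooccurrence_counts`, `related_keywords`, the `window` size, the `top_*` limits) are my own placeholders, and "goes with" is assumed to mean "appears within a few tokens on the same line", which may not be the best definition.

```python
import re
from collections import Counter, defaultdict


def cooccurrence_counts(lines, window=5):
    """Count how often each pair of words appears within `window` tokens
    of each other on the same line (an assumed definition of "goes with")."""
    counts = defaultdict(Counter)
    for line in lines:
        tokens = re.findall(r"[a-z']+", line.lower())
        for i, word in enumerate(tokens):
            for other in tokens[i + 1 : i + 1 + window]:
                if other != word:
                    counts[word][other] += 1
                    counts[other][word] += 1
    return counts


def related_keywords(counts, keyword, top_neighbours=10, top_related=10):
    """Step 1: take the words the keyword most often goes with.
    Step 2: rank the words that those neighbours go with, skipping the keyword."""
    neighbours = [w for w, _ in counts.get(keyword, Counter()).most_common(top_neighbours)]
    scores = Counter()
    for neighbour in neighbours:
        for other, n in counts[neighbour].items():
            if other != keyword:
                scores[other] += n
    return [w for w, _ in scores.most_common(top_related)]


# Tiny illustrative corpus; the real input would be streamed line by line from disk.
sample = ["coffee cup", "tea cup", "tea mug"]
counts = cooccurrence_counts(sample)
print(related_keywords(counts, "coffee"))
```

With raw counts like these, I'd expect very frequent words ("the", "and") to dominate on real text, so some stop-word filtering or a weighting such as pointwise mutual information would probably be needed. I'm also not sure a plain in-memory dictionary will cope with the vocabulary of a 3 GB corpus.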
Any ideas for approaches, libraries, or examples? I'm also open to suggestions for a better way of doing this.