
Is it possible to apply an association rules algorithm to text?

Suppose I have a database with multiple user comments. I want to know which words are frequently typed together (e.g., when the word "pizza" appears in a comment, the word "Domino's" usually appears too).

Are association rules a good fit for this case, or are there faster alternatives? The implementation language doesn't matter, though I would prefer Python or R; RapidMiner would also work.

Example:

  comments
1 I ate Domino's pizza and it's the best
2 Yesterday I ate Domino's pizza
3 I like pizza, but not Domino's

Result should be:

pizza
  Domino's (1.0) => strongest association because the word "Domino's" appeared every time the word "pizza" appeared
  ate (0.67) => the word "ate" appeared in 2 of the 3 comments that contain the word "pizza"
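
To make the desired output concrete, here is a minimal Python sketch of the calculation I have in mind. It is not a full association rules implementation (no Apriori, no minimum-support pruning); it only computes, for each pair of words, the share of comments containing the first word that also contain the second, which is the confidence of a one-word rule. The function name `word_confidences` and the simple regex tokenizer are my own assumptions for illustration.

```python
import re
from collections import Counter, defaultdict
from itertools import permutations


def word_confidences(comments):
    """Return conf[a][b] = (# comments with both a and b) / (# comments with a).

    Hypothetical helper for illustration; counts each word once per comment.
    """
    word_counts = Counter()            # number of comments containing each word
    pair_counts = defaultdict(Counter)  # number of comments containing both words
    for text in comments:
        # Unique lowercase tokens; apostrophes are kept so "domino's" stays one word.
        words = set(re.findall(r"[a-z']+", text.lower()))
        word_counts.update(words)
        for a, b in permutations(words, 2):
            pair_counts[a][b] += 1
    return {
        a: {b: n / word_counts[a] for b, n in others.items()}
        for a, others in pair_counts.items()
    }


comments = [
    "I ate Domino's pizza and it's the best",
    "Yesterday I ate Domino's pizza",
    "I like pizza, but not Domino's",
]

conf = word_confidences(comments)
# Words most associated with "pizza", strongest first.
for word, score in sorted(conf["pizza"].items(), key=lambda kv: -kv[1]):
    print(f"{word} ({score:.2f})")
```

On the three comments above this gives 1.0 for "domino's" and about 0.67 for "ate", matching the example. It also gives 1.0 for "i", so some stop-word filtering would probably be needed, and the all-pairs loop may be too slow for long comments, which is why I am asking about faster alternatives.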
Jeff
  • Can you give an example bit of data, and what the desired output should look like? – Remko Duursma Oct 09 '17 at 23:47
  • Looks like a duplicate of [this post](https://stackoverflow.com/questions/27153320/build-word-co-occurence-edge-list-in-r). Better yet, search for 'co-occurrence'; [this tutorial](http://tidytextmining.com/ngrams.html) has a more in-depth treatment (under 'counting'). – Remko Duursma Oct 11 '17 at 04:26

0 Answers