
As a beginner at text mining, I would like some advice or guidelines on graph mining, based on a real need of mine: building a graph of related keywords starting from an initial input keyword.

I know the topic is fairly broad, so I want to start with Twitter: I have harvested a corpus of tweets matching the keywords "survey" and "market". I want to mine that corpus to build a graph of keywords related to "survey" or "market".

I have tried NodeXL and NLTK, but I couldn't get them to do what I want.

ЯegDwight
karmiphuc
  • NodeXL can show you the most frequent hashtags, words, word pairs, and URLs for the entire graph and for any clusters you have. What weren't you able to do? – edallme Oct 09 '12 at 16:14

1 Answer


I'm not really sure what your goals are, but here are some suggestions.
You have several options for the type of graph you can build:

  • you can build a bipartite graph with tweets on one side and keywords on the other
  • you could build a network where the vertices are tweets and an edge means the two tweets share a common term
  • or you could build a network where the vertices are keywords and an edge means the two keywords appeared in the same tweet
It all depends on what you are trying to discover.
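For the third option, which seems closest to a graph of keywords related to "survey" or "market", here is a minimal sketch using only the Python standard library. The sample tweets, the tiny stop-word list, and the helper names (`keywords`, `cooccurrence_edges`, `neighbours`) are all assumptions made up for illustration; a real corpus would need better tokenization (for example with NLTK) and a proper stop-word list.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical sample corpus; replace with the harvested tweets.
tweets = [
    "New market survey shows growth in mobile",
    "Take our survey about the mobile market",
    "Market growth slows this quarter",
]

# Tiny illustrative stop-word list (an assumption, not a standard one).
STOPWORDS = {"the", "in", "our", "about", "this", "new", "take", "a", "of"}


def keywords(text):
    """Lowercase the text, keep alphabetic tokens, drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}


def cooccurrence_edges(corpus):
    """Count, for each unordered keyword pair, the tweets containing both."""
    edges = Counter()
    for text in corpus:
        # Sorting makes each pair a canonical (a, b) key with a < b.
        for a, b in combinations(sorted(keywords(text)), 2):
            edges[(a, b)] += 1
    return edges


def neighbours(edges, seed):
    """Keywords linked to `seed`, weighted by co-occurrence count."""
    linked = Counter()
    for (a, b), weight in edges.items():
        if a == seed:
            linked[b] += weight
        elif b == seed:
            linked[a] += weight
    return linked


edges = cooccurrence_edges(tweets)
# Strongest neighbours of the seed keyword come first.
print(neighbours(edges, "survey").most_common())
```

The edge weights give a simple ranking of related keywords. The same edge list could be loaded into a graph library or exported for NodeXL if you want clustering or visualization.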

Take a look at http://www.kdnuggets.com/websites/twitter-analytics-data-mining.html for some suggestions.

There are also a number of excellent papers on graph-based mining of Twitter published by the IEEE and/or the ACM.

BradRees