
I'm trying to make a graph using hashtags from Instagram. Each node is a hashtag and has edges to each hashtag it was paired with in a post. I want to filter out the nodes (hashtags) with a low number of occurrences.

I'm able to filter them out by manually setting a limit such as 30 or 35, but I want this limit to be calculated from some parameters of the graph.

Peter Mortensen
Jehez

1 Answer


You do not say which parameters of the graph should determine the limit, or how.

Guess: you want to set the limit so that it eliminates some percentage of the nodes, e.g. you want to eliminate the 10% of nodes that have the lowest occurrence counts.

  • LOOP over every hashtag
    • Count occurrences
    • Save in an array of pairs (hashtag, occurrence count)
  • Sort the pair array in ascending order of occurrence count
  • LOOP over the sorted pair array
    • Delete the node with that hashtag
    • IF the required percentage of nodes has been deleted
      • STOP
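
The steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the question does not say how the graph is stored, so the sketch assumes the posts are available as lists of hashtags and keeps the graph as a plain dictionary of neighbour sets. The names `build_graph`, `prune_lowest` and `fraction` are made up for this example.

```python
from collections import Counter
from itertools import combinations


def build_graph(posts):
    """Return (occurrence counts, adjacency sets) for a list of posts.

    Assumption: each post is a list of hashtag strings.
    """
    counts = Counter()
    edges = {}
    for tags in posts:
        unique = set(tags)          # count a hashtag once per post
        counts.update(unique)
        for tag in unique:
            edges.setdefault(tag, set())
        # Connect every pair of hashtags that appear in the same post.
        for a, b in combinations(unique, 2):
            edges[a].add(b)
            edges[b].add(a)
    return counts, edges


def prune_lowest(counts, edges, fraction):
    """Delete the given fraction of nodes, lowest occurrence count first."""
    n_delete = int(len(counts) * fraction)  # truncates toward zero
    # Sort ascending by count; ties are broken alphabetically.
    ranked = sorted(counts, key=lambda tag: (counts[tag], tag))
    doomed = set(ranked[:n_delete])
    # Keep the surviving nodes and drop edges that pointed at deleted ones.
    return {
        tag: neighbours - doomed
        for tag, neighbours in edges.items()
        if tag not in doomed
    }


posts = [["cat", "pet"], ["cat", "dog", "pet"], ["cat", "rare"], ["dog", "pet"]]
counts, edges = build_graph(posts)
pruned = prune_lowest(counts, edges, 0.25)  # remove the lowest 25% of nodes
```

Note that hashtags with equal counts can end up on both sides of the cut. If that matters, one option is to take the count of the last deleted node as the limit and delete everything at or below it.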
ravenspoint