1

I want to implement a graph where nodes are items and outward edges are the popularity of the next item. (e.g. I've done this task, what are the most popular tasks performed after it?) My initial algorithm simply incremented a popularity relationship each time it was traversed, but this yields three problems:

First, there are potentially many (100,000+) items of up to 10-15 Unicode characters, and I'd like to keep the total space as small as possible. Second, each (relationship) number runs the risk of overflowing, and dividing the popularity by two each time a value approaches the edge is time consuming and loses a lot of accuracy as far as the differences in the popularity of other items.

(i.e. Assume a-4->b and a-5->c and a-255->d. With one byte, a-255->d will overflow at the next increment, but dividing the relationships by 2 will give: a-2->b and a-2->c and a-127->d)

Third, it makes it difficult for less popular edges to gain popularity.

I've also thought about a queue-like structure where each transition en-queues the next item and de-queues the oldest one when the queue is full. However, this has the problem of being too dynamic if the queue is too small and of eating up a HUGE amount of space, even if the queue is only ten elements.

Any ideas on algorithms/data structures for approaching this problem?

AaronF
  • 2,841
  • 3
  • 22
  • 32
  • why not [PageRank](https://en.wikipedia.org/wiki/PageRank) of [line graph](https://en.wikipedia.org/wiki/Line_graph)? – Patrick the Cat Nov 25 '15 at 17:58
  • Interesting idea. If I understand properly, though, PageRank uses the count of nodes linking into a node to calculate its importance. I need to rank solely on how often a node is reached from one particular node. – AaronF Nov 25 '15 at 20:32
  • PageRank quantifies importance in terms of flow. That is, if you are random walking on the network, it is the probability that the walker is at that node after infinitely many steps. It is the eigenvector (lambda = 1) of the transition matrix, plus some bias and tuning. Now you want to measure and rank that on edges, hence transform the graph into its dual line graph. – Patrick the Cat Nov 25 '15 at 21:44
  • Oh, right. I see what you're saying. – AaronF Nov 26 '15 at 18:39

0 Answers0