Metropolis Hastings Random Walk SQL Implementation

Question

Is it possible, and efficient, to implement the Metropolis-Hastings Random Walk (MHRW) algorithm in SQL?

I want to sample a large directed graph with more than 1 million nodes, and this seems to be one of the best ways to do it. The algorithm is designed for undirected graphs, but I think it can work for directed ones too.

The algorithm:

v <- initial node
while stop criterion not met do
    select node w uniformly at random from the neighbors of v
    generate p uniformly at random, 0 <= p <= 1
    if p <= (degree of v) / (degree of w) then
        v <- w
    else
        stay at v
    end if
end while
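
To make the steps above concrete outside of SQL, here is a minimal Python sketch. It assumes an undirected graph held in memory as a dict of adjacency lists; the names `mhrw_sample`, `adj`, `sample_size` and `max_steps` are hypothetical and only for illustration. The `max_steps` cap is an extra safeguard that is not part of the pseudocode; it keeps the loop from running forever if the reachable part of the graph is smaller than the requested sample.

```python
import random


def mhrw_sample(adj, start, sample_size, max_steps=1_000_000, seed=None):
    """Collect up to `sample_size` distinct nodes with an MHRW over `adj`.

    `adj` maps each node to a list of its neighbors (assumed undirected,
    so every neighbor w of v has degree >= 1).
    """
    rng = random.Random(seed)
    v = start
    sample = {v}
    for _ in range(max_steps):
        if len(sample) >= sample_size:
            break  # stop criterion: sample is large enough
        neighbors = adj[v]
        if not neighbors:
            break  # isolated node, the walk cannot move
        w = rng.choice(neighbors)          # uniform random neighbor of v
        p = rng.random()                   # uniform random number in [0, 1)
        if p <= len(adj[v]) / len(adj[w]):  # accept with min(1, deg(v)/deg(w))
            v = w
            sample.add(v)
        # otherwise stay at v
    return sample


# Tiny example graph: a 4-cycle with one chord (0-2).
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
sample = mhrw_sample(adj, start=0, sample_size=3, seed=42)
```

Here the sample is the set of distinct nodes the walk has accepted, which matches my stop criterion of "sample size", but other definitions (for example counting repeated visits) are possible.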

I take the initial node from table1, which contains all nodes and their properties. table2 has two columns that list all connections between nodes (and gives a way to get a node's degree). The stopping criterion would be the size of the sample, i.e., while sample <= ~100,000 nodes.
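
For illustration, this is roughly how the two tables could be queried for each step of the walk. It is only a sketch using Python's built-in `sqlite3` module with an in-memory database. The table names `table1` and `table2` are mine, but the column names (`node_id`, `src`, `dst`), the index, the helper functions and the choice to store each undirected edge in both directions are all assumptions made for this example. It issues a few queries per step, so it says nothing about whether the approach is efficient on a graph with millions of nodes.

```python
import random
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database shaped like table1/table2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (node_id INTEGER PRIMARY KEY)")
    conn.execute("CREATE TABLE table2 (src INTEGER, dst INTEGER)")
    conn.execute("CREATE INDEX idx_table2_src ON table2 (src)")
    edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
    conn.executemany("INSERT INTO table1 VALUES (?)", [(n,) for n in range(1, 5)])
    # Assumption: each undirected edge is stored in both directions.
    both = edges + [(b, a) for a, b in edges]
    conn.executemany("INSERT INTO table2 VALUES (?, ?)", both)
    return conn


def degree(conn, v):
    """Degree of v = number of rows in table2 that start at v."""
    sql = "SELECT COUNT(*) FROM table2 WHERE src = ?"
    return conn.execute(sql, (v,)).fetchone()[0]


def random_neighbor(conn, v):
    """Pick one neighbor of v uniformly at random, or None if v has none."""
    sql = "SELECT dst FROM table2 WHERE src = ? ORDER BY RANDOM() LIMIT 1"
    row = conn.execute(sql, (v,)).fetchone()
    return row[0] if row else None


def mhrw_sql(conn, start, sample_size, max_steps=10_000, seed=None):
    """Run the walk, asking the database for neighbors and degrees."""
    rng = random.Random(seed)
    v, sample = start, {start}
    for _ in range(max_steps):
        if len(sample) >= sample_size:
            break
        w = random_neighbor(conn, v)
        if w is None:
            break
        if rng.random() <= degree(conn, v) / degree(conn, w):
            v = w
            sample.add(v)
    return sample


conn = build_demo_db()
sample = mhrw_sql(conn, start=1, sample_size=3)
```

The connections between the sampled nodes could then be read back from table2 by keeping only the rows whose `src` and `dst` are both in the sample.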

Best regards.

Please show examples of sample data and what you want the results to look like. SQL is more about data than algorithms. — Gordon Linoff, Apr 19 '14 at 12:51
There isn't a good example for this. I want a sample of nodes and their connections that preserves the properties of the original graph, and the algorithm provides that effect. — npereira, Apr 19 '14 at 14:02
Why do you think it's a good idea to execute that algorithm in SQL? Wouldn't it be really slow? — Niklas B., Apr 19 '14 at 16:40
Not really answering your question, but have you looked at Neo4J? It's a graph database which would support this in a more natural way. — Nicolas78, Apr 19 '14 at 23:07
Properties like a given centrality, e.g., degree centrality distribution. I haven't looked at Neo4J, I'll do it now! — npereira, Apr 20 '14 at 20:23

0 Answers