Ok, let's build an adjacency matrix W for that graph following a simple procedure:
if the adjacent vertices i and j have the same color, then the weight W_{i,j} of the edge between them is some big number (which you will tune in your experiments later); otherwise it is some small number, which you will figure out analogously.
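To make this concrete, here is a minimal sketch in Python (numpy assumed); the edge list, the colors, and the BIG/SMALL weights below are all hypothetical example values, not part of the method itself:

```python
import numpy as np

# Hypothetical example inputs: an edge list and a color per vertex.
n = 6
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
colors = [0, 0, 0, 1, 1, 1]

BIG, SMALL = 10.0, 0.1   # the weights you will tune later

W = np.zeros((n, n))
for i, j in edges:
    w = BIG if colors[i] == colors[j] else SMALL
    W[i, j] = W[j, i] = w  # symmetric: the graph is undirected
```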
Now, let's write the graph Laplacian as
L = D - W, where D is a diagonal matrix with elements d_{i,i} equal to the sum of the i-th row of W.
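Continuing the sketch above:

```python
D = np.diag(W.sum(axis=1))  # d_{i,i} = sum of the i-th row of W
L = D - W
```

For bigger graphs, scipy.sparse.csgraph.laplacian(W) computes the same L = D - W without building D by hand.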
Now, one can easily show that the value of
f^T L f, where f is an arbitrary vector, is small if vertices connected by edges with large weights have close f values; indeed, f^T L f = (1/2) * sum_{i,j} W_{i,j} (f_i - f_j)^2. You may think of this as a way to set up a coordinate system for the graph, with the i-th vertex having coordinate f_i in 1D space.
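A quick numerical check of that identity on the sketch above (the vector f is just an illustrative choice, nearly constant within each color group):

```python
f = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])

quad = f @ L @ f
pairwise = 0.5 * sum(W[i, j] * (f[i] - f[j]) ** 2
                     for i in range(n) for j in range(n))
print(quad, pairwise)  # identical values; the heavy (same-color) edges contribute little since their f values are close
```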
Now, let's choose some number of such vectors f^k, which gives us a representation of the graph as a set of points in a Euclidean space in which, for example, k-means works: the i-th vertex of the initial graph now has coordinates f^1_i, f^2_i, ..., and adjacent vertices of the same color in the initial graph will be close in this new coordinate space.
The question of how to choose the vectors f has a simple answer: just take a couple of eigenvectors of the matrix L which correspond to small but nonzero eigenvalues.
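A minimal sketch of that embedding step, again on the toy graph above (two eigenvectors and two clusters are illustrative choices; scikit-learn's KMeans is assumed to be available):

```python
from sklearn.cluster import KMeans

eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues come out in ascending order
F = eigvecs[:, 1:3]                   # skip the trivial zero eigenvalue; columns are f^1, f^2
labels = KMeans(n_clusters=2, n_init=10).fit_predict(F)
print(labels)  # vertices 0-2 and 3-5 should land in different clusters
```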
This is a well-known method called spectral clustering.
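In practice you rarely code this by hand: scikit-learn's SpectralClustering does essentially the same thing (it uses a normalized variant of the Laplacian under the hood), and it can take W directly as a precomputed affinity matrix:

```python
from sklearn.cluster import SpectralClustering

labels = SpectralClustering(n_clusters=2, affinity='precomputed').fit_predict(W)
```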
Further reading:
Trevor Hastie, Robert Tibshirani, and Jerome Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction,
which is available for free from the authors' page: http://www-stat.stanford.edu/~tibs/ElemStatLearn/