I transformed the following edge list:
Source Target Weight
A B 12
A C 14
A D 56
B C 17
B F 14
B G 10
To the following adjacency matrix:
{'A': {'B': {'weight': 12},
'C': {'weight': 14},
'D': {'weight': 56},
...
'B': {'C': {'weight': 17},
'F': {'weight': 14},
'G': {'weight': 10},
...
Here the Source column is the sender and the Target column the receiver of an investment, and the Weight column is the volume of that investment. I want to perform hierarchical clustering on this weighted network to find out which actors can be clustered together based on their mutual investment (the higher the mutual investment, the "closer" the actors are).
I am using SciPy's hierarchical clustering package (scipy.cluster.hierarchy), and my core problem is transforming the edge list above into a distance matrix that the package can read properly. The higher the weight of a tie, the smaller the distance should be (and vice versa), but distance_matrix from scipy.spatial only raises errors when I pass it the above-mentioned dataframe as input.
Is there a way to compute the distance matrix so that it reflects the edge weights as described? The main point is to find ways of transforming the adjacency matrix into a distance matrix that the SciPy package can use.
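For concreteness, here is a minimal sketch of one possible conversion. It is not a definitive method and rests on several assumptions that are not part of the setup above: "mutual investment" is taken as the sum of the weights in both directions, the distance is 1 / (1 + weight) so that pairs without any tie get the largest distance (1.0), and average linkage with a cut into two clusters is chosen arbitrarily. As far as I understand, scipy.spatial.distance_matrix expects arrays of point coordinates rather than an edge list, which would explain the errors, so the sketch builds the matrix by hand and passes it to linkage in condensed form.

```python
import numpy as np
import pandas as pd
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Edge list from the question.
edges = pd.DataFrame(
    [("A", "B", 12), ("A", "C", 14), ("A", "D", 56),
     ("B", "C", 17), ("B", "F", 14), ("B", "G", 10)],
    columns=["Source", "Target", "Weight"],
)

# Fixed node order so that matrix rows/columns can be mapped back to actors.
nodes = sorted(set(edges["Source"]) | set(edges["Target"]))
index = {name: i for i, name in enumerate(nodes)}

# Symmetric weight matrix. Assumption: mutual investment is the sum of
# the investments in both directions between two actors.
W = np.zeros((len(nodes), len(nodes)))
for source, target, weight in edges.itertuples(index=False):
    W[index[source], index[target]] += weight
    W[index[target], index[source]] += weight

# Assumed conversion: higher weight -> smaller distance.
# Pairs with no tie (weight 0) get the maximum distance of 1.0.
D = 1.0 / (1.0 + W)
np.fill_diagonal(D, 0.0)  # an actor has zero distance to itself

# linkage expects the condensed (upper-triangle) form of the distance matrix.
condensed = squareform(D)
Z = linkage(condensed, method="average")

# Cut the dendrogram into (at most) two clusters; the number is arbitrary here.
labels = fcluster(Z, t=2, criterion="maxclust")
clusters = dict(zip(nodes, labels))
```

Other monotone decreasing transformations (for example max weight minus weight) would fit the same pattern; which one is appropriate depends on how the distances between unconnected actors should be treated.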