Converting an adjacency matrix into a distance matrix in Python

Question

I transformed the following edge list:

Source Target Weight
    A   B     12
    A   C     14
    A   D     56
    B   C     17
    B   F     14
    B   G     10

into the following adjacency matrix:

{'A': {'B': {'weight': 12},
  'C': {'weight': 14},
  'D': {'weight': 56},
...

'B': {'C': {'weight': 17},
  'F': {'weight': 14},
  'G': {'weight': 10},
...

where the source column is the sender and the target column the receiver of investment; the weight column is the volume of the investment. I want to perform hierarchical clustering on this weighted network to find out which actors can be clustered together based on their mutual investment (the higher the mutual investment, the "closer" the actors are).

I am using SciPy's hierarchical clustering package (scipy.cluster.hierarchy), and my core problem is transforming the edge list above into a distance matrix that the package can read properly. The higher the weight of a tie, the smaller the distance should be (and vice versa), but distance_matrix from scipy.spatial only raises errors when given the above-mentioned dataframe as input.

Is there a way to compute the distance matrix so that it reflects the edge weights as described? The main point is to find a way of transforming the adjacency matrix into a distance matrix that the SciPy package can use.

how is the edge list represented? is it in a list or a flat file right now? — LeKhan9, Oct 24 '18 at 14:27

score 0 · Answer 1 · answered Oct 24 '18 at 15:16

Assuming your edge list is represented as such:

ls = [ ['Source', 'Target', 'Weight'],
       ['A',   'B',     12],
       ['A',   'C',     14],
       ['A',   'D',     56],
       ['B',   'C',     17],
       ['B',   'F',     14],
       ['B',   'G',     10]
      ]

You can build your graph explicitly like this:

graph = {}
for source, sink, weight in ls[1:]:  # skip the header row
    # Create the nested dicts the first time a source or a (source, sink) pair appears.
    if source not in graph:
        graph[source] = {}

    if sink not in graph[source]:
        graph[source][sink] = {}

    graph[source][sink]['weight'] = weight

Printing graph gives:

{
    "A": {
        "B": {
            "weight": 12
        }, 
        "C": {
            "weight": 14
        }, 
        "D": {
            "weight": 56
        }
    }, 
    "B": {
        "C": {
            "weight": 17
        }, 
        "F": {
            "weight": 14
        }, 
        "G": {
            "weight": 10
        }
    }
}
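
From a nested dict like this, one possible way to get to something scipy.cluster.hierarchy accepts is sketched below. Several things here are assumptions of mine rather than anything stated in the question: investments in both directions are summed to get a symmetric "mutual" weight, the distance is taken as 1 / weight, and pairs with no tie get an arbitrary large distance (the hypothetical no_tie value). Other transforms, such as max_weight - weight, would work just as well, so treat this as a starting point only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Nested-dict graph as built above (directed: source -> target -> weight).
graph = {
    "A": {"B": {"weight": 12}, "C": {"weight": 14}, "D": {"weight": 56}},
    "B": {"C": {"weight": 17}, "F": {"weight": 14}, "G": {"weight": 10}},
}

# Collect every node, including those that only appear as targets.
nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
index = {name: i for i, name in enumerate(nodes)}

# Symmetric weight matrix. Assumption: "mutual investment" is the sum of
# the investments in both directions between two actors.
weights = np.zeros((len(nodes), len(nodes)))
for source, targets in graph.items():
    for target, attrs in targets.items():
        i, j = index[source], index[target]
        weights[i, j] += attrs["weight"]
        weights[j, i] += attrs["weight"]

# Assumed transform: distance = 1 / weight for connected pairs, so a
# higher weight gives a smaller distance.
connected = weights > 0
# Hypothetical choice: unconnected pairs get twice the largest real distance.
no_tie = 2.0 / weights[connected].min()
dist = np.full_like(weights, no_tie)
dist[connected] = 1.0 / weights[connected]
np.fill_diagonal(dist, 0.0)  # an actor is at distance 0 from itself

# linkage expects the condensed (upper-triangle) form of the matrix.
condensed = squareform(dist)
Z = linkage(condensed, method="average")

# Cut the tree into at most 2 flat clusters; labels follow the order of `nodes`.
labels = fcluster(Z, t=2, criterion="maxclust")
```

Note that passing the square matrix straight to linkage would treat its rows as observation vectors rather than as pairwise distances, which is why squareform is applied first. The linkage method ("average") and the number of clusters are also arbitrary choices for the sketch.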
