I'm new to Python and I would like to use the package graph-tool to estimate the optimal number of communities in my network using the stochastic block model (nested and non-nested) approach.
I read the documentation for the Graph class (to create a graph) and for the functions "minimize_blockmodel_dl" and "minimize_nested_blockmodel_dl", which should give me what I need, but I couldn't find anything specific to bipartite networks.
It seems that Graph has no option for creating a bipartite graph, which I find strange...
So instead I built the bipartite graph with the networkx package and then converted it into a Graph object using the following functions, which I found online:
from graph_tool.all import Graph


def get_prop_type(value, key=None):
    # Deal with the value
    if isinstance(value, bool):
        tname = 'bool'
    elif isinstance(value, int):
        tname = 'float'
        value = float(value)
    elif isinstance(value, float):
        tname = 'float'
    elif isinstance(value, str):
        tname = 'string'
    elif isinstance(value, dict):
        tname = 'object'
    else:
        tname = 'string'
        value = str(value)
    return tname, value, key


def nx2gt(nxG):
    # Phase 0: Create a directed or undirected graph-tool Graph
    gtG = Graph(directed=False)
    # Add the Graph properties as "internal properties"
    for key, value in nxG.graph.items():
        # Convert the value and key into a type for graph-tool
        tname, value, key = get_prop_type(value, key)
        prop = gtG.new_graph_property(tname)  # Create the PropertyMap
        gtG.graph_properties[key] = prop  # Set the PropertyMap
        gtG.graph_properties[key] = value  # Set the actual value
    # Phase 1: Add the vertex and edge property maps
    # Go through all nodes and edges and add seen properties
    # Add the node properties first
    nprops = set()  # cache keys to only add properties once
    for node, data in nxG.nodes(data=True):
        # Go through all the properties if not seen and add them.
        for key, val in data.items():
            if key in nprops:
                continue  # Skip properties already added
            # Convert the value and key into a type for graph-tool
            tname, _, key = get_prop_type(val, key)
            prop = gtG.new_vertex_property(tname)  # Create the PropertyMap
            gtG.vertex_properties[key] = prop  # Set the PropertyMap
            # Add the key to the already seen properties
            nprops.add(key)
    # Also add the node id: in NetworkX a node can be any hashable type, but
    # in graph-tool nodes are defined as indices. So we capture any strings
    # in a special PropertyMap called 'id' -- modify as needed!
    gtG.vertex_properties['id'] = gtG.new_vertex_property('string')
    # Add the edge properties second
    eprops = set()  # cache keys to only add properties once
    for src, dst, data in nxG.edges(data=True):
        # Go through all the edge properties if not seen and add them.
        for key, val in data.items():
            if key in eprops:
                continue  # Skip properties already added
            # Convert the value and key into a type for graph-tool
            tname, _, key = get_prop_type(val, key)
            prop = gtG.new_edge_property(tname)  # Create the PropertyMap
            gtG.edge_properties[key] = prop  # Set the PropertyMap
            # Add the key to the already seen properties
            eprops.add(key)
    # Phase 2: Actually add all the nodes and edges with their properties
    # Add the nodes
    vertices = {}  # vertex mapping for tracking edges later
    for node, data in nxG.nodes(data=True):
        # Create the vertex and annotate for our edges later
        v = gtG.add_vertex()
        vertices[node] = v
        # Set the vertex properties, not forgetting the id property
        data['id'] = str(node)
        for key, value in data.items():
            gtG.vp[key][v] = value  # vp is short for vertex_properties
    # Add the edges
    for src, dst, data in nxG.edges(data=True):
        # Look up the vertex structs from our vertices mapping and add edge.
        e = gtG.add_edge(vertices[src], vertices[dst])
        # Add the edge properties
        for key, value in data.items():
            gtG.ep[key][e] = value  # ep is short for edge_properties
    return gtG
#
Calling list_properties() on the converted graph, I see the following:
directed   (graph)   (type: bool, val: 0)
bipartite  (vertex)  (type: string)
id         (vertex)  (type: string)
weight     (edge)    (type: double)
OK: the graph is undirected, every vertex has a bipartite label and an id (the integer sequence used as node labels), and every edge has a weight.
So far, everything seems fine.
Finally, I pass the new Graph object to minimize_blockmodel_dl and call get_blocks() on the result to get the block label of each vertex. The resulting blocks mix the two sets of the bipartite network: vertices from one set end up in the same cluster as vertices from the other set. So the bipartite structure is not preserved, and the model does not seem to apply it as a constraint.
Why?
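To make the symptom concrete, here is a minimal sketch in plain Python of the check I mean. The function name `mixed_blocks` and the dictionaries `side` and `block` are hypothetical: `side` stands in for the `bipartite` vertex property and `block` for the labels returned by get_blocks(), and the values are made up for illustration, not my real output.

```python
from collections import defaultdict


def mixed_blocks(side, block):
    """Return {block label: set of sides} for every block that
    contains vertices from more than one side of the bipartite graph."""
    sides_in_block = defaultdict(set)
    for vertex, label in block.items():
        sides_in_block[label].add(side[vertex])
    return {label: sides for label, sides in sides_in_block.items()
            if len(sides) > 1}


# Hypothetical toy data: vertices 0 and 1 are on side "0",
# vertices 2 and 3 are on side "1".
side = {0: "0", 1: "0", 2: "1", 3: "1"}
# Hypothetical block labels: block 1 contains one vertex from each side.
block = {0: 0, 1: 1, 2: 1, 3: 2}

mixed = mixed_blocks(side, block)
print(mixed)
```

An empty result would mean that every block stays within one side of the network; in my case the result is not empty.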
I hope someone who has used these functions can help me solve this problem. Thanks!