How to match subgraphs with edge attributes in networkx correctly?

Question

I'm new to subgraph matching and I found that networkx has a pretty convenient way to do that. However I'm not sure how to match subgraphs correctly on graphs with edge attributes.

For example, say I have the following graph that has edge attributes:

import networkx as nx
import networkx.algorithms.isomorphism as iso

G = nx.MultiDiGraph()
G.add_edge(1, 2, label="A")
G.add_edge(1, 2, label="D")
G.add_edge(2, 1, label="B")
G.add_edge(2, 3, label="C")

And I'm trying to find a subgraph with the following structure:

SG = nx.MultiDiGraph()
SG.add_edge(5, 6)
SG.add_edge(6, 7)

Note that this subgraph does not have edge attribute so I was trying to find the subgraphs with any edges that fits the node structure (in this case a simple path of length 2). I found these 2 SO questions (one, two) which directed me to try the following 2 approaches:

Approach 1

gm = iso.MultiDiGraphMatcher(G, SG, edge_match=iso.categorical_edge_match("label", None))
print(gm.subgraph_is_monomorphic())

for subgraph_x in gm.subgraph_monomorphisms_iter():
    print(subgraph_x)

This prints:

True
{1: 5, 2: 6, 3: 7}

However, I assumed that this will print two different subgraphs since there are 2 different edges between the nodes 1 and 2. Meaning, the two subgraphs should be ((1,2) with edge label "A", (2,3) with edge label "C"), and ((1,2) with edge label "D", (2,3) with edge label "C"). I also couldn't find an approach to output which edge labels correspond to the output nodes which would have helped understand which path this ouputs.

Approach 2

SG = nx.MultiDiGraph()
SG.add_edge(5, 6, label="A")
SG.add_edge(6, 7, label="C")

gm = iso.MultiDiGraphMatcher(G, SG, edge_match=lambda x, y: x[0]['label'] == y[0]['label'])
print(gm.subgraph_is_monomorphic())
for subgraph_x in gm.subgraph_monomorphisms_iter():
    print(subgraph_x)

This prints

True
{1: 5, 2: 6, 3: 7}

Here I specified which edges I want in the subgraph and the output makes sense, since there is only 1 subgraph with that structure. For example, if I change SG.add_edge(6, 7, label="C") to SG.add_edge(6, 7, label="D"), this prints False.

However, for my use case I need the first approach to work, since my input is a subgraph without edge attributes.

I also tried using a graph without edge attributes and changing between subgraph_isomorphisms_iter and subgraph_monomorphisms_iter:

G = nx.MultiDiGraph([(1,2), (1,3), (1,4), (1,4), (4,5), (5,6), (1,7), (7,8)])
subgraph = nx.MultiDiGraph([(10,20), (10,30), (10,40), (40,50)])

subgraphs = []
GM = iso.MultiDiGraphMatcher(G,subgraph,edge_match=iso.categorical_edge_match([],[]))
for subgraph_x in GM.subgraph_monomorphisms_iter():
    subgraphs.append(subgraph_x)
subgraphs

using subgraph_monomorphisms_iter this prints:

[{1: 10, 2: 20, 3: 30, 4: 40, 5: 50},
 {1: 10, 2: 20, 3: 30, 7: 40, 8: 50},
 {1: 10, 2: 20, 4: 30, 7: 40, 8: 50},
 {1: 10, 2: 20, 7: 30, 4: 40, 5: 50},
 {1: 10, 3: 20, 2: 30, 4: 40, 5: 50},
 {1: 10, 3: 20, 2: 30, 7: 40, 8: 50},
 {1: 10, 3: 20, 4: 30, 7: 40, 8: 50},
 {1: 10, 3: 20, 7: 30, 4: 40, 5: 50},
 {1: 10, 4: 20, 2: 30, 7: 40, 8: 50},
 {1: 10, 4: 20, 3: 30, 7: 40, 8: 50},
 {1: 10, 7: 20, 2: 30, 4: 40, 5: 50},
 {1: 10, 7: 20, 3: 30, 4: 40, 5: 50}]

Which seems to be very redundant as there should only be 3 subgraphs, but even using subgraph_isomorphisms_iter it seems to find 2 which are redundant:

subgraphs = []
GM = nx.algorithms.isomorphism.DiGraphMatcher(G,subgraph) #dictionary that maps nodes of main graph to nodes of subgraph.
for subgraph_x in GM.subgraph_isomorphisms_iter():
    subgraphs.append(subgraph_x)

This prints

[{1: 10, 2: 20, 3: 30, 7: 40, 8: 50}, {1: 10, 3: 20, 2: 30, 7: 40, 8: 50}]

However, when I tried switching the first 2 approaches to subgraph_isomorphisms_iter it didn't find any subgraphs. So i'm very confused.

Update I found that this SO question explains the difference between isomorphism and monomorphism in networkx well. And found the source code for the matcher if that helps. It also seems from here that the redundancy is a result of symmetries and if I understand correctly networkx doesn't take symmetries into account.

score 1 · Accepted Answer · answered Aug 11 '22 at 11:21

My approach:

Expand MultiDiGraph to DiGraph, perform isomorphism check on expanded DiGraph, this way you account for multiple edges as they are counted seperately:

import networkx as nx
import networkx.algorithms.isomorphism as iso

G = nx.MultiDiGraph()
G.add_edge(1, 2, label="A")
G.add_edge(1, 2, label="D")
G.add_edge(2, 1, label="B")
G.add_edge(2, 3, label="C")

SG = nx.MultiDiGraph()
SG.add_edge(5, 6)
SG.add_edge(6, 7)

def expand_graph(G):
    '''
    Expand the graph such that for each edge e from u to v:
    e=(u,v) you create an edge specific vertex: e -> (u,e_)(e_,u)
    This way your MultiDiGraph becomes a DiGraph
    '''
    H = nx.DiGraph()
    for u in G.nodes():
        key = 0
        for e in G.out_edges(u, data=True):
            H.add_edge(u, str(u)+'_out_'+str(key))
            H.add_edge(str(u)+'_out_'+str(key), e[1])
            key+=1
            
    nx.set_node_attributes(H, {n:0 for n in H.nodes()}, 'original')
    nx.set_node_attributes(H, {n:1 for n in G.nodes()}, 'original')
    return H

def collapse_graph(G_):
    '''
    Turns a DiGraph generated with the expand_graph function back 
    to a MultiDiGraph, adds edge attribtute label to distinguish edges
    '''
    H = nx.MultiDiGraph()
    for u, d  in G_.nodes(data=True):
        if d['original']==1:
            for u, e_ in G_.out_edges(u):
                for e_, v in G_.out_edges(e_):
                    H.add_edge(u, v, label = e_.split('_')[-1])
    return H

G_ = expand_graph(G)
SG_ = expand_graph(SG)

subgraphs = []
GM = nx.algorithms.isomorphism.DiGraphMatcher(G_,SG_) 

#Perform the check solely for isomorph subgraphs, 
#note this includes also isomorph subgraphs on the set of vertices e_
for subgraph_x in GM.subgraph_isomorphisms_iter():
    subgraphs.append(subgraph_x)

result=[]
subgraphs_valid = []

#This gets rid of subgraphs that are isomorph but not on the right set of nodes,
#i.e. we want to check whether the nodes are of mixed type (e_ mixed with u)
    
for s in subgraphs:
    if all([True if type(u)== type(v) else False for u,v in s.items()]):
        subgraphs_valid.append(s)
        
#Iterate over the isomorph subgraphs, extract the subgraphs and parse them as results
#Collapse them in order to get a edge representations with labels for the edges

for s in subgraphs_valid:  
    f = [k for k, v in s.items()]
    F_ = G_.subgraph(f)
    F = collapse_graph(F_)
    result.append(list(F.edges(data=True)))
    
print(result)

I am sure there is an easier solution... But maybe the approach helps you.

Thanks a lot! I'm still going over it and testing a few things, but one question I have is about the labels. The output of this seems to be: `[[(1, 2, {'label': '0'}), (2, 3, {'label': '1'})], [(1, 2, {'label': '1'}), (2, 3, {'label': '1'})]]`, how can I get the original edge labels? That is, instead of `(1, 2, {'label': '0'})` output (1, 2, {'label': 'A'}) — Penguin, Aug 11 '22 at 15:54
I also find that this have the same issue with the duplicates. For example, if my inputs are: `G = nx.MultiDiGraph([(1,2), (1,3), (1,4), (4,5), (5,6), (1,7), (7,8), (8,9)])` and `SG = nx.MultiDiGraph([(10,20), (10,30), (10,40), (10,70), (70,80), (80,90)])`. The results have 12 subgraphs, which should only be 2 I believe. For example: `result[0]` and `result[2]` print the same thing: `[(1, 2, {'label': '0'}), (1, 3, {'label': '1'}), (1, 4, {'label': '2'}), (1, 7, {'label': '3'}), (4, 5, {'label': '0'}), (5, 6, {'label': '0'})]` — Penguin, Aug 11 '22 at 16:05
Though I find that a simple solution to the duplicates is just to create a function that takes the results, and create a new list which append each item from the result if it's not there already. Though I imagine that the duplicates can take a lot of space with larger graph so I'm not sure if there's a better solution to filter the duplicates before — Penguin, Aug 11 '22 at 16:20

How to match subgraphs with edge attributes in networkx correctly?

1 Answers1