TLDR: How do you use node_match attributes to get NetworkX to recognise C+ and C atoms as different?
Here is an example of a pair of molecules for which I have calculated the graph edit distance (GED).
I got a value of 0 for the GED using the following code:
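import networkx as nx
from rdkit import Chem

def get_graph(mol):
    # Atomic number of each atom, in RDKit atom-index order
    atoms = [atom.GetAtomicNum() for atom in mol.GetAtoms()]
    # Adjacency matrix whose off-diagonal entries are bond orders
    am = Chem.GetAdjacencyMatrix(mol, useBO=True)
    # Put each atomic number on the diagonal (it becomes a self-loop weight)
    for i, atom in enumerate(atoms):
        am[i, i] = atom
    # from_numpy_matrix was removed in NetworkX 3.0; from_numpy_array replaces it
    G = nx.from_numpy_array(am)
    return G

# mol1 and mol2 are the RDKit molecules for the two structures being compared
G1 = get_graph(mol1)
G2 = get_graph(mol2)
GED = nx.graph_edit_distance(G1, G2, edge_match=lambda a, b: a['weight'] == b['weight'])
print(GED)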
My understanding is that the edge_match lambda is what distinguishes single bonds from double bonds. Is this correct? I believe so because running the code on propene and propane gives a GED of 1, which I take to be the one edge change (double bond to single bond).

However, I think the code gives a GED of 0 for these two molecules because it treats the C+ and C atoms as the same, and therefore considers the two structures identical. How would I encode the graph so that C+ and C are recognised as different?

I have been reading the NetworkX documentation for the node_match argument, but I don't understand how to use it to do what I want. If node_match isn't the solution, would I have to encode the hydrogen counts somehow?
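For reference, here is a minimal sketch of how node_match could be used, under stated assumptions. The charge is stored as a node attribute and compared by a callable passed as node_match. The two three-atom chains are built by hand so that the example runs with NetworkX alone. The attribute names `atomic_num`, `formal_charge` and `order`, and the helpers `build_chain`, `same_atom` and `same_bond`, are illustrative choices rather than NetworkX or RDKit names. With RDKit molecules the values would presumably come from `atom.GetAtomicNum()` and `atom.GetFormalCharge()`.

```python
import networkx as nx

def build_chain(charges):
    """Hypothetical helper: a chain of carbons with the given formal charges."""
    G = nx.Graph()
    for i, charge in enumerate(charges):
        # Atom identity lives in node attributes, not on the matrix diagonal
        G.add_node(i, atomic_num=6, formal_charge=charge)
    for i in range(len(charges) - 1):
        # Single bonds between neighbouring atoms
        G.add_edge(i, i + 1, order=1.0)
    return G

def same_atom(a, b):
    # a and b are node attribute dicts; atoms match only if both fields agree
    return (a["atomic_num"] == b["atomic_num"]
            and a["formal_charge"] == b["formal_charge"])

def same_bond(a, b):
    # a and b are edge attribute dicts
    return a["order"] == b["order"]

neutral = build_chain([0, 0, 0])  # C-C-C
cation = build_chain([0, 1, 0])   # C-C(+)-C

# Without node_match, node substitutions are free, so the charge is invisible
ged_plain = nx.graph_edit_distance(neutral, cation, edge_match=same_bond)

# With node_match, swapping C for C+ counts as one node substitution
ged_charged = nx.graph_edit_distance(
    neutral, cation, node_match=same_atom, edge_match=same_bond
)
print(ged_plain, ged_charged)
```

In this sketch the first distance should come out as 0 and the second as 1. Hydrogen counts could probably be handled the same way by adding one more node attribute (for example from RDKit's `atom.GetTotalNumHs()`) and comparing it inside `same_atom`.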
(Side note: when I run the same code on the same structures with B in place of C, it gives a GED of 2, which I believe is because the B is set as BH whereas the C is just C+. Picture of the molecules below.)
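A possible reason why changing the element affects the GED even without node_match: the atomic numbers written on the diagonal of the adjacency matrix become self-loop edges, so the edge_match on `weight` ends up comparing them too. The sketch below uses a hand-written 2x2 matrix for two bonded carbons (an assumption standing in for the RDKit output) and needs only NumPy and NetworkX 2.0 or later.

```python
import networkx as nx
import numpy as np

# Two bonded carbons: bond order 1 off the diagonal, atomic number 6 on it
am = np.array([[6, 1],
               [1, 6]])
G = nx.from_numpy_array(am)

# Each nonzero diagonal entry shows up as a self-loop carrying that weight
self_loops = list(nx.selfloop_edges(G, data="weight"))
print(self_loops)

# The nodes themselves carry no attributes, so node_match has nothing to compare
print(list(G.nodes(data=True)))
```

If that reading is right, the element is only visible through the self-loop weight, and the formal charge never reaches the graph at all, which would be consistent with C+ and C comparing as equal.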