
I am running some data analysis in a Jupyter notebook, where I have a query with variable-length matching like this one:

MATCH p=(s:Skill)-[:BROADER*0..3]->(s)
WHERE s.label='py2neo' or s.label='Python'
RETURN p

I would like to plot its result as a graph, using networkx.

So far I have found two unsatisfactory solutions. Based on a notebook here, I can generate a graph using the cypher magic, whose result is directly understood by the networkx module.

result = %cypher MATCH p=(s:Skill)-[:BROADER*0..3]->(s) WHERE s.label='py2neo' or s.label='Python' RETURN p

nx.draw(result.get_graph())

However, then I am unable to find a way to add the labels to the plot.

That solution bypasses py2neo. With py2neo I can put labels on a graph, as long as I don't use a variable length pattern.

Example:

query='''MATCH p=(s1:Skill)-[:BROADER]->(s2)
WHERE s1.label='py2neo' or s1.label='Python'
RETURN s1.label as child, s2.label as parent'''

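# Graph.data returns a list of dicts, so wrap it in a DataFrame (needs: import pandas as pd)
df = pd.DataFrame(sgraph.data(query))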

And then, copying from a response here on Stack Overflow (which I will link later), I can build the graph manually:

G=nx.DiGraph()   
G.add_nodes_from(list(set(list(df.iloc[:,0]) + list(df.iloc[:,1]))))

#Add edges

tuples = [tuple(x) for x in df.values] 
G.add_edges_from(tuples)
G.number_of_edges()

#Perform Graph Drawing
#A star network  (sort of)
nx.draw_networkx(G)
plt.show()

With this I get a graph with labels, but to get something like the variable-length matching I would have to run multiple queries.

But how can I get the best of both worlds? I would prefer a py2neo solution. Rephrasing: How can I get py2neo to return a graph (not a table) and then be able to pass such information to networkx, being able to determine which, from the multiple possible labels, are the ones to be shown in the graph?

HerrIvan

1 Answer


The question, in the end, was how to get a table containing all the edges of the subgraph that matches a certain query.

The Cypher that does the trick is:

MATCH (source:Skill)-[:BROADER*0..7]->(dest:Skill)
WHERE source.label_en in ['skill1','skill2'] 
WITH COLLECT(DISTINCT source)+COLLECT(dest) AS myNodes
UNWIND myNodes as myNode
MATCH p=(myNode)-[:BROADER]->(neighbor)
WHERE neighbor in myNodes
RETURN myNode.label_en as child ,neighbor.label_en as parent

The first two lines get the nodes belonging to said subgraph. The last five collect them into a list and unwind that list into pairs of nodes connected by a directed edge. The 0 in the first MATCH allows a zero-length path, so each starting node is itself collected into the list even when it has no outgoing BROADER relationship.
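To make the two stages of that query easier to follow, here is a minimal sketch of the same idea in plain networkx, run against a small in-memory graph instead of Neo4j. The function name `bounded_subgraph_edges` and the toy skill names are made up for illustration and are not part of py2neo or of the original data.

```python
import networkx as nx


def bounded_subgraph_edges(G, sources, max_hops):
    """Collect nodes within max_hops of any source, then the edges among them."""
    nodes = set()
    for source in sources:
        if source not in G:
            continue
        # A distance of zero reaches the source itself, mirroring the 0 in *0..N
        reachable = nx.single_source_shortest_path_length(G, source, cutoff=max_hops)
        nodes.update(reachable)
    # Keep only the edges whose two endpoints both belong to the collected set
    edges = sorted(G.subgraph(nodes).edges())
    return nodes, edges


# Hypothetical toy hierarchy, with edges pointing from child to parent
toy = nx.DiGraph([("py2neo", "Python"), ("Python", "Programming"), ("Java", "Programming")])
toy.add_node("Cobol")  # a start node with no outgoing edges

nodes, edges = bounded_subgraph_edges(toy, ["py2neo", "Cobol"], max_hops=1)
```

In this sketch the isolated start node ends up in `nodes` but contributes no row to `edges`, which is also what the second MATCH of the Cypher query would do, since it requires an actual BROADER relationship.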


As of 2019, with the current py2neo package, one way to make this work is:

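import matplotlib.pyplot as plt
import networkx as nx
import pandas as pd

# `graph` is assumed to be an existing py2neo Graph connection
query = '''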
MATCH (source:Skill)-[:BROADER*0..7]->(dest:Skill)
WHERE source.label_en in ['skill1','skill2'] 
WITH COLLECT(DISTINCT source)+COLLECT(dest) AS myNodes
UNWIND myNodes as myNode
MATCH p=(myNode)-[:BROADER]->(neighbor)
WHERE neighbor in myNodes
RETURN myNode.label_en as child ,neighbor.label_en as parent
'''

df = pd.DataFrame(graph.run(query).data())

G = nx.DiGraph()
# Collect every label that appears as either child or parent
G.add_nodes_from(set(df['child']) | set(df['parent']))

#Add edges

tuples = [tuple(x) for x in df.values] 
G.add_edges_from(tuples)
G.number_of_edges()

#Perform Graph Drawing
#A star network  (sort of)
nx.draw_networkx(G)
plt.show()
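As a possible shortcut for the DataFrame-to-graph step, networkx can build the graph straight from the edge table with `from_pandas_edgelist`. The sketch below uses hypothetical rows in place of `graph.run(query).data()`, so it runs without a database; only the column names `child` and `parent` come from the query above.

```python
import networkx as nx
import pandas as pd

# Hypothetical rows standing in for graph.run(query).data()
rows = [
    {"child": "py2neo", "parent": "Python"},
    {"child": "Python", "parent": "Programming"},
]
# Naming the columns explicitly keeps them present even if the query returns no rows
df = pd.DataFrame(rows, columns=["child", "parent"])

# Each row becomes one directed edge; the nodes are created implicitly
G = nx.from_pandas_edgelist(df, source="child", target="parent", create_using=nx.DiGraph())
```

Because the node names are the `label_en` values themselves, `nx.draw_networkx(G)` would show them as labels without any extra mapping.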
HerrIvan
  • I liked your question as I am in the same place as you were - but your answer does not really help me to draw my subnetwork (the query result) with networkx. I appreciate if you elaborate, e.g. add the code related to bridging the result with networkx – Areza Nov 27 '19 at 11:57
  • @user702846 I cannot check the solution right now, but I think that the code above should work. – HerrIvan Nov 27 '19 at 14:17