22

I am trying to extract from a big graph the sub-graph of all connected nodes containing a specific node.

Is there a solution in the NetworkX library?

[EDIT]
My graph is a DiGraph

[EDIT]
Rephrased simply:
I want the part of my graph that contains my specific node N_i and all the nodes that are connected to it directly or indirectly (passing through other nodes) by any incoming or outgoing edges.
Example:

>>> g = nx.DiGraph()
>>> g.add_path(['A','B','C',])
>>> g.add_path(['X','Y','Z',])
>>> g.edges()
[('A', 'B'), ('B', 'C'), ('Y', 'Z'), ('X', 'Y')]

My desired result would be:

>>> g2 = getSubGraph(g, 'B')
>>> g2.nodes()
['A', 'B', 'C']
>>> g2.edges()
[('A', 'B'), ('B', 'C')]
Joel
Alban Soupper
  • 1
    It's not clear from your question what subgraph you want. If you want a subgraph that contains node N_i with no isolated nodes then e.g. the neighbors of N_i satisfy that. If you want the largest subgraph containing N_i but with no isolated nodes then removing all isolated nodes from the graph would work (as long as N_i isn't degree 0). That graph won't necessarily be connected. If you want all of the nodes reachable from N_i consider nx.shortest_path(G,N_i)... – Aric Dec 18 '12 at 00:57
  • Not sure if you're checking this, but please check the edit I did of your title. What you had was not actually the question you ended up asking. – Joel Nov 19 '15 at 05:21

4 Answers

27

You can use shortest_path() to find all of the nodes reachable from a given node. In your case you need to first convert the graph to an undirected representation so both in- and out-edges are followed.

In [1]: import networkx as nx

In [2]: g = nx.DiGraph()

In [3]: g.add_path(['A','B','C',])

In [4]: g.add_path(['X','Y','Z',])

In [5]: u = g.to_undirected()

In [6]: nodes = nx.shortest_path(u,'B').keys()

In [7]: nodes
Out[7]: ['A', 'C', 'B']

In [8]: s = g.subgraph(nodes)

In [9]: s.edges()
Out[9]: [('A', 'B'), ('B', 'C')]

Or in one line

In [10]: s = g.subgraph(nx.shortest_path(g.to_undirected(),'B'))

In [11]: s.edges()
Out[11]: [('A', 'B'), ('B', 'C')]
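
For readers on a recent NetworkX release, the same idea can be wrapped in a helper. This is a minimal sketch rather than part of the session above: `get_subgraph` is a hypothetical name, and the sketch assumes the module-level `nx.add_path` and `nx.node_connected_component` functions, since recent releases no longer provide the `g.add_path` method.

```python
import networkx as nx


def get_subgraph(g, node):
    """Return the subgraph of g induced by every node linked to `node`,
    following edges in either direction (hypothetical helper)."""
    # The undirected copy makes the search ignore edge direction.
    undirected = g.to_undirected()
    nodes = nx.node_connected_component(undirected, node)
    # .copy() returns an independent DiGraph rather than a read-only view.
    return g.subgraph(nodes).copy()


g = nx.DiGraph()
nx.add_path(g, ['A', 'B', 'C'])
nx.add_path(g, ['X', 'Y', 'Z'])

g2 = get_subgraph(g, 'B')
print(sorted(g2.nodes()))  # ['A', 'B', 'C']
print(sorted(g2.edges()))  # [('A', 'B'), ('B', 'C')]
```

Sorting the nodes and edges only makes the printed order deterministic; the subgraph itself keeps the original edge directions.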
Aric
14

Simply loop through the subgraphs until the target node is contained within the subgraph.

For directed graphs, I assume a subgraph is a graph such that every node is accessible from every other node. This is a strongly connected subgraph and the networkx function for that is strongly_connected_component_subgraphs.

(MWE) Minimal working example:

import networkx as nx
import pylab as plt

G = nx.erdos_renyi_graph(30,.05)
target_node = 13

pos=nx.graphviz_layout(G,prog="neato")

for h in nx.connected_component_subgraphs(G):
    if target_node in h:
        nx.draw(h,pos,node_color='red')
    else:
        nx.draw(h,pos,node_color='white')

plt.show()

[Image: the graph drawn with the connected component containing the target node in red and all other components in white]

For a directed subgraph (digraph) example change the corresponding lines to:

G = nx.erdos_renyi_graph(30,.05, directed=True)
...
for h in nx.strongly_connected_component_subgraphs(G):

[Image: the directed graph drawn the same way, using strongly connected components]

Note that one of the nodes is in the connected component but not in the strongly connected component!
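
The gap between the two notions of connectivity can be checked on a small directed path. This is a minimal sketch assuming a recent NetworkX release, where `nx.strongly_connected_components` and `nx.weakly_connected_components` yield sets of nodes; `component_containing` is a hypothetical helper name, not a NetworkX function.

```python
import networkx as nx


def component_containing(components, node):
    """Return the first node set in `components` that contains `node`."""
    for nodes in components:
        if node in nodes:
            return nodes
    raise KeyError(node)


G = nx.DiGraph()
nx.add_path(G, ['A', 'B', 'C'])  # A -> B -> C, no cycle

strong = component_containing(nx.strongly_connected_components(G), 'B')
weak = component_containing(nx.weakly_connected_components(G), 'B')

print(sorted(strong))  # ['B']
print(sorted(weak))    # ['A', 'B', 'C']

# Induced subgraph of the weakly connected component, as an independent copy
H = G.subgraph(weak).copy()
print(sorted(H.edges()))  # [('A', 'B'), ('B', 'C')]
```

On a directed path no node can reach its predecessors, so every strongly connected component is a single node, while the weakly connected component ignores edge direction and covers the whole path.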

Hooked
  • @AlbanSoupper As noted `strongly_connected_component_subgraphs` **does** work with directed graphs (digraphs). – Hooked Dec 17 '12 at 15:47
  • I am a novice with graphs, but from my point of view, in a directed graph the subgraph is not strongly connected... Think about a directed path_graph... – Alban Soupper Dec 17 '12 at 15:53
  • @AlbanSoupper It is not clear what you are intending when you say subgraph then... without a link to a mathematical definition it will be hard to help you. Also details like a directed or undirected graph are important when you first ask the question! – Hooked Dec 17 '12 at 15:56
  • Sorry, I will rephrase my question, give me 5 min – Alban Soupper Dec 17 '12 at 15:59
  • I accept your answer because it would be the perfect answer if I was working with a non-directed graph :) Thanks – Alban Soupper Dec 17 '12 at 21:06
3

I found three solutions to your requirement, which is the same as mine. My DiGraphs have between 6,000 and 12,000 nodes, and the largest subgraph has up to 3,700 nodes. The three functions I used are:

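import networkx as nx

def create_subgraph_dfs(G, node):
    """ bidirection, O(1)"""
    # dfs_successors maps each visited node to its children in the DFS tree
    edges = nx.dfs_successors(G, node)
    nodes = {node}  # keep the start node even when it has no successors
    for k, v in edges.items():
        nodes.add(k)
        nodes.update(v)
    return G.subgraph(nodes)

def create_subgraph_shortpath(G, node):
    """ unidirection, O(1)"""
    # the keys are all nodes reachable from `node`, including `node` itself
    nodes = nx.single_source_shortest_path(G, node).keys()
    return G.subgraph(nodes)

def create_subgraph_recursive(G, sub_G, start_node):
    """ bidirection, O(nlogn)"""
    for n in G.successors(start_node):
        if sub_G.has_edge(start_node, n):
            continue  # edge already copied; prevents endless recursion on cycles
        sub_G.add_edge(start_node, n)
        create_subgraph_recursive(G, sub_G, n)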

The test results are summarized as follows:

timeit ms
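
As a hedged alternative that was not among the timed functions, edges can be followed in both directions without recursion. The sketch below is an iterative breadth-first search over `G.successors` and `G.predecessors`; `create_subgraph_bfs` is a hypothetical name and no timing is claimed for it.

```python
from collections import deque

import networkx as nx


def create_subgraph_bfs(G, node):
    """Subgraph induced by all nodes linked to `node` by edges in either direction."""
    seen = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        # Visit neighbours along outgoing and incoming edges alike
        for neighbour in list(G.successors(current)) + list(G.predecessors(current)):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return G.subgraph(seen)


G = nx.DiGraph([('A', 'B'), ('C', 'B'), ('C', 'A'), ('X', 'Y')])
H = create_subgraph_bfs(G, 'A')
print(sorted(H.nodes()))  # ['A', 'B', 'C']
```

The `seen` set guarantees each node is queued once, so the loop terminates on graphs with cycles and does not depend on Python's recursion limit.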

Jesse
  • `create_subgraph_shortpath(G, node)` worked for me to find the connected component of a directed graph. However, it feels like I might be overlooking an obvious function from the API to get such a result directly for a DiGraph. For `Graphs` using `connected_components()` is so easy in comparison. – timmwagener Aug 20 '17 at 19:18
  • @timmwagener yes, but that will yield a generator for all of them within G, we want just the connected component containing a given node. – MathCrackExchange Sep 20 '22 at 19:14
2

Use the example at the end of the documentation page for connected_component_subgraphs.

Just make sure to take the last element of the list rather than the first.

>>> G=nx.path_graph(4)
>>> G.add_edge(5,6)
>>> H=nx.connected_component_subgraphs(G)[-1]
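
On recent NetworkX releases, where `connected_component_subgraphs` is no longer available, roughly the same selection can be sketched with `nx.connected_components`, which yields sets of nodes. The explicit sort by size and the variable names are assumptions made for illustration; the last lines show how to pick the component containing a given node instead of relying on its position in the list.

```python
import networkx as nx

G = nx.path_graph(4)   # nodes 0-3 form one component
G.add_edge(5, 6)       # a second, smaller component

# Node sets of all components, largest first
components = sorted(nx.connected_components(G), key=len, reverse=True)

smallest = G.subgraph(components[-1]).copy()
print(sorted(smallest.nodes()))  # [5, 6]

# The component containing a specific node, regardless of its size
target = 2
containing = G.subgraph(nx.node_connected_component(G, target)).copy()
print(sorted(containing.nodes()))  # [0, 1, 2, 3]
```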
Abhijit
  • 1
    It seems to me that this will extract the _largest_ subgraph not the subgraph (as OP notes) "containing a specific node". – Hooked Dec 17 '12 at 15:45
  • @Hooked: I got misled by the subject `extract the smallest connected subgraph`, which in this case it would be, as the returned list is sorted in descending order of size. If the OP instead wants the subgraph containing the specific node, your solution makes sense – Abhijit Dec 17 '12 at 15:56
  • 1
    It appears the most relevant page in the current documentation is [this](https://networkx.github.io/documentation/stable/reference/algorithms/generated/networkx.algorithms.components.connected_components.html#networkx.algorithms.components.connected_components) (`connected_component_subgraphs` itself being gone.) – jtniehof Oct 28 '19 at 13:45
  • `connected_component_subgraphs` has been [deprecated](https://github.com/networkx/networkx/pull/2819). [To create the induced subgraph of each component](https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.components.connected_components.html#connected-components) use: `S = [G.subgraph(c).copy() for c in nx.connected_components(G)]` – moi Dec 06 '22 at 09:25