I have a very large undirected graph stored as a .txt file, which I converted to a networkx graph. I need to extract a connected subgraph containing N nodes (and E edges if possible, but that is not required). How do I go about it? This is the code I wrote to find the largest connected subgraph:

import networkx as nx

def find_subgraph(graph):
    # Build one subgraph view per connected component, then keep the largest.
    connected_component_subgraphs = (graph.subgraph(c) for c in nx.connected_components(graph))
    largest_subgraph = max(connected_component_subgraphs, key=len)
    return largest_subgraph

The subgraph I want is somewhere between the smallest and the largest connected subgraphs in size. Any help will be appreciated.

Edit: It can be any N nodes; no specific nodes are required.

wamika

1 Answer

To find a connected subgraph with exactly N nodes:

Select the connected subgraph whose node count is closest to N but not less than N
IF subgraph contains N nodes
    DONE
delete_success = true
WHILE( delete_success )
    delete_success = false
    LOOP n over nodes in subgraph
        Delete n from subgraph
        IF subgraph no longer connected
            Restore n
            CONTINUE
        IF subgraph contains N nodes
            DONE
        delete_success = true
    END LOOP
END WHILE
Report failure
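
The steps above can be sketched in Python with networkx roughly as follows. This is a minimal sketch rather than a tuned implementation: the function name `connected_subgraph_by_pruning` is hypothetical, and it assumes N is at least 1 and that the graph is undirected. Instead of deleting and restoring nodes in the graph itself, it tracks a set of kept nodes and tests connectivity on the induced subgraph, which avoids losing edge data on restore.

```python
import networkx as nx


def connected_subgraph_by_pruning(graph, n):
    """Return a connected induced subgraph with exactly n nodes, or None.

    Hypothetical helper illustrating the delete-and-check loop.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    # Start from the smallest connected component that has at least n nodes.
    candidates = [c for c in nx.connected_components(graph) if len(c) >= n]
    if not candidates:
        return None
    keep = set(min(candidates, key=len))

    deleted = True
    while len(keep) > n and deleted:
        deleted = False
        for node in list(keep):
            if len(keep) == n:
                break
            keep.remove(node)  # tentatively delete the node
            if nx.is_connected(graph.subgraph(keep)):
                deleted = True  # deletion kept the subgraph connected
            else:
                keep.add(node)  # restore the node

    return graph.subgraph(keep).copy() if len(keep) == n else None
```

Each connectivity check traverses the whole remaining subgraph, so on a very large graph this loop can be slow; it is meant to mirror the pseudocode, not to be the fastest option.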

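Note that a single pass can skip a node because removing it at that moment would disconnect the subgraph; the node may become removable after other nodes are deleted, which is why the outer loop repeats. The procedure cannot stall above N nodes, though: every connected graph with at least two nodes has at least two nodes whose removal leaves it connected (for example, the leaves of any spanning tree), so each pass deletes at least one node. It fails only if no connected subgraph has at least N nodes to begin with.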
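
Since any N nodes will do, a cheaper alternative to the pruning loop may be worth considering. The sketch below uses a different technique that is not part of the procedure above: it takes the first N nodes visited by a breadth-first search. Every node a BFS discovers after the start node is adjacent to a node discovered earlier, so the induced subgraph on any prefix of the BFS order is connected. The function name `connected_subgraph_by_bfs` is hypothetical, and the sketch assumes N is at least 1.

```python
from itertools import islice

import networkx as nx


def connected_subgraph_by_bfs(graph, n):
    """Return a connected induced subgraph with exactly n nodes, or None.

    Hypothetical helper: keeps the first n nodes reached by a BFS.
    """
    if n < 1:
        raise ValueError("n must be at least 1")

    for component in nx.connected_components(graph):
        if len(component) < n:
            continue
        start = next(iter(component))
        # bfs_edges yields (parent, child) pairs in discovery order;
        # the first n - 1 children plus the start node give n nodes.
        discovered = (child for _, child in nx.bfs_edges(graph, start))
        nodes = [start, *islice(discovered, n - 1)]
        return graph.subgraph(nodes).copy()

    return None
```

This does a single traversal instead of one connectivity check per candidate deletion. It gives no control over the number of edges E; the result is whatever the induced subgraph on those N nodes contains.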

ravenspoint