I have a large set of vertices/nodes that together form a set of graphs. Note that the full set may contain many independent (disconnected) graphs. The goal is to find the smallest set of vertices, across all of these graphs, that captures the largest total edge weight. I have the edges and their weights in a pandas dataframe and I am using networkx.
Below is a sample of the dataframe. It has three columns, and Number_Of_Trips is the edge weight. To merge the two metrics into a single objective, I could charge each selected node a cost of 10 trips, i.e. maximize (# of trips) - 10 * (number of nodes).
    Number_Of_Trips  dropoff_gh7  pickup_gh7
0               304      9tbqhsx     9tbqj4g
1               271      9tbqj4f     9tbqhsx
2               263      9tbqt4s     9tbqhsx
3               258      9tbqdye     9tbqdsr
4               256      9tbqhgh     9tbqjfv
5               236      9tbqhsw     9tbqj4g
6               233      9tbqt4g     9tbqv03
7               229      9tbqhsx     9tbqj4c
8               218      9tbqy3f     9tbqt4s
9               213      9tbq5v4     9tbqh41
10              210      9tbqhgh     9tbqhsw
11              192      9tbqhgh     9tbqje4
12              186      9tbqy3f     9tbqt4g
13              184      9tbqhgh     9tbqj4z
14              183      9tbqe3d     9tbqe9e
15              170      9tbq3xn     9tbq39w
16              167      9tbq5bw     9tbqht6
17              163      9tbqhsx     9tbqh0x
18              162      9tbqdk1     9tbq7p2
19              160      9tbqsch     9tbqt4s
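import networkx as nx
# On networkx >= 2.4 the older from_pandas_dataframe / connected_component_subgraphs calls no longer exist
x = nx.from_pandas_edgelist(df, "dropoff_gh7", "pickup_gh7", "Number_Of_Trips")
graphs = [x.subgraph(c).copy() for c in nx.connected_components(x)]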
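To make the combined objective concrete, here is a minimal sketch of one way it could be solved. It rests on an assumption the question does not settle: an edge counts as "captured" only when both of its endpoints are selected. Under that reading the problem becomes a project-selection (maximum-weight closure) problem, which a single minimum cut solves exactly. The function name `select_nodes`, the `node_cost` parameter, and the toy graph are hypothetical and used only for illustration. If an edge should count when just one endpoint is selected, this construction does not apply.

```python
import networkx as nx

def select_nodes(graph, weight="Number_Of_Trips", node_cost=10):
    """Pick nodes maximizing (weight of edges with BOTH ends picked) - node_cost * (#picked).

    Assumption: an edge is captured only if both endpoints are selected.
    """
    flow = nx.DiGraph()
    for u, v, w in graph.edges(data=weight):
        e = ("edge", u, v)
        # Giving up an edge costs its weight.
        flow.add_edge("source", e, capacity=w)
        # No capacity attribute means infinite capacity in networkx flow routines:
        # keeping an edge forces both of its endpoints to be kept.
        flow.add_edge(e, ("node", u))
        flow.add_edge(e, ("node", v))
    for n in graph.nodes:
        # Keeping a node costs node_cost.
        flow.add_edge(("node", n), "sink", capacity=node_cost)
    if "source" not in flow:  # graph has no edges, so nothing is worth selecting
        return set(), 0
    cut_value, (source_side, _) = nx.minimum_cut(flow, "source", "sink")
    picked = {x[1] for x in source_side if isinstance(x, tuple) and x[0] == "node"}
    total = sum(w for _, _, w in graph.edges(data=weight))
    return picked, total - cut_value

# Toy graph with two independent components (made-up weights).
g = nx.Graph()
g.add_edge("a", "b", Number_Of_Trips=304)
g.add_edge("b", "c", Number_Of_Trips=5)
g.add_edge("d", "e", Number_Of_Trips=8)
picked, profit = select_nodes(g)
print(sorted(picked), profit)
```

Because disconnected components never share an edge, running this once on the whole graph should select the same nodes as running it on each component separately, so splitting into subgraphs first is optional here.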