
I am currently working with a clustering algorithm in Python. My data is a sparse matrix representing a graph with 40,000 nodes and 400,000 edges. For example:

(0, 10)    1
(0, 14)    1

My clustering result is a Python list similar to the following, but much larger:

[ 9  9  9  9  9  9  9  9  9  9  2  2  2  2  2  2  3  3  3  3  3  3  3  2  6  6  0  2  7  4  2  2  2  2  4  4  4  4  4 10  6  6  6  2  7  7  5  5 1  0  0 10 10 10  0  0  0  1  1  1  1  1  1  1  1  1  1  1  6  6  6  6 2  8  8  6  1]

I originally used networkx to draw the graph, but it only works for smaller cases. Here is my code:

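`
import matplotlib.colors as col
import matplotlib.pyplot as plt
import networkx as nx

def plotCluster(W, predict):
    # Named matplotlib colors, used as a palette indexed by cluster ID.
    color = list(col.cnames.keys())
    # G = nx.from_numpy_matrix(W)
    G = nx.from_scipy_sparse_matrix(W)
    print(type(G))
    color_map = []
    # predict maps each key to a sequence of cluster IDs, one per node.
    for key in predict:
        for i in predict[key]:
            color_map.append(color[i+10])
    nx.draw(G, node_color=color_map, with_labels=True)
    plt.show()`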

The result looks like this:

[image]

I want to use Gephi now, but so far I can only write my original data to a GEXF file and open that in Gephi. I do not know how to include my own clustering result so that Gephi draws a graph colored by cluster, like the Python plot.

Nipun Thennakoon
Ti Chen

2 Answers

  1. Store cluster IDs for each node as a node attribute (say, "part"; use nx.set_node_attributes()).
  2. Save the file as GraphML using nx.write_graphml().
  3. Open the file in Gephi.
  4. Apply a partition based on the "part" attribute.
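
Steps 1 and 2 might look roughly like the sketch below. It is only an illustration, not code from the question: the helper name `export_clusters`, the toy `edges` and `labels`, and the output file name are all made up, and it assumes a networkx version where `nx.set_node_attributes` takes `(G, values, name)`. For a SciPy sparse matrix `W`, the edge pairs could be taken from `zip(*W.nonzero())`.

```python
import os
import tempfile

import networkx as nx


def export_clusters(edges, labels, path, attr="part"):
    """Write a GraphML file whose nodes carry their cluster ID in `attr`.

    edges  -- iterable of (u, v) node index pairs
    labels -- sequence where labels[n] is the cluster ID of node n
    """
    G = nx.Graph()
    # Add every node first so isolated nodes are not lost.
    G.add_nodes_from(range(len(labels)))
    G.add_edges_from((int(u), int(v)) for u, v in edges)
    # Cast to plain int so the GraphML writer sees a built-in Python type
    # even if the labels come from a NumPy array.
    values = {n: int(c) for n, c in enumerate(labels)}
    nx.set_node_attributes(G, values, name=attr)
    nx.write_graphml(G, path)
    return G


# Toy data standing in for the real 40,000-node graph (hypothetical values).
edges = [(0, 1), (1, 2), (3, 4)]
labels = [9, 9, 9, 2, 2]
out_path = os.path.join(tempfile.gettempdir(), "clusters.graphml")
G = export_clusters(edges, labels, out_path)
```

After opening the resulting file in Gephi, the "part" column should be available as a node attribute to partition on (steps 3 and 4).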
DYZ

I was desperately looking for a way to color nodes based on node properties that we define ourselves. For example, I was dealing with pricing data and wanted to color each node according to the range its item's price falls in. Thanks, DYZ, for the steps; I used them to make things work. I am posting a more detailed answer so it can help others who want to do the same.

For those who read the data from a separate file and want to color the graph based on it, I used the following approach. My property data file was saved in dictionary format using dumpfn from monty, so I read it back using loadfn:

import networkx as nx
from monty.serialization import dumpfn, loadfn
from collections import defaultdict
property_data = loadfn("property_data_file")
attribute_color_dict = defaultdict(dict)

I assume you already have a networkx graph (say, graph) whose nodes you want to color:

# Iterate through the graph nodes.
for i in graph.nodes:

    # Get the property to color by and assign a color to each value range.
    target_property_to_color = float(property_data[i]['property_you_want_to_color'])

    if target_property_to_color <= 2000:
        attribute_color_dict[i]['color'] = "#0000FF"  # blue

    elif target_property_to_color < 10000:
        attribute_color_dict[i]['color'] = "#008000"  # green

    else:
        attribute_color_dict[i]['color'] = "#FF0000"  # red

attributes = dict(attribute_color_dict)

# Add the attributes to the graph.
nx.set_node_attributes(graph, attributes)

# Save the graph in GraphML format.
nx.write_graphml(graph, "colored_nodes.graphml")

Now you can open Gephi and load the graph. At first you will see the edges colored according to node color, but you can get rid of this by unchecking the node color option in Gephi (as shown below).

[image]

hemanta