I have been using both k-means and fuzzy c-means for a few days now on a tricky data set. They yield okay-ish results, but I want to visualize and manipulate the graphical output, and I found a fantastic visual tool, Gephi. If you click on the picture on the main page, it will load a video that you can watch.

On Gephi's supported graph formats page, they have a list of possible import formats:

* GEXF
* GDF
* GML
* GraphML
* Pajek NET
* GraphViz DOT
* CSV
* UCINET DL
* Tulip TPL
* Netdraw VNA
* Spreadsheet

Looking at MATLAB, the format I could output my cluster data in is CSV. On Gephi's site they explain the CSV formats: edge list, mixed, and matrix.

I'm not really sure what they mean. Using FCM in MATLAB I get three outputs: centers, U, and objFun.

[centers, U, objFun] = fcm(data, clusters, options);

So my question is: how can I build CSV files from this data in the format that Gephi requires?

https://gephi.org/users/supported-graph-formats/spreadsheet/

http://forum.gephi.org/viewtopic.php?t=1896

I will reward anyone who can help with a 100-point bounty, as this visualization tool is what I want to use from now on, and as of yet there aren't any questions on Stack Overflow that explain how this can be done. So it may be useful in the future for the community of Gephi/MATLAB users.

G Gr
  • Gephi is a tool for visualizing networks (nodes and links); it is not designed specifically to visualize clustered data points. Your problem is not just "how do I write out a CSV file in the right format". You need to ask "how do I want to turn my clustered data points into a network". That is a data analysis question, not a programming question. – DGrady Jul 18 '12 at 23:31
  • [Related](https://stackoverflow.com/a/45620525/4157124). – user4157124 Sep 25 '17 at 00:51

1 Answer

The issue here is that you need to be able to represent your data as a graph. Even if your data is not a graph, it can still be represented as one for visualization. You need to identify what in your data can represent nodes and what can represent edges. Once you do that, writing the data out to a file that can be imported by Gephi (or other graph/network visualization tools) is fairly straightforward. Since you have not posted an example of your data, it is difficult to suggest how this can be done.

Ask yourself the following questions about your data:

  1. What can be represented as a node?
  2. What can be represented as an edge to link the nodes defined in #1?

Each node must have a unique identifier associated with it (this can be a simple numerical value or string).
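As a rough illustration of unique identifiers, here is a minimal MATLAB sketch that writes a node table. It assumes one possible mapping, which is only a guess at what suits your data: every data point becomes a node and every cluster center becomes a node. The sizes, the `p`/`c` ID prefixes, the `Kind` column, and the file name `nodes.csv` are all made up for the example. The `Id` and `Label` headers follow what Gephi's spreadsheet import expects, as far as I know.

```matlab
% Hypothetical sizes; in practice these would come from size(U),
% where U is the membership matrix returned by fcm.
numPoints   = 3;
numClusters = 2;

fid = fopen('nodes.csv', 'w');

% Header row. 'Kind' is an extra attribute column invented for this
% example so the two node types can be told apart after import.
fprintf(fid, 'Id,Label,Kind\n');

% One node per data point, with a unique string ID such as p1, p2, ...
for i = 1:numPoints
    fprintf(fid, 'p%d,Point %d,point\n', i, i);
end

% One node per cluster center, with IDs c1, c2, ...
for k = 1:numClusters
    fprintf(fid, 'c%d,Cluster %d,center\n', k, k);
end

fclose(fid);
```

The prefixes keep point IDs and cluster IDs from colliding, which plain row numbers would not.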

Choosing the nodes and edges is the difficult part: if your cluster data is represented as a graph incorrectly, the visualization can lead to misleading interpretations.

Once you have accomplished this, the easiest file format to get the graph into is an edge list.
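To make the edge list idea concrete, here is a minimal MATLAB sketch under one assumed mapping (not necessarily the right one for your data): each data point is linked to each cluster center, and the edge weight is the fuzzy membership taken from `U`. The small `U` matrix below is invented as a stand-in for the `fcm` output, which has one row per cluster and one column per data point. The `threshold` value, the `p`/`c` ID scheme, and the file name `edges.csv` are also assumptions. The `Source`, `Target`, `Weight`, and `Type` headers are the column names Gephi's spreadsheet import recognizes, as far as I know.

```matlab
% Stand-in for the U returned by fcm: rows are clusters, columns are
% data points. These membership values are made up for illustration.
U = [0.90 0.20 0.05;
     0.10 0.80 0.95];

% Hypothetical cutoff: skip weak memberships so the graph stays readable.
threshold = 0.15;

[numClusters, numPoints] = size(U);

fid = fopen('edges.csv', 'w');
fprintf(fid, 'Source,Target,Weight,Type\n');

for k = 1:numClusters
    for i = 1:numPoints
        if U(k, i) >= threshold
            % Edge from data point i to cluster k, weighted by membership.
            fprintf(fid, 'p%d,c%d,%.4f,Undirected\n', i, k, U(k, i));
        end
    end
end

fclose(fid);
```

Whether a membership-weighted point-to-cluster graph is meaningful depends on your data; linking points to each other by distance would be another option, with a different interpretation.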

BgRva