
I am trying to depict the relationships between the data entities in my ETL (extract, transform, load) pipeline. The final output is a large directed graph. So far I am using Python to extract the data relationships, and Pydot to generate an SVG file that I can open in a browser. The resulting graph is static.

Pydot lets me set up tooltips and link nodes or edges to other HTML pages, but I am looking for more than that.

A small portion of the graph is shown below

[Image: Directed Graph]

I want to do several things with this graph.

  • Every node can have several attributes (including its name). I cannot display all of them in the graph itself for lack of space. When a user mouses over a node (or performs some other mouse action), I would like those attributes to appear as a "floating" table that the user can dismiss if not interested.
  • Not all node attributes are integers or strings; some are charts. For example, one node may have a bar chart showing how often that data entity was loaded in the last 7 days. I would like that bar chart to float over the graph when the user hovers over (or clicks) the node. Currently I use matplotlib to generate the bar and pie charts associated with nodes (please see above), and I link them to the directed graph using setURL in pydot. The user experience is not great, though, because the link takes the user to a new page.
  • I am happy with the default node layout that I get from Pydot/GraphViz. I would prefer not to build everything from the ground up unless it is absolutely necessary.
  • I would like to be able to highlight only certain parts of the graph, based on a query over node or edge attributes.
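
To make the last bullet concrete, here is a rough sketch of the kind of query-based highlighting I have in mind, written in plain Python that emits DOT text. The node names, the attribute names (`layer`, `loads_last_7_days`) and the helpers `quote` and `to_dot` are all made up for illustration; they are not part of pydot or GraphViz. It only covers static highlighting and plain-text tooltips, not the floating tables or charts.

```python
# Hypothetical sample data: node name -> attribute dict.
NODES = {
    "orders_raw": {"layer": "staging", "loads_last_7_days": 7},
    "orders_clean": {"layer": "warehouse", "loads_last_7_days": 5},
    "sales_report": {"layer": "mart", "loads_last_7_days": 1},
}
EDGES = [("orders_raw", "orders_clean"), ("orders_clean", "sales_report")]


def quote(text):
    """Wrap text in double quotes for DOT, escaping embedded double quotes."""
    return '"' + str(text).replace('"', '\\"') + '"'


def to_dot(nodes, edges, matches=lambda attrs: False):
    """Build DOT source; nodes whose attributes satisfy `matches` are filled."""
    lines = ["digraph etl {"]
    for name, attrs in nodes.items():
        # Summarise all attributes in the tooltip so they appear on hover.
        tooltip = ", ".join(f"{key}={value}" for key, value in sorted(attrs.items()))
        node_attrs = [f"tooltip={quote(tooltip)}"]
        if matches(attrs):
            node_attrs += ["style=filled", "fillcolor=yellow"]
        lines.append(f"  {quote(name)} [{', '.join(node_attrs)}];")
    for source, target in edges:
        lines.append(f"  {quote(source)} -> {quote(target)};")
    lines.append("}")
    return "\n".join(lines)


# Highlight entities that were loaded fewer than 3 times in the last 7 days.
dot_source = to_dot(NODES, EDGES, matches=lambda a: a["loads_last_7_days"] < 3)
print(dot_source)
```

The resulting text can be rendered by GraphViz in the usual way, so the layout stays whatever GraphViz produces by default. The drawback is that every new query means regenerating the whole graph, which is why I am looking for something interactive.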

I searched this forum and came across several options suggested in response to questions similar to mine:

  • Gephi
  • igraph (I played around with igraph, which lets me query by vertex or edge, but I couldn't figure out how to make the final graph interactive based on user input, e.g. a floating table on mouse-over of a node)
  • JavaScript libraries: sigma.js, arbor.js, d3.js
  • NetworkX
  • NodeBox

I have Python skills but am quite a novice on the JavaScript side. I would like to know from experts what my best bet is, in terms of both functionality and ease of use. A browser-based solution is preferred.

Any suggestions or help would be greatly appreciated.

Thanks, Abhijit

1 Answer

Try NetworkX. Nodes can be any hashable object, and node attributes can hold arbitrary Python objects, so that addresses (at least) your first two bullets as far as storing the data goes.

You will still be using matplotlib to generate the charts. I don't know of a better solution than that.
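
One way to keep a chart on the same page instead of linking out to it is to inline the image as a base64 data URI in an HTML fragment. The following is only a minimal sketch under my own assumptions: the function name `floating_chart_html`, the `node-popup` class and the `popup-<node id>` naming are invented for illustration, and the placeholder bytes stand in for a real PNG (for example, what matplotlib writes when `savefig` is given an in-memory buffer with `format="png"`).

```python
import base64
from html import escape


def floating_chart_html(node_id, png_bytes, caption):
    """Return an HTML fragment with the chart inlined as a base64 data URI."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    safe_id = escape(node_id, quote=True)
    safe_caption = escape(caption, quote=True)
    # The fragment starts hidden; the page is expected to reveal it on demand.
    return (
        f'<div class="node-popup" id="popup-{safe_id}" hidden>'
        f'<img src="data:image/png;base64,{encoded}" alt="{safe_caption}">'
        f"<p>{safe_caption}</p>"
        "</div>"
    )


# Placeholder bytes, not a valid image; substitute the real PNG content.
fragment = floating_chart_html("orders_raw", b"\x89PNG placeholder", "Loads in last 7 days")
print(fragment)
```

This only produces the markup. Showing and hiding the fragment on mouse-over still needs a small amount of JavaScript or CSS in the page, which is not covered here.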

Chris Barker
  • Like NetworkX, igraph also supports that, but to me the bigger challenge is how to make the graph interactive (e.g. displaying some of these attributes as the user interacts with the graph inside the browser, or overlaying the chart generated with matplotlib). – Abhijit Bhattacharya Jun 17 '13 at 22:41